KVM: x86: work around QEMU issue with synthetic CPUID leaves
authorPaolo Bonzini <pbonzini@redhat.com>
Fri, 29 Apr 2022 18:43:04 +0000 (14:43 -0400)
committerPaolo Bonzini <pbonzini@redhat.com>
Fri, 29 Apr 2022 19:24:58 +0000 (15:24 -0400)
Synthesizing AMD leaves up to 0x80000021 caused problems with QEMU,
which assumes the *host* CPUID[0x80000000].EAX is higher or equal
to what KVM_GET_SUPPORTED_CPUID reports.

This causes QEMU to issue bogus host CPUIDs when preparing the input
to KVM_SET_CPUID2.  It can even get into an infinite loop, which is
only terminated by an abort():

   cpuid_data is full, no space for cpuid(eax:0x8000001d,ecx:0x3e)

To work around this, only synthesize those leaves if 0x8000001d exists
on the host.  The synthetic 0x80000021 leaf is mostly useful on Zen2,
which satisfies the condition.

Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/cpuid.c

index b24ca7f..598334e 100644 (file)
@@ -1085,12 +1085,21 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
        case 0x80000000:
                entry->eax = min(entry->eax, 0x80000021);
                /*
-                * Serializing LFENCE is reported in a multitude of ways,
-                * and NullSegClearsBase is not reported in CPUID on Zen2;
-                * help userspace by providing the CPUID leaf ourselves.
+                * Serializing LFENCE is reported in a multitude of ways, and
+                * NullSegClearsBase is not reported in CPUID on Zen2; help
+                * userspace by providing the CPUID leaf ourselves.
+                *
+                * However, only do it if the host has CPUID leaf 0x8000001d.
+                * QEMU thinks that it can query the host blindly for that
+                * CPUID leaf if KVM reports that it supports 0x8000001d or
+                * above.  The processor merrily returns values from the
+                * highest Intel leaf which QEMU tries to use as the guest's
+                * 0x8000001d.  Even worse, this can result in an infinite
+                * loop if said highest leaf has no subleaves indexed by ECX.
                 */
-               if (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
-                   || !static_cpu_has_bug(X86_BUG_NULL_SEG))
+               if (entry->eax >= 0x8000001d &&
+                   (static_cpu_has(X86_FEATURE_LFENCE_RDTSC)
+                    || !static_cpu_has_bug(X86_BUG_NULL_SEG)))
                        entry->eax = max(entry->eax, 0x80000021);
                break;
        case 0x80000001: