KVM: nVMX: Hold KVM's srcu lock when syncing vmcs12->shadow
authorwanpeng li <wanpengli@tencent.com>
Mon, 17 Feb 2020 10:37:43 +0000 (18:37 +0800)
committerPaolo Bonzini <pbonzini@redhat.com>
Fri, 21 Feb 2020 17:05:03 +0000 (18:05 +0100)
For the duration of mapping eVMCS, it derefences ->memslots without holding
->srcu or ->slots_lock when accessing hv assist page. This patch fixes it by
moving nested_sync_vmcs12_to_shadow to prepare_guest_switch, where the SRCU
is already taken.

It can be reproduced by running kvm's evmcs_test selftest.

  =============================
  warning: suspicious rcu usage
  5.6.0-rc1+ #53 tainted: g        w ioe
  -----------------------------
  ./include/linux/kvm_host.h:623 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

   rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by evmcs_test/8507:
   #0: ffff9ddd156d00d0 (&vcpu->mutex){+.+.}, at:
kvm_vcpu_ioctl+0x85/0x680 [kvm]

  stack backtrace:
  cpu: 6 pid: 8507 comm: evmcs_test tainted: g        w ioe     5.6.0-rc1+ #53
  hardware name: dell inc. optiplex 7040/0jctf8, bios 1.4.9 09/12/2016
  call trace:
   dump_stack+0x68/0x9b
   kvm_read_guest_cached+0x11d/0x150 [kvm]
   kvm_hv_get_assist_page+0x33/0x40 [kvm]
   nested_enlightened_vmentry+0x2c/0x60 [kvm_intel]
   nested_vmx_handle_enlightened_vmptrld.part.52+0x32/0x1c0 [kvm_intel]
   nested_sync_vmcs12_to_shadow+0x439/0x680 [kvm_intel]
   vmx_vcpu_run+0x67a/0xe60 [kvm_intel]
   vcpu_enter_guest+0x35e/0x1bc0 [kvm]
   kvm_arch_vcpu_ioctl_run+0x40b/0x670 [kvm]
   kvm_vcpu_ioctl+0x370/0x680 [kvm]
   ksys_ioctl+0x235/0x850
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x77/0x780
   entry_syscall_64_after_hwframe+0x49/0xbe

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/vmx/vmx.c

index 3be25ec..dafe4df 100644 (file)
@@ -1175,6 +1175,10 @@ void vmx_prepare_switch_to_guest(struct kvm_vcpu *vcpu)
                                           vmx->guest_msrs[i].mask);
 
        }
+
+       if (vmx->nested.need_vmcs12_to_shadow_sync)
+               nested_sync_vmcs12_to_shadow(vcpu);
+
        if (vmx->guest_state_loaded)
                return;
 
@@ -6482,8 +6486,11 @@ static void vmx_vcpu_run(struct kvm_vcpu *vcpu)
                vmcs_write32(PLE_WINDOW, vmx->ple_window);
        }
 
-       if (vmx->nested.need_vmcs12_to_shadow_sync)
-               nested_sync_vmcs12_to_shadow(vcpu);
+       /*
+        * We did this in prepare_switch_to_guest, because it needs to
+        * be within srcu_read_lock.
+        */
+       WARN_ON_ONCE(vmx->nested.need_vmcs12_to_shadow_sync);
 
        if (kvm_register_is_dirty(vcpu, VCPU_REGS_RSP))
                vmcs_writel(GUEST_RSP, vcpu->arch.regs[VCPU_REGS_RSP]);