From: Sean Christopherson Date: Fri, 2 Jun 2023 01:15:16 +0000 (-0700) Subject: KVM: VMX: Retry APIC-access page reload if invalidation is in-progress X-Git-Tag: microblaze-v6.6~74^2^2~2 X-Git-Url: http://git.monstr.eu/?a=commitdiff_plain;h=878940b33d7678e39a526ffe264ee025977dc67e;p=linux-2.6-microblaze.git KVM: VMX: Retry APIC-access page reload if invalidation is in-progress Re-request an APIC-access page reload if there is a relevant mmu_notifier invalidation in-progress when KVM retrieves the backing pfn, i.e. stall vCPUs until the backing pfn for the APIC-access page is "officially" stable. Relying on the primary MMU to not make changes after invoking ->invalidate_range() works, e.g. any additional changes to a PRESENT PTE would also trigger an ->invalidate_range(), but using ->invalidate_range() to fudge around KVM not honoring past and in-progress invalidations is a bit hacky. Honoring invalidations will allow using KVM's standard mmu_notifier hooks to detect APIC-access page reloads, which will in turn allow removing KVM's implementation of ->invalidate_range() (the APIC-access page case is a true one-off). Opportunistically add a comment to explain why doing nothing if a memslot isn't found is functionally correct. Suggested-by: Jason Gunthorpe Cc: Alistair Popple Cc: Robin Murphy Reviewed-by: Alistair Popple Reviewed-by: Paolo Bonzini Link: https://lore.kernel.org/r/20230602011518.787006-2-seanjc@google.com Signed-off-by: Sean Christopherson --- diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c index 7ecab2118106..9ea4a5dfe62a 100644 --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -6714,7 +6714,12 @@ void vmx_set_virtual_apic_mode(struct kvm_vcpu *vcpu) static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu) { - struct page *page; + const gfn_t gfn = APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT; + struct kvm *kvm = vcpu->kvm; + struct kvm_memslots *slots = kvm_memslots(kvm); + struct kvm_memory_slot *slot; + unsigned long mmu_seq; + kvm_pfn_t pfn; /* Defer reload until vmcs01 is the current VMCS. */ if (is_guest_mode(vcpu)) { @@ -6726,18 +6731,53 @@ static void vmx_set_apic_access_page_addr(struct kvm_vcpu *vcpu) SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) return; - page = gfn_to_page(vcpu->kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT); - if (is_error_page(page)) + /* + * Grab the memslot so that the hva lookup for the mmu_notifier retry + * is guaranteed to use the same memslot as the pfn lookup, i.e. rely + * on the pfn lookup's validation of the memslot to ensure a valid hva + * is used for the retry check. + */ + slot = id_to_memslot(slots, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT); + if (!slot || slot->flags & KVM_MEMSLOT_INVALID) + return; + + /* + * Ensure that the mmu_notifier sequence count is read before KVM + * retrieves the pfn from the primary MMU. Note, the memslot is + * protected by SRCU, not the mmu_notifier. Pairs with the smp_wmb() + * in kvm_mmu_invalidate_end(). + */ + mmu_seq = kvm->mmu_invalidate_seq; + smp_rmb(); + + /* + * No need to retry if the memslot does not exist or is invalid. KVM + * controls the APIC-access page memslot, and only deletes the memslot + * if APICv is permanently inhibited, i.e. the memslot won't reappear. + */ + pfn = gfn_to_pfn_memslot(slot, gfn); + if (is_error_noslot_pfn(pfn)) return; - vmcs_write64(APIC_ACCESS_ADDR, page_to_phys(page)); + read_lock(&vcpu->kvm->mmu_lock); + if (mmu_invalidate_retry_hva(kvm, mmu_seq, + gfn_to_hva_memslot(slot, gfn))) { + kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu); + read_unlock(&vcpu->kvm->mmu_lock); + goto out; + } + + vmcs_write64(APIC_ACCESS_ADDR, pfn_to_hpa(pfn)); + read_unlock(&vcpu->kvm->mmu_lock); + vmx_flush_tlb_current(vcpu); +out: /* * Do not pin apic access page in memory, the MMU notifier * will call us again if it is migrated or swapped out. */ - put_page(page); + kvm_release_pfn_clean(pfn); } static void vmx_hwapic_isr_update(int max_isr)