git.monstr.eu Git - linux-2.6-microblaze.git/log

KVM: X86: Remove tailing newline for tracepoints

It's done by TP_printk() already.

Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Trace vcpu_id for vmexit

Tracing the ID helps to pair vmenters and vmexits for guests with
multiple vCPUs.

Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Merge tag 'kvmarm-5.4' of git://git./linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm updates for 5.4

- New ITS translation cache
- Allow up to 512 CPUs to be supported with GICv3 (for real this time)
- Now call kvm_arch_vcpu_blocking early in the blocking sequence
- Tidy-up device mappings in S2 when DIC is available
- Clean icache invalidation on VMID rollover
- General cleanup

Merge tag 'kvm-ppc-next-5.4-1' of git://git./linux/kernel/git/paulus/powerpc into HEAD

PPC KVM update for 5.4

- Some prep for extending the uses of the rmap array
- Various minor fixes
- Commits from the powerpc topic/ppc-kvm branch, which fix a problem
with interrupts arriving after free_irq, causing host hangs and crashes.

KVM: x86: Manually calculate reserved bits when loading PDPTRS

Manually generate the PDPTR reserved bit mask when explicitly loading
PDPTRs.  The reserved bits that are being tracked by the MMU reflect the
current paging mode, which is unlikely to be PAE paging in the vast
majority of flows that use load_pdptrs(), e.g. CR0 and CR4 emulation,
__set_sregs(), etc...  This can cause KVM to incorrectly signal a bad
PDPTR, or more likely, miss a reserved bit check and subsequently fail
a VM-Enter due to a bad VMCS.GUEST_PDPTR.

Add a one off helper to generate the reserved bits instead of sharing
code across the MMU's calculations and the PDPTR emulation.  The PDPTR
reserved bits are basically set in stone, and pushing a helper into
the MMU's calculation adds unnecessary complexity without improving
readability.

Oppurtunistically fix/update the comment for load_pdptrs().

Note, the buggy commit also introduced a deliberate functional change,
"Also remove bit 5-6 from rsvd_bits_mask per latest SDM.", which was
effectively (and correctly) reverted by commit cd9ae5fe47df ("KVM: x86:
Fix page-tables reserved bits").  A bit of SDM archaeology shows that
the SDM from late 2008 had a bug (likely a copy+paste error) where it
listed bits 6:5 as AVL and A for PDPTEs used for 4k entries but reserved
for 2mb entries.  I.e. the SDM contradicted itself, and bits 6:5 are and
always have been reserved.

Fixes: 20c466b56168d ("KVM: Use rsvd_bits_mask in load_pdptrs()")
Cc: stable@vger.kernel.org
Cc: Nadav Amit <nadav.amit@gmail.com>
Reported-by: Doug Reiland <doug.reiland@intel.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Disable posted interrupts for non-standard IRQs delivery modes

We can easily route hardware interrupts directly into VM context when
they target the "Fixed" or "LowPriority" delivery modes.

However, on modes such as "SMI" or "Init", we need to go via KVM code
to actually put the vCPU into a different mode of operation, so we can
not post the interrupt

Add code in the VMX and SVM PI logic to explicitly refuse to establish
posted mappings for advanced IRQ deliver modes. This reflects the logic
in __apic_accept_irq() which also only ever passes Fixed and LowPriority
interrupts as posted interrupts into the guest.

This fixes a bug I have with code which configures real hardware to
inject virtual SMIs into my guest.

Signed-off-by: Alexander Graf <graf@amazon.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Wanpeng Li <wanpengli@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: arm/arm64: vgic: Allow more than 256 vcpus for KVM_IRQ_LINE

While parts of the VGIC support a large number of vcpus (we
bravely allow up to 512), other parts are more limited.

One of these limits is visible in the KVM_IRQ_LINE ioctl, which
only allows 256 vcpus to be signalled when using the CPU or PPI
types. Unfortunately, we've cornered ourselves badly by allocating
all the bits in the irq field.

Since the irq_type subfield (8 bit wide) is currently only taking
the values 0, 1 and 2 (and we have been careful not to allow anything
else), let's reduce this field to only 4 bits, and allocate the
remaining 4 bits to a vcpu2_index, which acts as a multiplier:

vcpu_id = 256 * vcpu2_index + vcpu_index

With that, and a new capability (KVM_CAP_ARM_IRQ_LINE_LAYOUT_2)
allowing this to be discovered, it becomes possible to inject
PPIs to up to 4096 vcpus. But please just don't.

Whilst we're there, add a clarification about the use of KVM_IRQ_LINE
on arm, which is not completely conditionned by KVM_CAP_IRQCHIP.

Reported-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64: KVM: Device mappings should be execute-never

Since commit 2f6ea23f63cca ("arm64: KVM: Avoid marking pages as XN in
Stage-2 if CTR_EL0.DIC is set"), KVM has stopped marking normal memory
as execute-never at stage2 when the system supports D->I Coherency at
the PoU. This avoids KVM taking a trap when the page is first executed,
in order to clean it to PoU.

The patch that added this change also wrapped PAGE_S2_DEVICE mappings
up in this too. The upshot is, if your CPU caches support DIC ...
you can execute devices.

Revert the PAGE_S2_DEVICE change so PTE_S2_XN is always used
directly.

Fixes: 2f6ea23f63cca ("arm64: KVM: Avoid marking pages as XN in Stage-2 if CTR_EL0.DIC is set")
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: PPC: Book3S HV: Don't lose pending doorbell request on migration on P9

On POWER9, when userspace reads the value of the DPDES register on a
vCPU, it is possible for 0 to be returned although there is a doorbell
interrupt pending for the vCPU. This can lead to a doorbell interrupt
being lost across migration. If the guest kernel uses doorbell
interrupts for IPIs, then it could malfunction because of the lost
interrupt.

This happens because a newly-generated doorbell interrupt is signalled
by setting vcpu->arch.doorbell_request to 1; the DPDES value in
vcpu->arch.vcore->dpdes is not updated, because it can only be updated
when holding the vcpu mutex, in order to avoid races.

To fix this, we OR in vcpu->arch.doorbell_request when reading the
DPDES value.

Cc: stable@vger.kernel.org # v4.13+
Fixes: 579006944e0d ("KVM: PPC: Book3S HV: Virtualize doorbell facility on POWER9")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>

KVM: PPC: Book3S HV: Check for MMU ready on piggybacked virtual cores

When we are running multiple vcores on the same physical core, they
could be from different VMs and so it is possible that one of the
VMs could have its arch.mmu_ready flag cleared (for example by a
concurrent HPT resize) when we go to run it on a physical core.
We currently check the arch.mmu_ready flag for the primary vcore
but not the flags for the other vcores that will be run alongside
it. This adds that check, and also a check when we select the
secondary vcores from the preempted vcores list.

Cc: stable@vger.kernel.org # v4.14+
Fixes: 38c53af85306 ("KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates")
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S: Enable XIVE native capability only if OPAL has required functions

There are some POWER9 machines where the OPAL firmware does not support
the OPAL_XIVE_GET_QUEUE_STATE and OPAL_XIVE_SET_QUEUE_STATE calls.
The impact of this is that a guest using XIVE natively will not be able
to be migrated successfully. On the source side, the get_attr operation
on the KVM native device for the KVM_DEV_XIVE_GRP_EQ_CONFIG attribute
will fail; on the destination side, the set_attr operation for the same
attribute will fail.

This adds tests for the existence of the OPAL get/set queue state
functions, and if they are not supported, the XIVE-native KVM device
is not created and the KVM_CAP_PPC_IRQ_XIVE capability returns false.
Userspace can then either provide a software emulation of XIVE, or
else tell the guest that it does not have a XIVE controller available
to it.

Cc: stable@vger.kernel.org # v5.2+
Fixes: 3fab2d10588e ("KVM: PPC: Book3S HV: XIVE: Activate XIVE exploitation mode")
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: arm/arm64: vgic: Use a single IO device per redistributor

At the moment we use 2 IO devices per GICv3 redistributor: one
one for the RD_base frame and one for the SGI_base frame.

Instead we can use a single IO device per redistributor (the 2
frames are contiguous). This saves slots on the KVM_MMIO_BUS
which is currently limited to NR_IOBUS_DEVS (1000).

This change allows to instantiate up to 512 redistributors and may
speed the guest boot with a large number of VCPUs.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic: Remove spurious semicolons

Detected by Coccinelle (and Will Deacon) using
scripts/coccinelle/misc/semicolon.cocci.

Reported-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: PPC: Book3S HV: Define usage types for rmap array in guest memslot

The rmap array in the guest memslot is an array of size number of guest
pages, allocated at memslot creation time. Each rmap entry in this array
is used to store information about the guest page to which it
corresponds. For example for a hpt guest it is used to store a lock bit,
rc bits, a present bit and the index of a hpt entry in the guest hpt
which maps this page. For a radix guest which is running nested guests
it is used to store a pointer to a linked list of nested rmap entries
which store the nested guest physical address which maps this guest
address and for which there is a pte in the shadow page table.

As there are currently two uses for the rmap array, and the potential
for this to expand to more in the future, define a type field (being the
top 8 bits of the rmap entry) to be used to define the type of the rmap
entry which is currently present and define two values for this field
for the two current uses of the rmap array.

Since the nested case uses the rmap entry to store a pointer, define
this type as having the two high bits set as is expected for a pointer.
Define the hpt entry type as having bit 56 set (bit 7 IBM bit ordering).

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: PPC: Book3S: Mark expected switch fall-through

Fix the error below triggered by `-Wimplicit-fallthrough`, by tagging
it as an expected fall-through.

    arch/powerpc/kvm/book3s_32_mmu.c: In function ‘kvmppc_mmu_book3s_32_xlate_pte’:
    arch/powerpc/kvm/book3s_32_mmu.c:241:21: error: this statement may fall through [-Werror=implicit-fallthrough=]
          pte->may_write = true;
          ~~~~~~~~~~~~~~~^~~~~~
    arch/powerpc/kvm/book3s_32_mmu.c:242:5: note: here
         case 3:
         ^~~~

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-next

This merges in fixes for the XIVE interrupt controller which touch both
generic powerpc and PPC KVM code. To avoid merge conflicts, these
commits will go upstream via the powerpc tree as well as the KVM tree.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>

KVM: VMX: Fix and tweak the comments for VM-Enter

Fix an incorrect/stale comment regarding the vmx_vcpu pointer, as guest
registers are now loaded using a direct pointer to the start of the
register array.

Opportunistically add a comment to document why the vmx_vcpu pointer is
needed, its consumption via 'call vmx_update_host_rsp' is rather subtle.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Assert that struct kvm_vcpu is always as offset zero

KVM implementations that wrap struct kvm_vcpu with a vendor specific
struct, e.g. struct vcpu_vmx, must place the vcpu member at offset 0,
otherwise the usercopy region intended to encompass struct kvm_vcpu_arch
will instead overlap random chunks of the vendor specific struct.
E.g. padding a large number of bytes before struct kvm_vcpu triggers
a usercopy warn when running with CONFIG_HARDENED_USERCOPY=y.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: X86: Add pv tlb shootdown tracepoint

Add pv tlb shootdown tracepoint.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Unconditionally call x86 ops that are always implemented

Remove a few stale checks for non-NULL ops now that the ops in question
are implemented by both VMX and SVM.

Note, this is **not** stable material, the Fixes tags are there purely
to show when a particular op was first supported by both VMX and SVM.

Fixes: 74f169090b6f ("kvm/svm: Setup MCG_CAP on AMD properly")
Fixes: b31c114b82b2 ("KVM: X86: Provide a capability to disable PAUSE intercepts")
Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt")
Cc: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/mmu: Consolidate "is MMIO SPTE" code

Replace the open-coded "is MMIO SPTE" checks in the MMU warnings
related to software-based access/dirty tracking to make the code
slightly more self-documenting.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86/mmu: Add explicit access mask for MMIO SPTEs

When shadow paging is enabled, KVM tracks the allowed access type for
MMIO SPTEs so that it can do a permission check on a MMIO GVA cache hit
without having to walk the guest's page tables.  The tracking is done
by retaining the WRITE and USER bits of the access when inserting the
MMIO SPTE (read access is implicitly allowed), which allows the MMIO
page fault handler to retrieve and cache the WRITE/USER bits from the
SPTE.

Unfortunately for EPT, the mask used to retain the WRITE/USER bits is
hardcoded using the x86 paging versions of the bits.  This funkiness
happens to work because KVM uses a completely different mask/value for
MMIO SPTEs when EPT is enabled, and the EPT mask/value just happens to
overlap exactly with the x86 WRITE/USER bits[*].

Explicitly define the access mask for MMIO SPTEs to accurately reflect
that EPT does not want to incorporate any access bits into the SPTE, and
so that KVM isn't subtly relying on EPT's WX bits always being set in
MMIO SPTEs, e.g. attempting to use other bits for experimentation breaks
horribly.

Note, vcpu_match_mmio_gva() explicits prevents matching GVA==0, and all
TDP flows explicit set mmio_gva to 0, i.e. zeroing vcpu->arch.access for
EPT has no (known) functional impact.

[*] Using WX to generate EPT misconfigurations (equivalent to reserved
    bit page fault) ensures KVM can employ its MMIO page fault tricks
    even platforms without reserved address bits.

Fixes: ce88decffd17 ("KVM: MMU: mmio page fault support")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Rename access permissions cache member in struct kvm_vcpu_arch

Rename "access" to "mmio_access" to match the other MMIO cache members
and to make it more obvious that it's tracking the access permissions
for the MMIO cache.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: svm: eliminate hardcoded RIP advancement from vmrun_interception()

Just like we do with other intercepts, in vmrun_interception() we should be
doing kvm_skip_emulated_instruction() and not just RIP += 3. Also, it is
wrong to increment RIP before nested_svm_vmrun() as it can result in
kvm_inject_gp().

We can't call kvm_skip_emulated_instruction() after nested_svm_vmrun() so
move it inside.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: svm: eliminate weird goto from vmrun_interception()

Regardless of whether or not nested_svm_vmrun_msrpm() fails, we return 1
from vmrun_interception() so there's no point in doing goto. Also,
nested_svm_vmrun_msrpm() call can be made from nested_svm_vmrun() where
other nested launch issues are handled.

nested_svm_vmrun() returns a bool, however, its result is ignored in
vmrun_interception() as we always return '1'. As a preparatory change
to putting kvm_skip_emulated_instruction() inside nested_svm_vmrun()
make nested_svm_vmrun() return an int (always '1' for now).

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: svm: remove hardcoded instruction length from intercepts

Various intercepts hard-code the respective instruction lengths to optimize
skip_emulated_instruction(): when next_rip is pre-set we skip
kvm_emulate_instruction(vcpu, EMULTYPE_SKIP). The optimization is, however,
incorrect: different (redundant) prefixes could be used to enlarge the
instruction. We can't really avoid decoding.

svm->next_rip is not used when CPU supports 'nrips' (X86_FEATURE_NRIPS)
feature: next RIP is provided in VMCB. The feature is not really new
(Opteron G3s had it already) and the change should have zero affect.

Remove manual svm->next_rip setting with hard-coded instruction lengths.
The only case where we now use svm->next_rip is EXIT_IOIO: the instruction
length is provided to us by hardware.

Hardcoded RIP advancement remains in vmrun_interception(), this is going to
be taken care of separately.

Reported-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: add xsetbv to the emulator

To avoid hardcoding xsetbv length to '3' we need to support decoding it in
the emulator.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: clear interrupt shadow on EMULTYPE_SKIP

When doing x86_emulate_instruction(EMULTYPE_SKIP) interrupt shadow has to
be cleared if and only if the skipping is successful.

There are two immediate issues:
- In SVM skip_emulated_instruction() we are not zapping interrupt shadow
  in case kvm_emulate_instruction(EMULTYPE_SKIP) is used to advance RIP
  (!nrpip_save).
- In VMX handle_ept_misconfig() when running as a nested hypervisor we
  (static_cpu_has(X86_FEATURE_HYPERVISOR) case) forget to clear interrupt
  shadow.

Note that we intentionally don't handle the case when the skipped
instruction is supposed to prolong the interrupt shadow ("MOV/POP SS") as
skip-emulation of those instructions should not happen under normal
circumstances.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: kvm: svm: propagate errors from skip_emulated_instruction()

On AMD, kvm_x86_ops->skip_emulated_instruction(vcpu) can, in theory,
fail: in !nrips case we call kvm_emulate_instruction(EMULTYPE_SKIP).
Currently, we only do printk(KERN_DEBUG) when this happens and this
is not ideal. Propagate the error up the stack.

On VMX, skip_emulated_instruction() doesn't fail, we have two call
sites calling it explicitly: handle_exception_nmi() and
handle_task_switch(), we can just ignore the result.

On SVM, we also have two explicit call sites:
svm_queue_exception() and it seems we don't need to do anything there as
we check if RIP was advanced or not. In task_switch_interception(),
however, we are better off not proceeding to kvm_task_switch() in case
skip_emulated_instruction() failed.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

x86: KVM: svm: don't pretend to advance RIP in case wrmsr_interception() results in #GP

svm->next_rip is only used by skip_emulated_instruction() and in case
kvm_set_msr() fails we rightfully don't do that. Move svm->next_rip
advancement to 'else' branch to avoid creating false impression that
it's always advanced (and make it look like rdmsr_interception()).

This is a preparatory change to removing hardcoded RIP advancement
from instruction intercepts, no functional change.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: Fix x86_decode_insn() return when fetching insn bytes fails

Jump to the common error handling in x86_decode_insn() if
__do_insn_fetch_bytes() fails so that its error code is converted to the
appropriate return type. Although the various helpers used by
x86_decode_insn() return X86EMUL_* values, x86_decode_insn() itself
returns EMULATION_FAILED or EMULATION_OK.

This doesn't cause a functional issue as the sole caller,
x86_emulate_instruction(), currently only cares about success vs.
failure, and success is indicated by '0' for both types
(X86EMUL_CONTINUE and EMULATION_OK).

Fixes: 285ca9e948fa ("KVM: emulate: speed up do_insn_fetch")
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: use Intel speculation bugs and features as derived in generic x86 code

Similar to AMD bits, set the Intel bits from the vendor-independent
feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
about the vendor and they should be set on AMD processors as well.

Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: always expose VIRT_SSBD to guests

Even though it is preferrable to use SPEC_CTRL (represented by
X86_FEATURE_AMD_SSBD) instead of VIRT_SPEC, VIRT_SPEC is always
supported anyway because otherwise it would be impossible to
migrate from old to new CPUs. Make this apparent in the
result of KVM_GET_SUPPORTED_CPUID as well.

However, we need to hide the bit on Intel processors, so move
the setting to svm_set_supported_cpuid.

Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: fix reporting of AMD speculation bug CPUID leaf

The AMD_* bits have to be set from the vendor-independent
feature and bug flags, because KVM_GET_SUPPORTED_CPUID does not care
about the vendor and they should be set on Intel processors as well.
On top of this, SSBD, STIBP and AMD_SSB_NO bit were not set, and
VIRT_SSBD does not have to be added manually because it is a
cpufeature that comes directly from the host's CPUID bit.

Reviewed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests/kvm: make platform_info_test pass on AMD

test_msr_platform_info_disabled() generates EXIT_SHUTDOWN but VMCB state
is undefined after that so an attempt to launch this guest again from
test_msr_platform_info_enabled() fails. Reorder the tests to make test
pass.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Revert "KVM: x86/mmu: Zap only the relevant pages when removing a memslot"

This reverts commit 4e103134b862314dc2f2f18f2fb0ab972adc3f5f.
Alex Williamson reported regressions with device assignment with
this patch. Even though the bug is probably elsewhere and still
latent, this is needed to fix the regression.

Fixes: 4e103134b862 ("KVM: x86/mmu: Zap only the relevant pages when removing a memslot", 2019-02-05)
Reported-by: Alex Willamson <alex.williamson@redhat.com>
Cc: stable@vger.kernel.org
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests: kvm: fix state save/load on processors without XSAVE

state_test and smm_test are failing on older processors that do not
have xcr0. This is because on those processor KVM does provide
support for KVM_GET/SET_XSAVE (to avoid having to rely on the older
KVM_GET/SET_FPU) but not for KVM_GET/SET_XCRS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: Call kvm_arch_vcpu_blocking early into the blocking sequence

When a vpcu is about to block by calling kvm_vcpu_block, we call
back into the arch code to allow any form of synchronization that
may be required at this point (SVN stops the AVIC, ARM synchronises
the VMCR and enables GICv4 doorbells). But this synchronization
comes in quite late, as we've potentially waited for halt_poll_ns
to expire.

Instead, let's move kvm_arch_vcpu_blocking() to the beginning of
kvm_vcpu_block(), which on ARM has several benefits:

- VMCR gets synchronised early, meaning that any interrupt delivered
  during the polling window will be evaluated with the correct guest
  PMR
- GICv4 doorbells are enabled, which means that any guest interrupt
  directly injected during that window will be immediately recognised

Tang Nianyao ran some tests on a GICv4 machine to evaluate such
change, and reported up to a 10% improvement for netperf:

<quote>
netperf result:
D06 as server, intel 8180 server as client
with change:
package 512 bytes - 5500 Mbits/s
package 64 bytes - 760 Mbits/s
without change:
package 512 bytes - 5000 Mbits/s
package 64 bytes - 710 Mbits/s
</quote>

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic: Make function comments match function declarations

Since commit 503a62862e8f ("KVM: arm/arm64: vgic: Rely on the GIC driver to
parse the firmware tables"), the vgic_v{2,3}_probe functions stopped using
a DT node. Commit 909777324588 ("KVM: arm/arm64: vgic-new: vgic_init:
implement kvm_vgic_hyp_init") changed the functions again, and now they
require exactly one argument, a struct gic_kvm_info populated by the GIC
driver. Unfortunately the comments regressed and state that a DT node is
used instead. Change the function comments to reflect the current
prototypes.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

arm64/kvm: Remove VMID rollover I-cache maintenance

For VPIPT I-caches, we need I-cache maintenance on VMID rollover to
avoid an ABA problem. Consider a single vCPU VM, with a pinned stage-2,
running with an idmap VA->IPA and idmap IPA->PA. If we don't do
maintenance on rollover:

        // VMID A
        Writes insn X to PA 0xF
        Invalidates PA 0xF (for VMID A)

        I$ contains [{A,F}->X]

        [VMID ROLLOVER]

        // VMID B
        Writes insn Y to PA 0xF
        Invalidates PA 0xF (for VMID B)

        I$ contains [{A,F}->X, {B,F}->Y]

        [VMID ROLLOVER]

        // VMID A
        I$ contains [{A,F}->X, {B,F}->Y]

        Unexpectedly hits stale I$ line {A,F}->X.

However, for PIPT and VIPT I-caches, the VMID doesn't affect lookup or
constrain maintenance. Given the VMID doesn't affect PIPT and VIPT
I-caches, and given VMID rollover is independent of changes to stage-2
mappings, I-cache maintenance cannot be necessary on VMID rollover for
PIPT or VIPT I-caches.

This patch removes the maintenance on rollover for VIPT and PIPT
I-caches. At the same time, the unnecessary colons are removed from the
asm statement to make it more legible.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Christoffer Dall <christoffer.dall@arm.com>
Reviewed-by: James Morse <james.morse@arm.com>
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: kvmarm@lists.cs.columbia.edu
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-irqfd: Implement kvm_arch_set_irq_inatomic

Now that we have a cache of MSI->LPI translations, it is pretty
easy to implement kvm_arch_set_irq_inatomic (this cache can be
parsed without sleeping).

Hopefully, this will improve some LPI-heavy workloads.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Check the LPI translation cache on MSI injection

When performing an MSI injection, let's first check if the translation
is already in the cache. If so, let's inject it quickly without
going through the whole translation process.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Cache successful MSI->LPI translation

On a successful translation, preserve the parameters in the LPI
translation cache. Each translation is reusing the last slot
in the list, naturally evicting the least recently used entry.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on vgic teardown

In order to avoid leaking vgic_irq structures on teardown, we need to
drop all references to LPIs before deallocating the cache itself.

This is done by invalidating the cache on vgic teardown.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on ITS disable

If an ITS gets disabled, we need to make sure that further interrupts
won't hit in the cache. For that, we invalidate the translation cache
when the ITS is disabled.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on disabling LPIs

If a vcpu disables LPIs at its redistributor level, we need to make sure
we won't pend more interrupts. For this, we need to invalidate the LPI
translation cache.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Invalidate MSI-LPI translation cache on specific commands

The LPI translation cache needs to be discarded when an ITS command
may affect the translation of an LPI (DISCARD, MAPC and MAPD with V=0)
or the routing of an LPI to a redistributor with disabled LPIs (MOVI,
MOVALL).

We decide to perform a full invalidation of the cache, irrespective
of the LPI that is affected. Commands are supposed to be rare enough
that it doesn't matter.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic-its: Add MSI-LPI translation cache invalidation

There's a number of cases where we need to invalidate the caching
of translations, so let's add basic support for that.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic: Add __vgic_put_lpi_locked primitive

Our LPI translation cache needs to be able to drop the refcount
on an LPI whilst already holding the lpi_list_lock.

Let's add a new primitive for this.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

KVM: arm/arm64: vgic: Add LPI translation cache definition

Add the basic data structure that expresses an MSI to LPI
translation as well as the allocation/release hooks.

The size of the cache is arbitrarily defined as 16*nr_vcpus.

Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>

powerpc/xive: Implement get_irqchip_state method for XIVE to fix shutdown race

Testing has revealed the existence of a race condition where a XIVE
interrupt being shut down can be in one of the XIVE interrupt queues
(of which there are up to 8 per CPU, one for each priority) at the
point where free_irq() is called.  If this happens, can return an
interrupt number which has been shut down.  This can lead to various
symptoms:

- irq_to_desc(irq) can be NULL.  In this case, no end-of-interrupt
  function gets called, resulting in the CPU's elevated interrupt
  priority (numerically lowered CPPR) never gets reset.  That then
  means that the CPU stops processing interrupts, causing device
  timeouts and other errors in various device drivers.

- The irq descriptor or related data structures can be in the process
  of being freed as the interrupt code is using them.  This typically
  leads to crashes due to bad pointer dereferences.

This race is basically what commit 62e0468650c3 ("genirq: Add optional
hardware synchronization for shutdown", 2019-06-28) is intended to
fix, given a get_irqchip_state() method for the interrupt controller
being used.  It works by polling the interrupt controller when an
interrupt is being freed until the controller says it is not pending.

With XIVE, the PQ bits of the interrupt source indicate the state of
the interrupt source, and in particular the P bit goes from 0 to 1 at
the point where the hardware writes an entry into the interrupt queue
that this interrupt is directed towards.  Normally, the code will then
process the interrupt and do an end-of-interrupt (EOI) operation which
will reset PQ to 00 (assuming another interrupt hasn't been generated
in the meantime).  However, there are situations where the code resets
P even though a queue entry exists (for example, by setting PQ to 01,
which disables the interrupt source), and also situations where the
code leaves P at 1 after removing the queue entry (for example, this
is done for escalation interrupts so they cannot fire again until
they are explicitly re-enabled).

The code already has a 'saved_p' flag for the interrupt source which
indicates that a queue entry exists, although it isn't maintained
consistently.  This patch adds a 'stale_p' flag to indicate that
P has been left at 1 after processing a queue entry, and adds code
to set and clear saved_p and stale_p as necessary to maintain a
consistent indication of whether a queue entry may or may not exist.

With this, we can implement xive_get_irqchip_state() by looking at
stale_p, saved_p and the ESB PQ bits for the interrupt.

There is some additional code to handle escalation interrupts
properly; because they are enabled and disabled in KVM assembly code,
which does not have access to the xive_irq_data struct for the
escalation interrupt.  Hence, stale_p may be incorrect when the
escalation interrupt is freed in kvmppc_xive_{,native_}cleanup_vcpu().
Fortunately, we can fix it up by looking at vcpu->arch.xive_esc_on,
with some careful attention to barriers in order to ensure the correct
result if xive_esc_irq() races with kvmppc_xive_cleanup_vcpu().

Finally, this adds code to make noise on the console (pr_crit and
WARN_ON(1)) if we find an interrupt queue entry for an interrupt
which does not have a descriptor.  While this won't catch the race
reliably, if it does get triggered it will be an indication that
the race is occurring and needs to be debugged.

Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100648.GE9567@blackberry

KVM: PPC: Book3S HV: Don't push XIVE context when not using XIVE device

At present, when running a guest on POWER9 using HV KVM but not using
an in-kernel interrupt controller (XICS or XIVE), for example if QEMU
is run with the kernel_irqchip=off option, the guest entry code goes
ahead and tries to load the guest context into the XIVE hardware, even
though no context has been set up.

To fix this, we check that the "CAM word" is non-zero before pushing
it to the hardware. The CAM word is initialized to a non-zero value
in kvmppc_xive_connect_vcpu() and kvmppc_xive_native_connect_vcpu(),
and is now cleared in kvmppc_xive_{,native_}cleanup_vcpu.

Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100100.GC9567@blackberry

KVM: PPC: Book3S HV: Fix race in re-enabling XIVE escalation interrupts

Escalation interrupts are interrupts sent to the host by the XIVE
hardware when it has an interrupt to deliver to a guest VCPU but that
VCPU is not running anywhere in the system.  Hence we disable the
escalation interrupt for the VCPU being run when we enter the guest
and re-enable it when the guest does an H_CEDE hypercall indicating
it is idle.

It is possible that an escalation interrupt gets generated just as we
are entering the guest.  In that case the escalation interrupt may be
using a queue entry in one of the interrupt queues, and that queue
entry may not have been processed when the guest exits with an H_CEDE.
The existing entry code detects this situation and does not clear the
vcpu->arch.xive_esc_on flag as an indication that there is a pending
queue entry (if the queue entry gets processed, xive_esc_irq() will
clear the flag).  There is a comment in the code saying that if the
flag is still set on H_CEDE, we have to abort the cede rather than
re-enabling the escalation interrupt, lest we end up with two
occurrences of the escalation interrupt in the interrupt queue.

However, the exit code doesn't do that; it aborts the cede in the sense
that vcpu->arch.ceded gets cleared, but it still enables the escalation
interrupt by setting the source's PQ bits to 00.  Instead we need to
set the PQ bits to 10, indicating that an interrupt has been triggered.
We also need to avoid setting vcpu->arch.xive_esc_on in this case
(i.e. vcpu->arch.xive_esc_on seen to be set on H_CEDE) because
xive_esc_irq() will run at some point and clear it, and if we race with
that we may end up with an incorrect result (i.e. xive_esc_on set when
the escalation interrupt has just been handled).

It is extremely unlikely that having two queue entries would cause
observable problems; theoretically it could cause queue overflow, but
the CPU would have to have thousands of interrupts targetted to it for
that to be possible.  However, this fix will also make it possible to
determine accurately whether there is an unhandled escalation
interrupt in the queue, which will be needed by the following patch.

Fixes: 9b9b13a6d153 ("KVM: PPC: Book3S HV: Keep XIVE escalation interrupt masked unless ceded")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190813100349.GD9567@blackberry

KVM: PPC: Book3S HV: XIVE: Free escalation interrupts before disabling the VP

When a vCPU is brought done, the XIVE VP (Virtual Processor) is first
disabled and then the event notification queues are freed. When freeing
the queues, we check for possible escalation interrupts and free them
also.

But when a XIVE VP is disabled, the underlying XIVE ENDs also are
disabled in OPAL. When an END (Event Notification Descriptor) is
disabled, its ESB pages (ESn and ESe) are disabled and loads return all
1s. Which means that any access on the ESB page of the escalation
interrupt will return invalid values.

When an interrupt is freed, the shutdown handler computes a 'saved_p'
field from the value returned by a load in xive_do_source_set_mask().
This value is incorrect for escalation interrupts for the reason
described above.

This has no impact on Linux/KVM today because we don't make use of it
but we will introduce in future changes a xive_get_irqchip_state()
handler. This handler will use the 'saved_p' field to return the state
of an interrupt and 'saved_p' being incorrect, softlockup will occur.

Fix the vCPU cleanup sequence by first freeing the escalation interrupts
if any, then disable the XIVE VP and last free the queues.

Fixes: 90c73795afa2 ("KVM: PPC: Book3S HV: Add a new KVM device for the XIVE native exploitation mode")
Fixes: 5af50993850a ("KVM: PPC: Book3S HV: Native usage of the XIVE interrupt controller")
Cc: stable@vger.kernel.org # v4.12+
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190806172538.5087-1-clg@kaod.org

selftests: kvm: fix vmx_set_nested_state_test

vmx_set_nested_state_test is trying to use the KVM_STATE_NESTED_EVMCS without
enabling enlightened VMCS first. Correct the outcome of the test, and actually
test that it succeeds after the capability is enabled.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests: kvm: provide common function to enable eVMCS

There are two tests already enabling eVMCS and a third is coming.
Add a function that enables the capability and tests the result.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

selftests: kvm: do not try running the VM in vmx_set_nested_state_test

This test is only covering various edge cases of the
KVM_SET_NESTED_STATE ioctl. Running the VM does not really
add anything.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

KVM: x86: svm: remove redundant assignment of var new_entry

new_entry is reassigned a new value next line. So
it's redundant and remove it.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: add KVM x86 reviewers

This is probably overdue---KVM x86 has quite a few contributors that
usually review each other's patches, which is really helpful to me.
Formalize this by listing them as reviewers.  I am including people
with various expertise:

- Joerg for SVM (with designated reviewers, it makes more sense to have
him in the main KVM/x86 stanza)

- Sean for MMU and VMX

- Jim for VMX

- Vitaly for Hyper-V and possibly SVM

- Wanpeng for LAPIC and paravirtualization.

Please ack if you are okay with this arrangement, otherwise speak up.

In other news, Radim is going to leave Red Hat soon.  However, he has
not been very much involved in upstream KVM development for some time,
and in the immediate future he is still going to help maintain kvm/queue
while I am on vacation.  Since not much is going to change, I will let
him decide whether he wants to keep the maintainer role after he leaves.

Acked-by: Joerg Roedel <joro@8bytes.org>
Acked-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wanpeng Li <wanpengli@tencent.com>
Cc: Sean Christopherson <sean.j.christopherson@intel.com>
Cc: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

MAINTAINERS: change list for KVM/s390

KVM/s390 does not have a list of its own, and linux-s390 is in the
loop anyway thanks to the generic arch/s390 match. So use the generic
KVM list for s390 patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

kvm: x86: skip populating logical dest map if apic is not sw enabled

recalculate_apic_map does not santize ldr and it's possible that
multiple bits are set. In that case, a previous valid entry
can potentially be overwritten by an invalid one.

This condition is hit when booting a 32 bit, >8 CPU, RHEL6 guest and then
triggering a crash to boot a kdump kernel. This is the sequence of
events:
1. Linux boots in bigsmp mode and enables PhysFlat, however, it still
writes to the LDR which probably will never be used.
2. However, when booting into kdump, the stale LDR values remain as
they are not cleared by the guest and there isn't a apic reset.
3. kdump boots with 1 cpu, and uses Logical Destination Mode but the
logical map has been overwritten and points to an inactive vcpu.

Signed-off-by: Radim Krcmar <rkrcmar@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Linux 5.3-rc4

Merge tag 'dax-fixes-5.3-rc4' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull dax fixes from Dan Williams:
"A filesystem-dax and device-dax fix for v5.3.

  The filesystem-dax fix is tagged for stable as the implementation has
  been mistakenly throwing away all cow pages on any truncate or hole
  punch operation as part of the solution to coordinate device-dma vs
  truncate to dax pages.

  The device-dax change fixes up a regression this cycle from the
  introduction of a common 'internal per-cpu-ref' implementation.

  Summary:

   - Fix dax_layout_busy_page() to not discard private cow pages of
     fs/dax private mappings.

   - Update the memremap_pages core to properly cleanup on behalf of
     internal reference-count users like device-dax"

* tag 'dax-fixes-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  mm/memremap: Fix reuse of pgmap instances with internal references
  dax: dax_layout_busy_page() should not unmap cow pages

Merge tag 'ntb-5.3-bugfixes' of git://github.com/jonmason/ntb

Pull NTB fix from Jon Mason:
"Bug fix for NTB MSI kernel compile warning"

* tag 'ntb-5.3-bugfixes' of git://github.com/jonmason/ntb:
NTB/msi: remove incorrect MODULE defines

Merge tag 'riscv/for-v5.3-rc4' of git://git./linux/kernel/git/riscv/linux

Pull RISC-V updates from Paul Walmsley:
"A few minor RISC-V updates for v5.3-rc4:

   - Remove __udivdi3() from the 32-bit Linux port, converting the only
     upstream user to use do_div(), per Linux policy

   - Convert the RISC-V standard clocksource away from per-cpu data
     structures, since only one is used by Linux, even on a multi-CPU
     system

   - A set of DT binding updates that remove an obsolete text binding in
     favor of a YAML binding, fix a bogus compatible string in the
     schema (thus fixing a "make dtbs_check" warning), and clarifies the
     future values expected in one of the RISC-V CPU properties"

* tag 'riscv/for-v5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  dt-bindings: riscv: fix the schema compatible string for the HiFive Unleashed board
  dt-bindings: riscv: remove obsolete cpus.txt
  RISC-V: Remove udivdi3
  riscv: delay: use do_div() instead of __udivdi3()
  dt-bindings: Update the riscv,isa string description
  RISC-V: Remove per cpu clocksource

Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
"A few fixes for x86:

   - Don't reset the carefully adjusted build flags for the purgatory
     and remove the unwanted flags instead. The 'reset all' approach led
     to build fails under certain circumstances.

   - Unbreak CLANG build of the purgatory by avoiding the builtin
     memcpy/memset implementations.

   - Address missing prototype warnings by including the proper header

   - Fix yet more fall-through issues"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/lib/cpu: Address missing prototypes warning
  x86/purgatory: Use CFLAGS_REMOVE rather than reset KBUILD_CFLAGS
  x86/purgatory: Do not use __builtin_memcpy and __builtin_memset
  x86: mtrr: cyrix: Mark expected switch fall-through
  x86/ptrace: Mark expected switch fall-through

Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf tooling fixes from Thomas Gleixner:
"Perf tooling fixes all over the place:

   - Fix the selection of the main thread COMM in db-export

   - Fix the disassemmbly display for BPF in annotate

   - Fix cpumap mask setup in perf ftrace when only one CPU is present

   - Add the missing 'cpu_clk_unhalted.core' event

   - Fix CPU 0 bindings in NUMA benchmarks

   - Fix the module size calculations for s390

   - Handle the gap between kernel end and module start on s390
     correctly

   - Build and typo fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf pmu-events: Fix missing "cpu_clk_unhalted.core" event
  perf annotate: Fix s390 gap between kernel end and module start
  perf record: Fix module size on s390
  perf tools: Fix include paths in ui directory
  perf tools: Fix a typo in a variable name in the Documentation Makefile
  perf cpumap: Fix writing to illegal memory in handling cpumap mask
  perf ftrace: Fix failure to set cpumask when only one cpu is present
  perf db-export: Fix thread__exec_comm()
  perf annotate: Fix printing of unaugmented disassembled instructions from BPF
  perf bench numa: Fix cpu0 binding

Merge branch 'sched-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler fixes from Thomas Gleixner:
"Three fixlets for the scheduler:

   - Avoid double bandwidth accounting in the push & pull code

   - Use a sane FIFO priority for the Pressure Stall Information (PSI)
     thread.

   - Avoid permission checks when setting the scheduler params for the
     PSI thread"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/psi: Do not require setsched permission from the trigger creator
  sched/psi: Reduce psimon FIFO priority
  sched/deadline: Fix double accounting of rq/running bw in push & pull

Merge branch 'irq-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq fix from Thomas Gleixner:
"A small fix for the affinity spreading code.

  It failed to handle situations where a single vector was requested
  either due to only one CPU being available or vector exhaustion
  causing only a single interrupt to be granted.

  The fix is to simply remove the requirement in the affinity spreading
  code for more than one interrupt being available"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq/affinity: Create affinity mask for single vector

Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull objtool warning fix from Thomas Gleixner:
"The recent objtool fixes/enhancements unearthed a unbalanced CLAC in
  the i915 driver.

  Chris asked me to pick the fix up and route it through"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  drm/i915: Remove redundant user_access_end() from __copy_from_user() error path

Merge tag 'gfs2-v5.3-rc3.fixes' of git://git./linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:
"Fix incorrect lseek / fiemap results"

* tag 'gfs2-v5.3-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
gfs2: gfs2_walk_metadata fix

Makefile: Convert -Wimplicit-fallthrough=3 to just -Wimplicit-fallthrough for clang

A compilation -Wimplicit-fallthrough warning was enabled by commit
a035d552a93b ("Makefile: Globally enable fall-through warning")

Even though clang 10.0.0 does not currently support this warning without
a patch, clang currently does not support a value for this option.

Link: https://bugs.llvm.org/show_bug.cgi?id=39382
The gcc default for this warning is 3 so removing the =3 has no effect
for gcc and enables the warning for patched versions of clang.

Also remove the =3 from an existing use in a parisc Makefile:
arch/parisc/math-emu/Makefile

Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-and-tested-by: Nathan Chancellor <natechancellor@gmail.com>
Cc: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge tag 'char-misc-5.3-rc4' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes Greg KH:
"Here are some small char/misc driver fixes for 5.3-rc4.

  Two of these are for the habanalabs driver for issues found when
  running on a big-endian system (are they still alive?) The others are
  tiny fixes reported by people, and a MAINTAINERS update about the
  location of the fpga development tree.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  coresight: Fix DEBUG_LOCKS_WARN_ON for uninitialized attribute
  MAINTAINERS: Move linux-fpga tree to new location
  nvmem: Use the same permissions for eeprom as for nvmem
  habanalabs: fix host memory polling in BE architecture
  habanalabs: fix F/W download in BE architecture

Merge tag 'driver-core-5.3-rc4' of git://git./linux/kernel/git/gregkh/driver-core

Pull driver core fixes from Greg KH:
"Here are two small fixes for some driver core issues that have been
  reported. There is also a kernfs "fix" here, which was then reverted
  because it was found to cause problems in linux-next.

  The driver core fixes both resolve reported issues, one with gpioint
  stuff that showed up in 5.3-rc1, and the other finally (and hopefully)
  resolves a very long standing race when removing glue directories.
  It's nice to get that issue finally resolved and the developers
  involved should be applauded for the persistence it took to get this
  patch finally accepted.

  All of these have been in linux-next for a while with no reported
  issues. Well, the one reported issue, hence the revert :)"

* tag 'driver-core-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  Revert "kernfs: fix memleak in kernel_ops_readdir()"
  kernfs: fix memleak in kernel_ops_readdir()
  driver core: Fix use-after-free and double free on glue directory
  driver core: platform: return -ENXIO for missing GpioInt

Merge tag 'tty-5.3-rc4' of git://git./linux/kernel/git/gregkh/tty

Pull tty fix from Greg KH:
"Here is a single tty kgdb fix for 5.3-rc4.

  It fixes an annoying log message that has caused kdb to become
  useless. It's another fallout from commit ddde3c18b700 ("vt: More
  locking checks") which tries to enforce locking checks more strictly
  in the tty layer, unfortunatly when kdb is stopped, there's no need
  for locks :)

  This patch has been linux-next for a while with no reported issues"

* tag 'tty-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  kgdboc: disable the console lock when in kgdb

Merge tag 'staging-5.3-rc4' of git://git./linux/kernel/git/gregkh/staging

Pull staging / IIO driver fixes from Greg KH:
"Here are some small staging and IIO driver fixes for 5.3-rc4.

  Nothing major, just resolutions for a number of small reported issues,
  full details in the shortlog.

  All have been in linux-next for a while with no reported issues"

* tag 'staging-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  iio: adc: gyroadc: fix uninitialized return code
  docs: generic-counter.rst: fix broken references for ABI file
  staging: android: ion: Bail out upon SIGKILL when allocating memory.
  Staging: fbtft: Fix GPIO handling
  staging: unisys: visornic: Update the description of 'poll_for_irq()'
  staging: wilc1000: flush the workqueue before deinit the host
  staging: gasket: apex: fix copy-paste typo
  Staging: fbtft: Fix reset assertion when using gpio descriptor
  Staging: fbtft: Fix probing of gpio descriptor
  iio: imu: mpu6050: add missing available scan masks
  iio: cros_ec_accel_legacy: Fix incorrect channel setting
  IIO: Ingenic JZ47xx: Set clock divider on probe
  iio: adc: max9611: Fix misuse of GENMASK macro

Merge tag 'usb-5.3-rc4' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
"Here are some small USB fixes for 5.3-rc4.

  The "biggest" one here is moving code from one file to another in
  order to fix a long-standing race condition with the creation of sysfs
  files for USB devices. Turns out that there are now userspace tools
  out there that are hitting this long-known bug, so it's time to fix
  them. Thankfully the tool-maker in this case fixed the issue :)

  The other patches in here are all fixes for reported issues. Now that
  syzbot knows how to fuzz USB drivers better, and is starting to now
  fuzz the userspace facing side of them at the same time, there will be
  more and more small fixes like these coming, which is a good thing.

  All of these have been in linux-next with no reported issues"

* tag 'usb-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: setup authorized_default attributes using usb_bus_notify
  usb: iowarrior: fix deadlock on disconnect
  Revert "USB: rio500: simplify locking"
  usb: usbfs: fix double-free of usb memory upon submiturb error
  usb: yurex: Fix use-after-free in yurex_delete
  usb: typec: tcpm: Ignore unsupported/unknown alternate mode requests
  xhci: Fix NULL pointer dereference at endpoint zero reset.
  usb: host: xhci-rcar: Fix timeout in xhci_suspend()
  usb: typec: ucsi: ccg: Fix uninitilized symbol error
  usb: typec: tcpm: remove tcpm dir if no children
  usb: typec: tcpm: free log buf memory when remove debug file
  usb: typec: tcpm: Add NULL check before dereferencing config

Merge tag 'pinctrl-v5.3-2' of git://git./linux/kernel/git/linusw/linux-pinctrl

Pull pin control fixes from Linus Walleij:

- Delay acquisition of regmaps in the Aspeed G5 driver.

- Make a symbol static to reduce compiler noise.

* tag 'pinctrl-v5.3-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: aspeed: Make aspeed_pinmux_ips static
pinctrl: aspeed-g5: Delay acquisition of regmaps

Merge tag 'powerpc-5.3-4' of git://git./linux/kernel/git/powerpc/linux

Pull powerpc fix from Michael Ellerman:
"Just one fix, a revert of a commit that was meant to be a minor
  improvement to some inline asm, but ended up having no real benefit
  with GCC and broke booting 32-bit machines when using Clang.

  Thanks to: Arnd Bergmann, Christophe Leroy, Nathan Chancellor, Nick
  Desaulniers, Segher Boessenkool"

* tag 'powerpc-5.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  Revert "powerpc: slightly improve cache helpers"

Merge tag 'Wimplicit-fallthrough-5.3-rc4' of git://git./linux/kernel/git/gustavoars/linux

Pull fall-through fixes from Gustavo A. R. Silva:
"Mark more switch cases where we are expecting to fall through, fixing
  fall-through warnings in arm, sparc64, mips, i386 and s390"

* tag 'Wimplicit-fallthrough-5.3-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  ARM: ep93xx: Mark expected switch fall-through
  scsi: fas216: Mark expected switch fall-throughs
  pcmcia: db1xxx_ss: Mark expected switch fall-throughs
  video: fbdev: omapfb_main: Mark expected switch fall-throughs
  watchdog: riowd: Mark expected switch fall-through
  s390/net: Mark expected switch fall-throughs
  crypto: ux500/crypt: Mark expected switch fall-throughs
  watchdog: wdt977: Mark expected switch fall-through
  watchdog: scx200_wdt: Mark expected switch fall-through
  watchdog: Mark expected switch fall-throughs
  ARM: signal: Mark expected switch fall-through
  mfd: omap-usb-host: Mark expected switch fall-throughs
  mfd: db8500-prcmu: Mark expected switch fall-throughs
  ARM: OMAP: dma: Mark expected switch fall-throughs
  ARM: alignment: Mark expected switch fall-throughs
  ARM: tegra: Mark expected switch fall-through
  ARM/hw_breakpoint: Mark expected switch fall-throughs

Merge tag 'kbuild-fixes-v5.3-3' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fixes from Masahiro Yamada:

- revive single target %.ko

- do not create built-in.a where it is unneeded

- do not create modules.order where it is unneeded

- show a warning if subdir-y/m is used to visit a module Makefile

* tag 'kbuild-fixes-v5.3-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: show hint if subdir-y/m is used to visit module Makefile
  kbuild: generate modules.order only in directories visited by obj-y/m
  kbuild: fix false-positive need-builtin calculation
  kbuild: revive single target %.ko

ARM: ep93xx: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

Fix the following warnings (Building: arm-ep93xx_defconfig arm):

arch/arm/mach-ep93xx/crunch.c: In function 'crunch_do':
arch/arm/mach-ep93xx/crunch.c:46:3: warning: this statement may
fall through [-Wimplicit-fallthrough=]
      memset(crunch_state, 0, sizeof(*crunch_state));
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   arch/arm/mach-ep93xx/crunch.c:53:2: note: here
     case THREAD_NOTIFY_EXIT:
     ^~~~

Notice that, in this particular case, the code comment is
modified in accordance with what GCC is expecting to find.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

scsi: fas216: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

Fix the following warnings (Building: rpc_defconfig arm):

drivers/scsi/arm/fas216.c: In function ‘fas216_disconnect_intr’:
drivers/scsi/arm/fas216.c:913:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (fas216_get_last_msg(info, info->scsi.msgin_fifo) == ABORT) {
      ^
drivers/scsi/arm/fas216.c:919:2: note: here
  default:    /* huh?     */
  ^~~~~~~
drivers/scsi/arm/fas216.c: In function ‘fas216_kick’:
drivers/scsi/arm/fas216.c:1959:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   fas216_allocate_tag(info, SCpnt);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/arm/fas216.c:1960:2: note: here
  case TYPE_OTHER:
  ^~~~
drivers/scsi/arm/fas216.c: In function ‘fas216_busservice_intr’:
drivers/scsi/arm/fas216.c:1413:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   fas216_stoptransfer(info);
   ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/arm/fas216.c:1414:2: note: here
  case STATE(STAT_STATUS, PHASE_SELSTEPS):/* Sel w/ steps -> Status       */
  ^~~~
drivers/scsi/arm/fas216.c:1424:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   fas216_stoptransfer(info);
   ^~~~~~~~~~~~~~~~~~~~~~~~~
drivers/scsi/arm/fas216.c:1425:2: note: here
  case STATE(STAT_MESGIN, PHASE_COMMAND): /* Command -> Message In */
  ^~~~
drivers/scsi/arm/fas216.c: In function ‘fas216_funcdone_intr’:
drivers/scsi/arm/fas216.c:1573:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if ((stat & STAT_BUSMASK) == STAT_MESGIN) {
      ^
drivers/scsi/arm/fas216.c:1579:2: note: here
  default:
  ^~~~~~~
drivers/scsi/arm/fas216.c: In function ‘fas216_handlesync’:
drivers/scsi/arm/fas216.c:605:20: warning: this statement may fall through [-Wimplicit-fallthrough=]
   info->scsi.phase = PHASE_MSGOUT_EXPECT;
   ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~
drivers/scsi/arm/fas216.c:607:2: note: here
  case async:
  ^~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

pcmcia: db1xxx_ss: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings (Building: db1xxx_defconfig mips):

drivers/pcmcia/db1xxx_ss.c:257:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/pcmcia/db1xxx_ss.c:269:3: warning: this statement may fall through [-Wimplicit-fallthrough=]

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

video: fbdev: omapfb_main: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: omap1_defconfig arm):

drivers/watchdog/wdt285.c:170:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/watchdog/ar7_wdt.c:237:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:449:23: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1549:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1547:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1545:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1543:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1540:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1538:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
drivers/video/fbdev/omap/omapfb_main.c:1535:3: warning: this statement may fall through [-Wimplicit-fallthrough=]

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

watchdog: riowd: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings (Building: sparc64):

drivers/watchdog/riowd.c: In function ‘riowd_ioctl’:
drivers/watchdog/riowd.c:136:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   riowd_writereg(p, riowd_timeout, WDTO_INDEX);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/watchdog/riowd.c:139:2: note: here
  case WDIOC_GETTIMEOUT:
  ^~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

s390/net: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings (Building: s390):

drivers/s390/net/ctcm_fsms.c: In function ‘ctcmpc_chx_attnbusy’:
drivers/s390/net/ctcm_fsms.c:1703:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (grp->changed_side == 1) {
      ^
drivers/s390/net/ctcm_fsms.c:1707:2: note: here
  case MPCG_STATE_XID0IOWAIX:
  ^~~~

drivers/s390/net/ctcm_mpc.c: In function ‘ctc_mpc_alloc_channel’:
drivers/s390/net/ctcm_mpc.c:358:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (callback)
      ^
drivers/s390/net/ctcm_mpc.c:360:2: note: here
  case MPCG_STATE_XID0IOWAIT:
  ^~~~

drivers/s390/net/ctcm_mpc.c: In function ‘mpc_action_timeout’:
drivers/s390/net/ctcm_mpc.c:1469:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if ((fsm_getstate(rch->fsm) == CH_XID0_PENDING) &&
      ^
drivers/s390/net/ctcm_mpc.c:1472:2: note: here
  default:
  ^~~~~~~
drivers/s390/net/ctcm_mpc.c: In function ‘mpc_send_qllc_discontact’:
drivers/s390/net/ctcm_mpc.c:2087:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (grp->estconnfunc) {
      ^
drivers/s390/net/ctcm_mpc.c:2092:2: note: here
  case MPCG_STATE_FLOWC:
  ^~~~

drivers/s390/net/qeth_l2_main.c: In function ‘qeth_l2_process_inbound_buffer’:
drivers/s390/net/qeth_l2_main.c:328:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (IS_OSN(card)) {
       ^
drivers/s390/net/qeth_l2_main.c:337:3: note: here
   default:
   ^~~~~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

crypto: ux500/crypt: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: arm):

drivers/crypto/ux500/cryp/cryp.c: In function ‘cryp_save_device_context’:
drivers/crypto/ux500/cryp/cryp.c:316:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
   ctx->key_4_r = readl_relaxed(&src_reg->key_4_r);
drivers/crypto/ux500/cryp/cryp.c:318:2: note: here
  case CRYP_KEY_SIZE_192:
  ^~~~
drivers/crypto/ux500/cryp/cryp.c:320:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
   ctx->key_3_r = readl_relaxed(&src_reg->key_3_r);
drivers/crypto/ux500/cryp/cryp.c:322:2: note: here
  case CRYP_KEY_SIZE_128:
  ^~~~
drivers/crypto/ux500/cryp/cryp.c:324:16: warning: this statement may fall through [-Wimplicit-fallthrough=]
   ctx->key_2_r = readl_relaxed(&src_reg->key_2_r);
drivers/crypto/ux500/cryp/cryp.c:326:2: note: here
  default:
  ^~~~~~~
In file included from ./include/linux/io.h:13:0,
                 from drivers/crypto/ux500/cryp/cryp_p.h:14,
                 from drivers/crypto/ux500/cryp/cryp.c:15:
drivers/crypto/ux500/cryp/cryp.c: In function ‘cryp_restore_device_context’:
./arch/arm/include/asm/io.h:92:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
#define __raw_writel __raw_writel
                      ^
./arch/arm/include/asm/io.h:299:29: note: in expansion of macro ‘__raw_writel’
#define writel_relaxed(v,c) __raw_writel((__force u32) cpu_to_le32(v),c)
                             ^~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:363:3: note: in expansion of macro ‘writel_relaxed’
   writel_relaxed(ctx->key_4_r, &reg->key_4_r);
   ^~~~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:365:2: note: here
  case CRYP_KEY_SIZE_192:
  ^~~~
In file included from ./include/linux/io.h:13:0,
                 from drivers/crypto/ux500/cryp/cryp_p.h:14,
                 from drivers/crypto/ux500/cryp/cryp.c:15:
./arch/arm/include/asm/io.h:92:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
#define __raw_writel __raw_writel
                      ^
./arch/arm/include/asm/io.h:299:29: note: in expansion of macro ‘__raw_writel’
#define writel_relaxed(v,c) __raw_writel((__force u32) cpu_to_le32(v),c)
                             ^~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:367:3: note: in expansion of macro ‘writel_relaxed’
   writel_relaxed(ctx->key_3_r, &reg->key_3_r);
   ^~~~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:369:2: note: here
  case CRYP_KEY_SIZE_128:
  ^~~~
In file included from ./include/linux/io.h:13:0,
                 from drivers/crypto/ux500/cryp/cryp_p.h:14,
                 from drivers/crypto/ux500/cryp/cryp.c:15:
./arch/arm/include/asm/io.h:92:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
#define __raw_writel __raw_writel
                      ^
./arch/arm/include/asm/io.h:299:29: note: in expansion of macro ‘__raw_writel’
#define writel_relaxed(v,c) __raw_writel((__force u32) cpu_to_le32(v),c)
                             ^~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:371:3: note: in expansion of macro ‘writel_relaxed’
   writel_relaxed(ctx->key_2_r, &reg->key_2_r);
   ^~~~~~~~~~~~~~
drivers/crypto/ux500/cryp/cryp.c:373:2: note: here
  default:
  ^~~~~~~

Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

watchdog: wdt977: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: arm):

drivers/watchdog/wdt977.c: In function ‘wdt977_ioctl’:
  LD [M]  drivers/media/platform/vicodec/vicodec.o
drivers/watchdog/wdt977.c:400:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   wdt977_keepalive();
   ^~~~~~~~~~~~~~~~~~
drivers/watchdog/wdt977.c:403:2: note: here
  case WDIOC_GETTIMEOUT:
  ^~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

watchdog: scx200_wdt: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning (Building: i386):

drivers/watchdog/scx200_wdt.c: In function ‘scx200_wdt_ioctl’:
drivers/watchdog/scx200_wdt.c:188:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   scx200_wdt_ping();
   ^~~~~~~~~~~~~~~~~
drivers/watchdog/scx200_wdt.c:189:2: note: here
  case WDIOC_GETTIMEOUT:
  ^~~~

Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

watchdog: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

drivers/watchdog/ar7_wdt.c: warning: this statement may fall
through [-Wimplicit-fallthrough=]:  => 237:3
drivers/watchdog/pcwd.c: warning: this statement may fall
through [-Wimplicit-fallthrough=]:  => 653:3
drivers/watchdog/sb_wdog.c: warning: this statement may fall
through [-Wimplicit-fallthrough=]:  => 204:3
drivers/watchdog/wdt.c: warning: this statement may fall
through [-Wimplicit-fallthrough=]:  => 391:3

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

ARM: signal: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning:

arch/arm/kernel/signal.c: In function 'do_signal':
arch/arm/kernel/signal.c:598:12: warning: this statement may fall through [-Wimplicit-fallthrough=]
    restart -= 2;
    ~~~~~~~~^~~~
arch/arm/kernel/signal.c:599:3: note: here
   case -ERESTARTNOHAND:
   ^~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

mfd: omap-usb-host: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

drivers/mfd/omap-usb-host.c: In function 'usbhs_runtime_resume':
drivers/mfd/omap-usb-host.c:303:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (!IS_ERR(omap->hsic480m_clk[i])) {
       ^
drivers/mfd/omap-usb-host.c:313:3: note: here
   case OMAP_EHCI_PORT_MODE_TLL:
   ^~~~
drivers/mfd/omap-usb-host.c: In function 'usbhs_runtime_suspend':
drivers/mfd/omap-usb-host.c:345:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
    if (!IS_ERR(omap->hsic480m_clk[i]))
       ^
drivers/mfd/omap-usb-host.c:349:3: note: here
   case OMAP_EHCI_PORT_MODE_TLL:
   ^~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

mfd: db8500-prcmu: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

drivers/mfd/db8500-prcmu.c: In function 'dsiclk_rate':
drivers/mfd/db8500-prcmu.c:1592:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
   div *= 2;
   ~~~~^~~~
drivers/mfd/db8500-prcmu.c:1593:2: note: here
  case PRCM_DSI_PLLOUT_SEL_PHI_2:
  ^~~~
drivers/mfd/db8500-prcmu.c:1594:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
   div *= 2;
   ~~~~^~~~
drivers/mfd/db8500-prcmu.c:1595:2: note: here
  case PRCM_DSI_PLLOUT_SEL_PHI:
  ^~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

ARM: OMAP: dma: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

arch/arm/plat-omap/dma.c: In function 'omap_set_dma_src_burst_mode':
arch/arm/plat-omap/dma.c:384:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (dma_omap2plus()) {
      ^
arch/arm/plat-omap/dma.c:393:2: note: here
  case OMAP_DMA_DATA_BURST_16:
  ^~~~
arch/arm/plat-omap/dma.c:394:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (dma_omap2plus()) {
      ^
arch/arm/plat-omap/dma.c:402:2: note: here
  default:
  ^~~~~~~
arch/arm/plat-omap/dma.c: In function 'omap_set_dma_dest_burst_mode':
arch/arm/plat-omap/dma.c:473:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (dma_omap2plus()) {
      ^
arch/arm/plat-omap/dma.c:481:2: note: here
  default:
  ^~~~~~~

Notice that, in this particular case, the code comment is
modified in accordance with what GCC is expecting to find.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

ARM: alignment: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

arch/arm/mm/alignment.c: In function 'thumb2arm':
arch/arm/mm/alignment.c:688:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if ((tinstr & (3 << 9)) == 0x0400) {
      ^
arch/arm/mm/alignment.c:700:2: note: here
  default:
  ^~~~~~~
arch/arm/mm/alignment.c: In function 'do_alignment_t32_to_handler':
arch/arm/mm/alignment.c:753:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
   poffset->un = (tinst2 & 0xff) << 2;
   ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
arch/arm/mm/alignment.c:754:2: note: here
  case 0xe940:
  ^~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

ARM: tegra: Mark expected switch fall-through

Mark switch cases where we are expecting to fall through.

This patch fixes the following warning:

arch/arm/mach-tegra/reset.c: In function 'tegra_cpu_reset_handler_enable':
arch/arm/mach-tegra/reset.c:72:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
   tegra_cpu_reset_handler_set(reset_address);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/arm/mach-tegra/reset.c:74:2: note: here
  case 0:
  ^~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

ARM/hw_breakpoint: Mark expected switch fall-throughs

Mark switch cases where we are expecting to fall through.

This patch fixes the following warnings:

arch/arm/kernel/hw_breakpoint.c: In function 'hw_breakpoint_arch_parse':
arch/arm/kernel/hw_breakpoint.c:609:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (hw->ctrl.len == ARM_BREAKPOINT_LEN_2)
      ^
arch/arm/kernel/hw_breakpoint.c:611:2: note: here
  case 3:
  ^~~~
arch/arm/kernel/hw_breakpoint.c:613:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (hw->ctrl.len == ARM_BREAKPOINT_LEN_1)
      ^
arch/arm/kernel/hw_breakpoint.c:615:2: note: here
  default:
  ^~~~~~~
arch/arm/kernel/hw_breakpoint.c: In function 'arch_build_bp_info':
arch/arm/kernel/hw_breakpoint.c:544:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if ((hw->ctrl.type != ARM_BREAKPOINT_EXECUTE)
      ^
arch/arm/kernel/hw_breakpoint.c:547:2: note: here
  default:
  ^~~~~~~
In file included from include/linux/kernel.h:11,
                 from include/linux/list.h:9,
                 from include/linux/preempt.h:11,
                 from include/linux/hardirq.h:5,
                 from arch/arm/kernel/hw_breakpoint.c:16:
arch/arm/kernel/hw_breakpoint.c: In function 'hw_breakpoint_pending':
include/linux/compiler.h:78:22: warning: this statement may fall through [-Wimplicit-fallthrough=]
# define unlikely(x) __builtin_expect(!!(x), 0)
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/bug.h:136:2: note: in expansion of macro 'unlikely'
  unlikely(__ret_warn_on);     \
  ^~~~~~~~
arch/arm/kernel/hw_breakpoint.c:863:3: note: in expansion of macro 'WARN'
   WARN(1, "Asynchronous watchpoint exception taken. Debugging results may be unreliable\n");
   ^~~~
arch/arm/kernel/hw_breakpoint.c:864:2: note: here
  case ARM_ENTRY_SYNC_WATCHPOINT:
  ^~~~
arch/arm/kernel/hw_breakpoint.c: In function 'core_has_os_save_restore':
arch/arm/kernel/hw_breakpoint.c:910:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
   if (oslsr & ARM_OSLSR_OSLM0)
      ^
arch/arm/kernel/hw_breakpoint.c:912:2: note: here
  default:
  ^~~~~~~

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
"Bugfixes (arm and x86) and cleanups"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: kvm: Adding config fragments
  KVM: selftests: Update gitignore file for latest changes
  kvm: remove unnecessary PageReserved check
  KVM: arm/arm64: vgic: Reevaluate level sensitive interrupts on enable
  KVM: arm: Don't write junk to CP15 registers on reset
  KVM: arm64: Don't write junk to sysregs on reset
  KVM: arm/arm64: Sync ICH_VMCR_EL2 back when about to block
  x86: kvm: remove useless calls to kvm_para_available
  KVM: no need to check return value of debugfs_create functions
  KVM: remove kvm_arch_has_vcpu_debugfs()
  KVM: Fix leak vCPU's VMCS value into other pCPU
  KVM: Check preempted_in_kernel for involuntary preemption
  KVM: LAPIC: Don't need to wakeup vCPU twice afer timer fire
  arm64: KVM: hyp: debug-sr: Mark expected switch fall-through
  KVM: arm64: Update kvm_arm_exception_class and esr_class_str for new EC
  KVM: arm: vgic-v3: Mark expected switch fall-through
  arm64: KVM: regmap: Fix unexpected switch fall-through
  KVM: arm/arm64: Introduce kvm_pmu_vcpu_init() to setup PMU counter index

Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input updates from Dmitry Torokhov:

- newer systems with Elan touchpads will be switched over to SMBus

- HP Spectre X360 will be using SMbus/RMI4

- checks for invalid USB descriptors in kbtab and iforce

- build fixes for applespi driver (misconfigs)

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: iforce - add sanity checks
  Input: applespi - use struct_size() helper
  Input: kbtab - sanity check for endpoint type
  Input: usbtouchscreen - initialize PM mutex before using it
  Input: applespi - add dependency on LEDS_CLASS
  Input: synaptics - enable RMI mode for HP Spectre X360
  Input: elantech - annotate fall-through case in elantech_use_host_notify()
  Input: elantech - enable SMBus on new (2018+) systems
  Input: applespi - fix trivial typo in struct description
  Input: applespi - select CRC16 module
  Input: applespi - fix warnings detected by sparse