Scott Wood [Fri, 12 Apr 2013 14:08:42 +0000 (14:08 +0000)]
kvm: add device control API
Currently, devices that are emulated inside KVM are configured in a
hardcoded manner based on an assumption that any given architecture
only has one way to do it. If there's any need to access device state,
it is done through inflexible one-purpose-only IOCTLs (e.g.
KVM_GET/SET_LAPIC). Defining new IOCTLs for every little thing is
cumbersome and depletes a limited numberspace.
This API provides a mechanism to instantiate a device of a certain
type, returning an ID that can be used to set/get attributes of the
device. Attributes may include configuration parameters (e.g.
register base address), device state, operational commands, etc. It
is similar to the ONE_REG API, except that it acts on devices rather
than vcpus.
Both device types and individual attributes can be tested without having
to create the device or get/set the attribute, without the need for
separately managing enumerated capabilities.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Tue, 16 Apr 2013 10:12:49 +0000 (12:12 +0200)]
KVM: Move irqfd resample cap handling to generic code
Now that we have most irqfd code completely platform agnostic, let's move
irqfd's resample capability return to generic code as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 21:23:21 +0000 (23:23 +0200)]
KVM: Move irq routing setup to irqchip.c
Setting up IRQ routes is nothing IOAPIC specific. Extract everything
that really is generic code into irqchip.c and only leave the ioapic
specific bits to irq_comm.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 21:04:10 +0000 (23:04 +0200)]
KVM: Extract generic irqchip logic into irqchip.c
The current irq_comm.c file contains pieces of code that are generic
across different irqchip implementations, as well as code that is
fully IOAPIC specific.
Split the generic bits out into irqchip.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 19:12:53 +0000 (21:12 +0200)]
KVM: Move irq routing to generic code
The IRQ routing set ioctl lives in the hacky device assignment code inside
of KVM today. This is definitely the wrong place for it. Move it to the much
more natural kvm_main.c.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 08:50:54 +0000 (10:50 +0200)]
KVM: Remove kvm_get_intr_delivery_bitmask
The prototype has been stale for a while, I can't spot any real function
define behind it. Let's just remove it.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 08:49:31 +0000 (10:49 +0200)]
KVM: Drop __KVM_HAVE_IOAPIC condition on irq routing
We have a capability enquire system that allows user space to ask kvm
whether a feature is available.
The point behind this system is that we can have different kernel
configurations with different capabilities and user space can adjust
accordingly.
Because features can always be non existent, we can drop any #ifdefs
on CAP defines that could be used generically, like the irq routing
bits. These can be easily reused for non-IOAPIC systems as well.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Wed, 17 Apr 2013 11:29:30 +0000 (13:29 +0200)]
KVM: Introduce CONFIG_HAVE_KVM_IRQ_ROUTING
Quite a bit of code in KVM has been conditionalized on availability of
IOAPIC emulation. However, most of it is generically applicable to
platforms that don't have an IOPIC, but a different type of irq chip.
Make code that only relies on IRQ routing, not an APIC itself, on
CONFIG_HAVE_KVM_IRQ_ROUTING, so that we can reuse it later.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Alexander Graf [Mon, 15 Apr 2013 08:42:33 +0000 (10:42 +0200)]
KVM: Add KVM_IRQCHIP_NUM_PINS in addition to KVM_IOAPIC_NUM_PINS
The concept of routing interrupt lines to an irqchip is nothing
that is IOAPIC specific. Every irqchip has a maximum number of pins
that can be linked to irq lines.
So let's add a new define that allows us to reuse generic code for
non-IOAPIC platforms.
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Paul Mackerras [Thu, 18 Apr 2013 19:51:04 +0000 (19:51 +0000)]
KVM: PPC: Book3S HV: Report VPA and DTL modifications in dirty map
At present, the KVM_GET_DIRTY_LOG ioctl doesn't report modifications
done by the host to the virtual processor areas (VPAs) and dispatch
trace logs (DTLs) registered by the guest. This is because those
modifications are done either in real mode or in the host kernel
context, and in neither case does the access go through the guest's
HPT, and thus no change (C) bit gets set in the guest's HPT.
However, the changes done by the host do need to be tracked so that
the modified pages get transferred when doing live migration. In
order to track these modifications, this adds a dirty flag to the
struct representing the VPA/DTL areas, and arranges to set the flag
when the VPA/DTL gets modified by the host. Then, when we are
collecting the dirty log, we also check the dirty flags for the
VPA and DTL for each vcpu and set the relevant bit in the dirty log
if necessary. Doing this also means we now need to keep track of
the guest physical address of the VPA/DTL areas.
So as not to lose track of modifications to a VPA/DTL area when it gets
unregistered, or when a new area gets registered in its place, we need
to transfer the dirty state to the rmap chain. This adds code to
kvmppc_unpin_guest_page() to do that if the area was dirty. To simplify
that code, we now require that all VPA, DTL and SLB shadow buffer areas
fit within a single host page. Guests already comply with this
requirement because pHyp requires that these areas not cross a 4k
boundary.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Thu, 18 Apr 2013 19:50:24 +0000 (19:50 +0000)]
KVM: PPC: Book3S HV: Make HPT reading code notice R/C bit changes
At present, the code that determines whether a HPT entry has changed,
and thus needs to be sent to userspace when it is copying the HPT,
doesn't consider a hardware update to the reference and change bits
(R and C) in the HPT entries to constitute a change that needs to
be sent to userspace. This adds code to check for changes in R and C
when we are scanning the HPT to find changed entries, and adds code
to set the changed flag for the HPTE when we update the R and C bits
in the guest view of the HPTE.
Since we now need to set the HPTE changed flag in book3s_64_mmu_hv.c
as well as book3s_hv_rm_mmu.c, we move the note_hpte_modification()
function into kvm_book3s_64.h.
Current Linux guest kernels don't use the hardware updates of R and C
in the HPT, so this change won't affect them. Linux (or other) kernels
might in future want to use the R and C bits and have them correctly
transferred across when a guest is migrated, so it is better to correct
this deficiency.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:14 +0000 (00:03 +0000)]
KVM: PPC: e500: Add e6500 core to Kconfig description
Add e6500 core to Kconfig description.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:13 +0000 (00:03 +0000)]
KVM: PPC: e500mc: Enable e6500 cores
Extend processor compatibility names to e6500 cores.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:12 +0000 (00:03 +0000)]
KVM: PPC: e500: Remove E.PT and E.HV.LRAT categories from VCPUs
Embedded.Page Table (E.PT) category is not supported yet in e6500 kernel.
Configure TLBnCFG to remove E.PT and E.HV.LRAT categories from VCPUs.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:11 +0000 (00:03 +0000)]
KVM: PPC: e500: Add support for EPTCFG register
EPTCFG register defined by E.PT is accessed unconditionally by Linux guests
in the presence of MAV 2.0. Emulate it now.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:10 +0000 (00:03 +0000)]
KVM: PPC: e500: Add support for TLBnPS registers
Add support for TLBnPS registers available in MMU Architecture Version
(MAV) 2.0.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:09 +0000 (00:03 +0000)]
KVM: PPC: e500: Move vcpu's MMU configuration to dedicated functions
Vcpu's MMU default configuration and geometry update logic was buried in
a chunk of code. Move them to dedicated functions to add more clarity.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:08 +0000 (00:03 +0000)]
KVM: PPC: e500: Expose MMU registers via ONE_REG
MMU registers were exposed to user-space using sregs interface. Add them
to ONE_REG interface using kvmppc_get_one_reg/kvmppc_set_one_reg delegation
mechanism.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Mihai Caraman [Thu, 11 Apr 2013 00:03:07 +0000 (00:03 +0000)]
KVM: PPC: Book3E: Refactor ONE_REG ioctl implementation
Refactor Book3E ONE_REG ioctl implementation to use kvmppc_get_one_reg/
kvmppc_set_one_reg delegation interface introduced by Book3S. This is
necessary for MMU SPRs which are platform specifics.
Get rid of useless case braces in the process.
Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Mon, 8 Apr 2013 00:32:15 +0000 (00:32 +0000)]
booke: exit to user space if emulator request
This allows the exit to user space if emulator request by returning
EMULATE_EXIT_USER. This will be used in subsequent patches in list
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Mon, 8 Apr 2013 00:32:14 +0000 (00:32 +0000)]
KVM: extend EMULATE_EXIT_USER to support different exit reasons
Currently the instruction emulator code returns EMULATE_EXIT_USER
and common code initializes the "run->exit_reason = .." and
"vcpu->arch.hcall_needed = .." with one fixed reason.
But there can be different reasons when emulator need to exit
to user space. To support that the "run->exit_reason = .."
and "vcpu->arch.hcall_needed = .." initialization is moved a
level up to emulator.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Mon, 8 Apr 2013 00:32:13 +0000 (00:32 +0000)]
Rename EMULATE_DO_PAPR to EMULATE_EXIT_USER
Instruction emulation return EMULATE_DO_PAPR when it requires
exit to userspace on book3s. Similar return is required
for booke. EMULATE_DO_PAPR reads out to be confusing so it is
renamed to EMULATE_EXIT_USER.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Mon, 8 Apr 2013 00:32:12 +0000 (00:32 +0000)]
KVM: PPC: debug stub interface parameter defined
This patch defines the interface parameter for KVM_SET_GUEST_DEBUG
ioctl support. Follow up patches will use this for setting up
hardware breakpoints, watchpoints and software breakpoints.
Also kvm_arch_vcpu_ioctl_set_guest_debug() is brought one level below.
This is because I am not sure what is required for book3s. So this ioctl
behaviour will not change for book3s.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Thu, 25 Apr 2013 06:33:57 +0000 (06:33 +0000)]
KVM: PPC: cache flush for kernel managed pages
Kernel can only access pages which maps as memory.
So flush only the valid kernel pages.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Gleb Natapov [Wed, 24 Apr 2013 10:38:36 +0000 (13:38 +0300)]
KVM: X86 emulator: fix source operand decoding for 8bit mov[zs]x instructions
Source operand for one byte mov[zs]x is decoded incorrectly if it is in
high byte register. Fix that.
Cc: stable@vger.kernel.org
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:44:54 +0000 (12:44 +0200)]
KVM: nVMX: VM_ENTRY/EXIT_LOAD_IA32_EFER overrides EFER.LMA settings
If we load the complete EFER MSR on entry or exit, EFER.LMA (and LME)
loading is skipped. Their consistency is already checked now before
starting the transition.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sat, 20 Apr 2013 08:52:36 +0000 (10:52 +0200)]
KVM: nVMX: Validate EFER values for VM_ENTRY/EXIT_LOAD_IA32_EFER
As we may emulate the loading of EFER on VM-entry and VM-exit, implement
the checks that VMX performs on the guest and host values on vmlaunch/
vmresume. Factor out kvm_valid_efer for this purpose which checks for
set reserved bits.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 19:04:26 +0000 (21:04 +0200)]
KVM: nVMX: Fix conditions for NMI injection
The logic for checking if interrupts can be injected has to be applied
also on NMIs. The difference is that if NMI interception is on these
events are consumed and blocked by the VM exit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:12:47 +0000 (12:12 +0200)]
KVM: VMX: Move vmx_nmi_allowed after vmx_set_nmi_mask
vmx_set_nmi_mask will soon be used by vmx_nmi_allowed. No functional
changes.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Andrew Honig [Thu, 18 Apr 2013 16:38:14 +0000 (09:38 -0700)]
KVM: x86: Fix memory leak in vmx.c
If userspace creates and destroys multiple VMs within the same process
we leak 20k of memory in the userspace process context per VM. This
patch frees the memory in kvm_arch_destroy_vm. If the process exits
without closing the VM file descriptor or the file descriptor has been
shared with another process then we don't free the memory.
It's still possible for a user space process to leak memory if the last
process to close the fd for the VM is not the process that created it.
However, this is an unexpected case that's only caused by a user space
process that's misbehaving.
Signed-off-by: Andrew Honig <ahonig@google.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Wei Yongjun [Wed, 17 Apr 2013 23:41:00 +0000 (07:41 +0800)]
KVM: x86: fix error return code in kvm_arch_vcpu_init()
Fix to return a negative error code from the error handling
case instead of 0, as returned elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:39:55 +0000 (14:39 +0300)]
KVM: nVMX: Enable and disable shadow vmcs functionality
Once L1 loads VMCS12 we enable shadow-vmcs capability and copy all the VMCS12
shadowed fields to the shadow vmcs. When we release the VMCS12, we also
disable shadow-vmcs capability.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:39:25 +0000 (14:39 +0300)]
KVM: nVMX: Synchronize VMCS12 content with the shadow vmcs
Synchronize between the VMCS12 software controlled structure and the
processor-specific shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:38:55 +0000 (14:38 +0300)]
KVM: nVMX: Copy VMCS12 to processor-specific shadow vmcs
Introduce a function used to copy fields from the software controlled VMCS12
to the processor-specific shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:38:25 +0000 (14:38 +0300)]
KVM: nVMX: Copy processor-specific shadow-vmcs to VMCS12
Introduce a function used to copy fields from the processor-specific shadow
vmcs to the software controlled VMCS12
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:37:55 +0000 (14:37 +0300)]
KVM: nVMX: Release shadow vmcs
Unmap vmcs12 and release the corresponding shadow vmcs
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:37:25 +0000 (14:37 +0300)]
KVM: nVMX: Allocate shadow vmcs
Allocate a shadow vmcs used by the processor to shadow part of the fields
stored in the software defined VMCS12 (let L1 access fields without causing
exits). Note we keep a shadow vmcs only for the current vmcs12. Once a vmcs12
becomes non-current, its shadow vmcs is released.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:36:55 +0000 (14:36 +0300)]
KVM: nVMX: Fix VMXON emulation
handle_vmon doesn't check if L1 is already in root mode (VMXON
was previously called). This patch adds this missing check and calls
nested_vmx_failValid if VMX is already ON.
We need this check because L0 will allocate the shadow vmcs when L1
executes VMXON and we want to avoid host leaks (due to shadow vmcs
allocation) if L1 executes VMXON repeatedly.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:36:25 +0000 (14:36 +0300)]
KVM: nVMX: Refactor handle_vmwrite
Refactor existent code so we re-use vmcs12_write_any to copy fields from the
shadow vmcs specified by the link pointer (used by the processor,
implementation-specific) to the VMCS12 software format used by L0 to hold
the fields in L1 memory address space.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:35:55 +0000 (14:35 +0300)]
KVM: nVMX: Introduce vmread and vmwrite bitmaps
Prepare vmread and vmwrite bitmaps according to a pre-specified list of fields.
These lists are intended to specifiy most frequent accessed fields so we can
minimize the number of fields that are copied from/to the software controlled
VMCS12 format to/from to processor-specific shadow vmcs. The lists were built
measuring the VMCS fields access rate after L2 Ubuntu 12.04 booted when it was
running on top of L1 KVM, also Ubuntu 12.04. Note that during boot there were
additional fields which were frequently modified but they were not added to
these lists because after boot these fields were not longer accessed by L1.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:35:25 +0000 (14:35 +0300)]
KVM: nVMX: Detect shadow-vmcs capability
Add logic required to detect if shadow-vmcs is supported by the
processor. Introduce a new kernel module parameter to specify if L0 should use
shadow vmcs (or not) to run L1.
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Abel Gordon [Thu, 18 Apr 2013 11:34:55 +0000 (14:34 +0300)]
KVM: nVMX: Shadow-vmcs control fields/bits
Add definitions for all the vmcs control fields/bits
required to enable vmcs-shadowing
Signed-off-by: Abel Gordon <abelg@il.ibm.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Mon, 22 Apr 2013 07:38:15 +0000 (10:38 +0300)]
Merge git://github.com/agraf/linux-2.6.git kvm-ppc-next into queue
Yang Zhang [Wed, 17 Apr 2013 00:46:41 +0000 (08:46 +0800)]
KVM: ia64: Fix kvm_vm_ioctl_irq_line
Fix the compile error with kvm_vm_ioctl_irq_line.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Zhang, Yang Z [Thu, 18 Apr 2013 02:11:54 +0000 (23:11 -0300)]
KVM: x86: Fix posted interrupt with CONFIG_SMP=n
->send_IPI_mask is not defined on UP.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Scott Wood [Mon, 15 Apr 2013 15:07:11 +0000 (15:07 +0000)]
kvm/ppc: don't call complete_mmio_load when it's a store
complete_mmio_load writes back the mmio result into the
destination register. Doing this on a store results in
register corruption.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Stuart Yoder [Tue, 9 Apr 2013 10:36:23 +0000 (10:36 +0000)]
KVM: PPC: emulate dcbst
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Bharat Bhushan [Wed, 20 Mar 2013 20:24:58 +0000 (20:24 +0000)]
Added ONE_REG interface for debug instruction
This patch adds the one_reg interface to get the special instruction
to be used for setting software breakpoint from userspace.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Wed, 17 Apr 2013 13:20:38 +0000 (15:20 +0200)]
Merge commit 'origin/next' into kvm-ppc-next
Gleb Natapov [Sun, 14 Apr 2013 13:07:37 +0000 (16:07 +0300)]
KVM: VMX: Fix check guest state validity if a guest is in VM86 mode
If guest vcpu is in VM86 mode the vcpu state should be checked as if in
real mode.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Paolo Bonzini [Mon, 15 Apr 2013 13:00:27 +0000 (15:00 +0200)]
KVM: nVMX: check vmcs12 for valid activity state
KVM does not use the activity state VMCS field, and does not support
it in nested VMX either (the corresponding bits in the misc VMX feature
MSR are zero). Fail entry if the activity state is set to anything but
"active".
Since the value will always be the same for L1 and L2, we do not need
to read and write the corresponding VMCS field on L1/L2 transitions,
either.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alexander Graf [Tue, 16 Apr 2013 17:21:41 +0000 (19:21 +0200)]
KVM: ARM: Fix kvm_vm_ioctl_irq_line
Commit
aa2fbe6d broke the ARM KVM target by introducing a new parameter
to irq handling functions.
Fix the function prototype to get things compiling again and ignore the
parameter just like we did before
Signed-off-by: Alexander Graf <agraf@suse.de>
Acked-by: Christoffer Dall <cdall@cs.columbia.edu>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:16 +0000 (19:25 +0800)]
KVM: VMX: Use posted interrupt to deliver virtual interrupt
If posted interrupt is avaliable, then uses it to inject virtual
interrupt to guest.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:15 +0000 (19:25 +0800)]
KVM: VMX: Add the deliver posted interrupt algorithm
Only deliver the posted interrupt when target vcpu is running
and there is no previous interrupt pending in pir.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:14 +0000 (19:25 +0800)]
KVM: Set TMR when programming ioapic entry
We already know the trigger mode of a given interrupt when programming
the ioapice entry. So it's not necessary to set it in each interrupt
delivery.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:13 +0000 (19:25 +0800)]
KVM: Call common update function when ioapic entry changed.
Both TMR and EOI exit bitmap need to be updated when ioapic changed
or vcpu's id/ldr/dfr changed. So use common function instead eoi exit
bitmap specific function.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:12 +0000 (19:25 +0800)]
KVM: VMX: Check the posted interrupt capability
Detect the posted interrupt feature. If it exists, then set it in vmcs_config.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:11 +0000 (19:25 +0800)]
KVM: VMX: Register a new IPI for posted interrupt
Posted Interrupt feature requires a special IPI to deliver posted interrupt
to guest. And it should has a high priority so the interrupt will not be
blocked by others.
Normally, the posted interrupt will be consumed by vcpu if target vcpu is
running and transparent to OS. But in some cases, the interrupt will arrive
when target vcpu is scheduled out. And host will see it. So we need to
register a dump handler to handle it.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:25:10 +0000 (19:25 +0800)]
KVM: VMX: Enable acknowledge interupt on vmexit
The "acknowledge interrupt on exit" feature controls processor behavior
for external interrupt acknowledgement. When this control is set, the
processor acknowledges the interrupt controller to acquire the
interrupt vector on VM exit.
After enabling this feature, an interrupt which arrived when target cpu is
running in vmx non-root mode will be handled by vmx handler instead of handler
in idt. Currently, vmx handler only fakes an interrupt stack and jump to idt
table to let real handler to handle it. Further, we will recognize the interrupt
and only delivery the interrupt which not belong to current vcpu through idt table.
The interrupt which belonged to current vcpu will be handled inside vmx handler.
This will reduce the interrupt handle cost of KVM.
Also, interrupt enable logic is changed if this feature is turnning on:
Before this patch, hypervior call local_irq_enable() to enable it directly.
Now IF bit is set on interrupt stack frame, and will be enabled on a return from
interrupt handler if exterrupt interrupt exists. If no external interrupt, still
call local_irq_enable() to enable it.
Refer to Intel SDM volum 3, chapter 33.2.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:41 +0000 (19:21 +0800)]
KVM: Use eoi to track RTC interrupt delivery status
Current interrupt coalescing logci which only used by RTC has conflict
with Posted Interrupt.
This patch introduces a new mechinism to use eoi to track interrupt:
When delivering an interrupt to vcpu, the pending_eoi set to number of
vcpu that received the interrupt. And decrease it when each vcpu writing
eoi. No subsequent RTC interrupt can deliver to vcpu until all vcpus
write eoi.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:40 +0000 (19:21 +0800)]
KVM: Let ioapic know the irq line status
Userspace may deliver RTC interrupt without query the status. So we
want to track RTC EOI for this case.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:39 +0000 (19:21 +0800)]
KVM: Force vmexit with virtual interrupt delivery
Need the EOI to track interrupt deliver status, so force vmexit
on EOI for rtc interrupt when enabling virtual interrupt delivery.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:38 +0000 (19:21 +0800)]
KVM: Add reset/restore rtc_status support
restore rtc_status from migration or save/restore
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:37 +0000 (19:21 +0800)]
KVM: Return destination vcpu on interrupt injection
Add a new parameter to know vcpus who received the interrupt.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:36 +0000 (19:21 +0800)]
KVM: Introduce struct rtc_status
rtc_status is used to track RTC interrupt delivery status. The pending_eoi
will be increased by vcpu who received RTC interrupt and will be decreased
when EOI to this interrupt.
Also, we use dest_map to record the destination vcpu to avoid the case that
vcpu who didn't get the RTC interupt, but issued EOI with same vector of RTC
and descreased pending_eoi by mistake.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Yang Zhang [Thu, 11 Apr 2013 11:21:35 +0000 (19:21 +0800)]
KVM: Add vcpu info to ioapic_update_eoi()
Add vcpu info to ioapic_update_eoi, so we can know which vcpu
issued this EOI.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:12:50 +0000 (12:12 +0200)]
KVM: nVMX: Avoid reading VM_EXIT_INTR_ERROR_CODE needlessly on nested exits
We only need to update vm_exit_intr_error_code if there is a valid exit
interruption information and it comes with a valid error code.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:12:48 +0000 (12:12 +0200)]
KVM: nVMX: Fix conditions for interrupt injection
If we are entering guest mode, we do not want L0 to interrupt this
vmentry with all its side effects on the vmcs. Therefore, injection
shall be disallowed during L1->L2 transitions, as in the previous
version. However, this check is conceptually independent of
nested_exit_on_intr, so decouple it.
If L1 traps external interrupts, we can kick the guest from L2 to L1,
also just like the previous code worked. But we no longer need to
consider L1's idt_vectoring_info_field. It will always be empty at this
point. Instead, if L2 has pending events, those are now found in the
architectural queues and will, thus, prevent vmx_interrupt_allowed from
being called at all.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:12:46 +0000 (12:12 +0200)]
KVM: nVMX: Rework event injection and recovery
The basic idea is to always transfer the pending event injection on
vmexit into the architectural state of the VCPU and then drop it from
there if it turns out that we left L2 to enter L1, i.e. if we enter
prepare_vmcs12.
vmcs12_save_pending_events takes care to transfer pending L0 events into
the queue of L1. That is mandatory as L1 may decide to switch the guest
state completely, invalidating or preserving the pending events for
later injection (including on a different node, once we support
migration).
This concept is based on the rule that a pending vmlaunch/vmresume is
not canceled. Otherwise, we would risk to lose injected events or leak
them into the wrong queues. Encode this rule via a WARN_ON_ONCE at the
entry of nested_vmx_vmexit.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sun, 14 Apr 2013 10:12:45 +0000 (12:12 +0200)]
KVM: nVMX: Fix injection of PENDING_INTERRUPT and NMI_WINDOW exits to L1
Check if the interrupt or NMI window exit is for L1 by testing if it has
the corresponding controls enabled. This is required when we allow
direct injection from L0 to L2
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Thu, 11 Apr 2013 09:32:14 +0000 (12:32 +0300)]
KVM: emulator: mark 0xff 0x7d opcode as undefined.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Thu, 11 Apr 2013 09:30:01 +0000 (12:30 +0300)]
KVM: emulator: Do not fail on emulation of undefined opcode
Emulation of undefined opcode should inject #UD instead of causing
emulation failure. Do that by moving Undefined flag check to emulation
stage and injection #UD there.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Thu, 11 Apr 2013 09:10:51 +0000 (12:10 +0300)]
KVM: VMX: do not try to reexecute failed instruction while emulating invalid guest state
During invalid guest state emulation vcpu cannot enter guest mode to try
to reexecute instruction that emulator failed to emulate, so emulation
will happen again and again. Prevent that by telling the emulator that
instruction reexecution should not be attempted.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Gleb Natapov [Thu, 11 Apr 2013 08:59:55 +0000 (11:59 +0300)]
KVM: emulator: fix unimplemented instruction detection
Unimplemented instruction detection is broken for group instructions
since it relies on "flags" field of opcode to be zero, but all
instructions in a group inherit flags from a group encoding. Fix that by
having a separate flag for unimplemented instructions.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Kevin Wolf [Thu, 11 Apr 2013 12:06:03 +0000 (14:06 +0200)]
KVM: x86 emulator: Fix segment loading in VM86
This fixes a regression introduced in commit
03ebebeb1 ("KVM: x86
emulator: Leave segment limit and attributs alone in real mode").
The mentioned commit changed the segment descriptors for both real mode
and VM86 to only update the segment base instead of creating a
completely new descriptor with limit 0xffff so that unreal mode keeps
working across a segment register reload.
This leads to an invalid segment descriptor in the eyes of VMX, which
seems to be okay for real mode because KVM will fix it up before the
next VM entry or emulate the state, but it doesn't do this if the guest
is in VM86, so we end up with:
KVM: entry failed, hardware error 0x80000021
Fix this by effectively reverting commit
03ebebeb1 for VM86 and leaving
it only in place for real mode, which is where it's really needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Geoff Levand [Fri, 5 Apr 2013 19:20:30 +0000 (19:20 +0000)]
KVM: Move kvm_rebooting declaration out of x86
The variable kvm_rebooting is a common kvm variable, so move its
declaration from arch/x86/include/asm/kvm_host.h to
include/asm/kvm_host.h.
Fixes this sparse warning when building on arm64:
virt/kvm/kvm_main.c:warning: symbol 'kvm_rebooting' was not declared. Should it be static?
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Geoff Levand [Fri, 5 Apr 2013 19:20:30 +0000 (19:20 +0000)]
KVM: Move kvm_spurious_fault to x86.c
The routine kvm_spurious_fault() is an x86 specific routine, so
move it from virt/kvm/kvm_main.c to arch/x86/kvm/x86.c.
Fixes this sparse warning when building on arm64:
virt/kvm/kvm_main.c:warning: symbol 'kvm_spurious_fault' was not declared. Should it be static?
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Geoff Levand [Fri, 5 Apr 2013 19:20:30 +0000 (19:20 +0000)]
KVM: Make local routines static
The routines get_user_page_nowait(), kvm_io_bus_sort_cmp(), kvm_io_bus_insert_dev()
and kvm_io_bus_get_first_dev() are only referenced within kvm_main.c, so give them
static linkage.
Fixes sparse warnings like these:
virt/kvm/kvm_main.c: warning: symbol 'get_user_page_nowait' was not declared. Should it be static?
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Geoff Levand [Fri, 5 Apr 2013 19:20:30 +0000 (19:20 +0000)]
KVM: Move vm_list kvm_lock declarations out of x86
The variables vm_list and kvm_lock are common to all architectures, so
move the declarations from arch/x86/include/asm/kvm_host.h to
include/linux/kvm_host.h.
Fixes sparse warnings like these when building for arm64:
virt/kvm/kvm_main.c: warning: symbol 'kvm_lock' was not declared. Should it be static?
virt/kvm/kvm_main.c: warning: symbol 'vm_list' was not declared. Should it be static?
Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Mon, 8 Apr 2013 09:07:46 +0000 (11:07 +0200)]
KVM: VMX: Add missing braces to avoid redundant error check
The code was already properly aligned, now also add the braces to avoid
that err is checked even if alloc_apic_access_page didn't run and change
it. Found via Coccinelle by Fengguang Wu.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Yang Zhang [Mon, 8 Apr 2013 07:26:33 +0000 (15:26 +0800)]
KVM: x86: fix memory leak in vmx_init
Free vmx_msr_bitmap_longmode_x2apic and vmx_msr_bitmap_longmode if
kvm_init() fails.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Michael S. Tsirkin [Thu, 4 Apr 2013 10:27:21 +0000 (13:27 +0300)]
kvm: fix MMIO/PIO collision misdetection
PIO and MMIO are separate address spaces, but
ioeventfd registration code mistakenly detected
two eventfds as duplicate if they use the same address,
even if one is PIO and another one MMIO.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Jan Kiszka [Sat, 6 Apr 2013 11:51:21 +0000 (13:51 +0200)]
KVM: nVMX: Check exit control for VM_EXIT_SAVE_IA32_PAT, not entry controls
Obviously a copy&paste mistake: prepare_vmcs12 has to check L1's exit
controls for VM_EXIT_SAVE_IA32_PAT.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Yang Zhang [Sun, 7 Apr 2013 00:25:18 +0000 (08:25 +0800)]
KVM: Call kvm_apic_match_dest() to check destination vcpu
For a given vcpu, kvm_apic_match_dest() will tell you whether
the vcpu in the destination list quickly. Drop kvm_calculate_eoi_exitmap()
and use kvm_apic_match_dest() instead.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Cornelia Huck [Thu, 4 Apr 2013 08:25:06 +0000 (10:25 +0200)]
KVM: s390: virtio_ccw: reset errors for new I/O.
ccw_io_helper neglected to reset vcdev->err after a new channel
program had been successfully started, resulting in stale errors
delivered after one I/O failed. Reset the error after a new
channel program has been successfully started with no old I/O
pending.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Takuya Yoshikawa [Fri, 29 Mar 2013 05:05:26 +0000 (14:05 +0900)]
Revert "KVM: MMU: Move kvm_mmu_free_some_pages() into kvm_mmu_alloc_page()"
With the following commit, shadow pages can be zapped at random during
a shadow page talbe walk:
KVM: MMU: Move kvm_mmu_free_some_pages() into kvm_mmu_alloc_page()
7ddca7e43c8f28f9419da81a0e7730b66aa60fe9
This patch reverts it and fixes __direct_map() and FNAME(fetch)().
Signed-off-by: Takuya Yoshikawa <yoshikawa_takuya_b1@lab.ntt.co.jp>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Paolo Bonzini [Thu, 28 Mar 2013 16:18:35 +0000 (17:18 +0100)]
pmu: prepare for migration support
In order to migrate the PMU state correctly, we need to restore the
values of MSR_CORE_PERF_GLOBAL_STATUS (a read-only register) and
MSR_CORE_PERF_GLOBAL_OVF_CTRL (which has side effects when written).
We also need to write the full 40-bit value of the performance counter,
which would only be possible with a v3 architectural PMU's full-width
counter MSRs.
To distinguish host-initiated writes from the guest's, pass the
full struct msr_data to kvm_pmu_set_msr.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Nick Wang [Mon, 25 Mar 2013 16:22:58 +0000 (17:22 +0100)]
KVM: s390: Enable KVM_CAP_NR_MEMSLOTS on s390
Return KVM_USER_MEM_SLOTS in kvm_dev_ioctl_check_extension().
Signed-off-by: Nick Wang <jfwang@us.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Nick Wang [Mon, 25 Mar 2013 16:22:57 +0000 (17:22 +0100)]
KVM: s390: Remove the sanity checks for kvm memory slot
To model the standby memory with memory_region_add_subregion
and friends, the guest would have one or more regions of ram.
Remove the check allowing only one memory slot and the check
requiring the real address of memory slot starts at zero.
Signed-off-by: Nick Wang <jfwang@us.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Nick Wang [Mon, 25 Mar 2013 16:22:56 +0000 (17:22 +0100)]
KVM: s390: Change the virtual memory mapping location for virtio devices
The current location for mapping virtio devices does not take
into consideration the standby memory. This causes the failure
of mapping standby memory since the location for the mapping is
already taken by the virtio devices. To fix the problem, we move
the location to beyond the end of standby memory.
Signed-off-by: Nick Wang <jfwang@us.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:55 +0000 (17:22 +0100)]
KVM: s390: fix compile with !CONFIG_COMPAT
arch/s390/kvm/priv.c should include both
linux/compat.h and asm/compat.h.
Fixes this one:
In file included from arch/s390/kvm/priv.c:23:0:
arch/s390/include/asm/compat.h: In function ‘arch_compat_alloc_user_space’:
arch/s390/include/asm/compat.h:258:2: error: implicit declaration of function ‘is_compat_task’
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:54 +0000 (17:22 +0100)]
KVM: s390: fix stsi exception handling
In case of an exception the guest psw condition code should be left alone.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-By: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:53 +0000 (17:22 +0100)]
KVM: s390: fix and enforce return code handling for irq injections
kvm_s390_inject_program_int() and friends may fail if no memory is available.
This must be reported to the calling functions, so that this gets passed
down to user space which should fix the situation.
Alternatively we end up with guest state corruption.
So fix this and enforce return value checking by adding a __must_check
annotation to all of these function prototypes.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:52 +0000 (17:22 +0100)]
KVM: s390: make if statements in lpsw/lpswe handlers readable
Being unable to parse the 5- and 8-line if statements I had to split them
to be able to make any sense of them and verify that they match the
architecture.
So change the code since I guess that other people will also have a hard
time parsing such long conditional statements with line breaks.
Introduce a common is_valid_psw() function which does all the checks needed.
In case of lpsw (64 bit psw -> 128 bit psw conversion) it will do some not
needed additional checks, since a couple of bits can't be set anyway, but
that doesn't hurt.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:51 +0000 (17:22 +0100)]
KVM: s390: fix return code handling in lpsw/lpswe handlers
kvm_s390_inject_program_int() may return with a non-zero return value, in
case of an error (out of memory). Report that to the calling functions
instead of ignoring the error case.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:50 +0000 (17:22 +0100)]
KVM: s390: fix psw conversion in lpsw handler
When converting a 64 bit psw to a 128 bit psw the addressing mode bit of
the "addr" part of the 64 bit psw must be moved to the basic addressing
mode bit of the "mask" part of the 128 bit psw.
In addition the addressing mode bit must be cleared when moved to the "addr"
part of the 128 bit psw.
Otherwise an invalid psw would be generated if the orginal psw was in the
31 bit addressing mode.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Heiko Carstens [Mon, 25 Mar 2013 16:22:49 +0000 (17:22 +0100)]
KVM: s390: fix 24 bit psw handling in lpsw/lpswe handler
When checking for validity the lpsw/lpswe handler check that only
the lower 20 bits instead of 24 bits have a non-zero value.
There handling valid psws as invalid ones.
Fix the 24 bit psw mask.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Christian Borntraeger [Mon, 25 Mar 2013 16:22:48 +0000 (17:22 +0100)]
KVM: s390: Dont do a gmap update on minor memslot changes
Some memslot updates dont affect the gmap implementation,
e.g. setting/unsetting dirty tracking. Since a gmap update
will cause tlb flushes and segment table invalidations we
want to avoid that.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Alexander Graf [Wed, 27 Mar 2013 12:41:35 +0000 (13:41 +0100)]
Merge commit 'origin/next' into kvm-ppc-next
Gleb Natapov [Sun, 24 Mar 2013 09:43:09 +0000 (11:43 +0200)]
Merge 'git://github.com/agraf/linux-2.6.git kvm-ppc-next' into queue