summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2012-10-05KVM: PPC: Book3S HV: Take the SRCU read lock before looking up memslotsPaul Mackerras
The generic KVM code uses SRCU (sleeping RCU) to protect accesses to the memslots data structures against updates due to userspace adding, modifying or removing memory slots. We need to do that too, both to avoid accessing stale copies of the memslots and to avoid lockdep warnings. This therefore adds srcu_read_lock/unlock pairs around code that accesses and uses memslots. Since the real-mode handlers for H_ENTER, H_REMOVE and H_BULK_REMOVE need to access the memslots, and we don't want to call the SRCU code in real mode (since we have no assurance that it would only access the linear mapping), we hold the SRCU read lock for the VM while in the guest. This does mean that adding or removing memory slots while some vcpus are executing in the guest will block for up to two jiffies. This tradeoff is acceptable since adding/removing memory slots only happens rarely, while H_ENTER/H_REMOVE/H_BULK_REMOVE are performance-critical hot paths. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: bookehv: Allow duplicate calls of DO_KVM macroMihai Caraman
The current form of DO_KVM macro restricts its use to one call per input parameter set. This is caused by kvmppc_resume_\intno\()_\srr1 symbol definition. Duplicate calls of DO_KVM are required by distinct implementations of exeption handlers which are delegated at runtime. Use a rare label number to avoid conflicts with the calling contexts. Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Support FPU on non-hv systemsAlexander Graf
When running on HV aware hosts, we can not trap when the guest sets the FP bit, so we just let it do so when it wants to, because it has full access to MSR. For non-HV aware hosts with an FPU (like 440), we need to also adjust the shadow MSR though. Otherwise the guest gets an FP unavailable trap even when it really enabled the FP bit in MSR. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: 440: Implement mfdcrxAlexander Graf
We need mfdcrx to execute properly on 460 cores. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: 440: Implement mtdcrxAlexander Graf
We need mtdcrx to execute properly on 460 cores. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05Document IACx/DACx registers access using ONE_REG APIBharat Bhushan
Patch to access the debug registers (IACx/DACx) using ONE_REG api was sent earlier. But that missed the respective documentation. Also corrected the index number referencing in section 4.69 Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: E500: Remove E500_TLB_DIRTY flagAlexander Graf
Since we always mark pages as dirty immediately when mapping them read/write now, there's no need for the dirty flag in our cache. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Use symbols for exit traceAlexander Graf
Exit traces are a lot easier to read when you don't have to remember cryptic numbers for guest exit reasons. Symbolify them in our trace output. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Add MCSR SPR supportAlexander Graf
Add support for the MCSR SPR. This only implements the SPR storage bits, not actual machine checks. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: 44x: Initialize PVRAlexander Graf
We need to make sure that vcpu->arch.pvr is initialized to a sane value, so let's just take the host PVR. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05booke: Added ONE_REG interface for IAC/DAC debug registersBharat Bhushan
IAC/DAC are defined as 32 bit while they are 64 bit wide. So ONE_REG interface is added to set/get them. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: booke: Add watchdog emulationBharat Bhushan
This patch adds the watchdog emulation in KVM. The watchdog emulation is enabled by KVM_ENABLE_CAP(KVM_CAP_PPC_BOOKE_WATCHDOG) ioctl. The kernel timer are used for watchdog emulation and emulates h/w watchdog state machine. On watchdog timer expiry, it exit to QEMU if TCR.WRC is non ZERO. QEMU can reset/shutdown etc depending upon how it is configured. Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Scott Wood <scottwood@freescale.com> [bharat.bhushan@freescale.com: reworked patch] Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> [agraf: adjust to new request framework] Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Add return value to core_check_requestsAlexander Graf
Requests may want to tell us that we need to go back into host state, so add a return value for the checks. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Add return value in prepare_to_enterAlexander Graf
Our prepare_to_enter helper wants to be able to return in more circumstances to the host than only when an interrupt is pending. Broaden the interface a bit and move even more generic code to the generic helper. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Ignore EXITING_GUEST_MODE modeAlexander Graf
We don't need to do anything when mode is EXITING_GUEST_MODE, because we essentially are outside of guest mode and did everything it asked us to do by the time we check it. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Move kvm_guest_enter call into generic codeAlexander Graf
We need to call kvm_guest_enter in booke and book3s, so move its call to generic code. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Book3S: PR: Rework irq disablingAlexander Graf
Today, we disable preemption while inside guest context, because we need to expose to the world that we are not in a preemptible context. However, during that time we already have interrupts disabled, which would indicate that we are in a non-preemptible context. The reason the checks for irqs_disabled() fail for us though is that we manually control hard IRQs and ignore all the lazy EE framework. Let's stop doing that. Instead, let's always use lazy EE to indicate when we want to disable IRQs, but do a special final switch that gets us into EE disabled, but soft enabled state. That way when we get back out of guest state, we are immediately ready to process interrupts. This simplifies the code drastically and reduces the time that we appear as preempt disabled. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Consistentify vcpu exit pathAlexander Graf
When getting out of __vcpu_run, let's be consistent about the state we return in. We want to always * have IRQs enabled * have called kvm_guest_exit before Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Book3S: PR: Indicate we're out of guest modeAlexander Graf
When going out of guest mode, indicate that we are in vcpu->mode. That way requests from other CPUs don't needlessly need to kick us to process them, because it'll just happen next time we enter the guest. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Exit guest context while handling exitAlexander Graf
The x86 implementation of KVM accounts for host time while processing guest exits. Do the same for us. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Book3S: PR: Only do resched check once per exitAlexander Graf
Now that we use our generic exit helper, we can safely drop our previous kvm_resched that we used to trigger at the beginning of the exit handler function. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Drop redundant vcpu->mode setAlexander Graf
We only need to set vcpu->mode to outside once. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Book3s: PR: Add (dumb) MMU Notifier supportAlexander Graf
Now that we have very simple MMU Notifier support for e500 in place, also add the same simple support to book3s. It gets us one step closer to actual fast support. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Use same kvmppc_prepare_to_enter code for booke and book3s_prAlexander Graf
We need to do the same things when preparing to enter a guest for booke and book3s_pr cores. Fold the generic code into a generic function that both call. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: No duplicate request != 0 checkAlexander Graf
We only call kvmppc_check_requests() when vcpu->requests != 0, so drop the redundant check in the function itself Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Add some more trace pointsAlexander Graf
Without trace points, debugging what exactly is going on inside guest code can be very tricky. Add a few more trace points at places that hopefully tell us more when things go wrong. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: E500: Implement MMU notifiersAlexander Graf
The e500 target has lived without mmu notifiers ever since it got introduced, but fails for the user space check on them with hugetlbfs. So in order to get that one working, implement mmu notifiers in a reasonably dumb fashion and be happy. On embedded hardware, we almost never end up with mmu notifier calls, since most people don't overcommit. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Add support for vcpu->modeAlexander Graf
Generic KVM code might want to know whether we are inside guest context or outside. It also wants to be able to push us out of guest context. Add support to the BookE code for the generic vcpu->mode field that describes the above states. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Add check_requests helper functionAlexander Graf
We need a central place to check for pending requests in. Add one that only does the timer check we already do in a different place. Later, this central function can be extended by more checks. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05powerpc/epapr: export epapr_hypercall_startScott Wood
This fixes breakage introduced by the following commit: commit 6d2d82627f4f1e96a33664ace494fa363e0495cb Author: Liu Yu-B13201 <Yu.Liu@freescale.com> Date: Tue Jul 3 05:48:56 2012 +0000 PPC: Don't use hardcoded opcode for ePAPR hcall invocation when a driver that uses ePAPR hypercalls is built as a module. Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Quieten message about allocating linear regionsPaul Mackerras
This is printed once for every RMA or HPT region that get preallocated. If one preallocates hundreds of such regions (in order to run hundreds of KVM guests), that gets rather painful, so make it a bit quieter. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: E500: Fix clear_tlb_refsAlexander Graf
Our mapping code assumes that TLB0 entries are always mapped. However, after calling clear_tlb_refs() this is no longer the case. Map them dynamically if we find an entry unmapped in TLB0. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: BookE: Expose remote TLB flushes in debugfsAlexander Graf
We're already counting remote TLB flushes in a variable, but don't export it to user space yet. Do so, so we know what's going on. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Expose SYNC cap based on mmu notifiersAlexander Graf
Semantically, the "SYNC" cap means that we have mmu notifiers available. Express this in our #ifdef'ery around the feature, so that we can be sure we don't miss out on ppc targets when they get their implementation. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: PR: Use generic tracepoint for guest exitAlexander Graf
We want to have tracing information on guest exits for booke as well as book3s. Since most information is identical, use a common trace point. Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05PPC: Don't use hardcoded opcode for ePAPR hcall invocationLiu Yu-B13201
Signed-off-by: Liu Yu <yu.liu@freescale.com> Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05powerpc/fsl-soc: use CONFIG_EPAPR_PARAVIRT for hcallsScott Wood
Signed-off-by: Scott Wood <scottwood@freescale.com> Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05PPC: select EPAPR_PARAVIRT for all users of epapr hcallsStuart Yoder
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: ev_idle hcall support for e500 guestsLiu Yu-B13201
Signed-off-by: Liu Yu <yu.liu@freescale.com> [varun: 64-bit changes] Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com> Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: Add support for ePAPR idle hcall in host kernelLiu Yu-B13201
And add a new flag definition in kvm_ppc_pvinfo to indicate whether the host supports the EV_IDLE hcall. Signed-off-by: Liu Yu <yu.liu@freescale.com> [stuart.yoder@freescale.com: cleanup,fixes for conditions allowing idle] Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> [agraf: fix typo] Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: add pvinfo for hcall opcodes on e500mc/e5500Stuart Yoder
Signed-off-by: Liu Yu <yu.liu@freescale.com> [stuart: factored this out from idle hcall support in host patch] Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05KVM: PPC: use definitions in epapr header for hcallsStuart Yoder
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-05PPC: epapr: create define for return code value of successStuart Yoder
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>
2012-09-27KVM: s390: Fix vcpu_load handling in interrupt codeChristian Borntraeger
Recent changes (KVM: make processes waiting on vcpu mutex killable) now requires to check the return value of vcpu_load. This triggered a warning in s390 specific kvm code. Turns out that we can actually remove the put/load, since schedule will do the right thing via the preempt notifiers. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-23KVM: x86: Fix guest debug across vcpu INIT resetJan Kiszka
If we reset a vcpu on INIT, we so far overwrote dr7 as provided by KVM_SET_GUEST_DEBUG, and we also cleared switch_db_regs unconditionally. Fix this by saving the dr7 used for guest debugging and calculating the effective register value as well as switch_db_regs on any potential change. This will change to focus of the set_guest_debug vendor op to update_dp_bp_intercept. Found while trying to stop on start_secondary. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-23KVM: Add resampling irqfds for level triggered interruptsAlex Williamson
To emulate level triggered interrupts, add a resample option to KVM_IRQFD. When specified, a new resamplefd is provided that notifies the user when the irqchip has been resampled by the VM. This may, for instance, indicate an EOI. Also in this mode, posting of an interrupt through an irqfd only asserts the interrupt. On resampling, the interrupt is automatically de-asserted prior to user notification. This enables level triggered interrupts to be posted and re-enabled from vfio with no userspace intervention. All resampling irqfds can make use of a single irq source ID, so we reserve a new one for this interface. Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20KVM: optimize apic interrupt deliveryGleb Natapov
Most interrupt are delivered to only one vcpu. Use pre-build tables to find interrupt destination instead of looping through all vcpus. In case of logical mode loop only through vcpus in a logical cluster irq is sent to. Signed-off-by: Gleb Natapov <gleb@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20Merge branch 'queue' into nextAvi Kivity
* queue: KVM: MMU: Eliminate pointless temporary 'ac' KVM: MMU: Avoid access/dirty update loop if all is well KVM: MMU: Eliminate eperm temporary KVM: MMU: Optimize is_last_gpte() KVM: MMU: Simplify walk_addr_generic() loop KVM: MMU: Optimize pte permission checks KVM: MMU: Update accessed and dirty bits after guest pagetable walk KVM: MMU: Move gpte_access() out of paging_tmpl.h KVM: MMU: Optimize gpte_access() slightly KVM: MMU: Push clean gpte write protection out of gpte_access() KVM: clarify kvmclock documentation KVM: make processes waiting on vcpu mutex killable KVM: SVM: Make use of asm.h KVM: VMX: Make use of asm.h KVM: VMX: Make lto-friendly Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20KVM: MMU: Eliminate pointless temporary 'ac'Avi Kivity
'ac' essentially reconstructs the 'access' variable we already have, except for the PFERR_PRESENT_MASK and PFERR_RSVD_MASK. As these are not used by callees, just use 'access' directly. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-20KVM: MMU: Avoid access/dirty update loop if all is wellAvi Kivity
Keep track of accessed/dirty bits; if they are all set, do not enter the accessed/dirty update loop. Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>