Age | Commit message | Author | Files | Lines
2010-10-24 | KVM: MMU: Check for root_level instead of long mode | Joerg Roedel | 1 | -2/+2
The walk_addr function checks for !is_long_mode in its 64 bit version. But what is meant here is a check for pae paging. Change the condition to really check for pae paging so that it also works with nested nested paging. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: x86: Emulate MSR_EBC_FREQUENCY_ID | Jes Sorensen | 1 | -0/+14
Some operating systems store data about the host processor at the time of installation and, when booted on a more up-to-date CPU, try to read MSR_EBC_FREQUENCY_ID. This has been found with XP. Signed-off-by: Jes Sorensen <[email protected]> Reviewed-by: Juan Quintela <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2010-10-24 | x86: Define MSR_EBC_FREQUENCY_ID | Jes Sorensen | 1 | -0/+1
Signed-off-by: Jes Sorensen <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2010-10-24 | KVM: SVM: Clean up rip handling in vmrun emulation | Roedel, Joerg | 1 | -4/+4
This patch changes the rip handling in the vmrun emulation path from using next_rip to the generic kvm register access functions. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2010-10-24 | KVM: SVM: Restore correct registers after sel_cr0 intercept emulation | Joerg Roedel | 1 | -2/+31
This patch implements restoring of the correct rip, rsp, and rax after the svm emulation in KVM injected a selective_cr0 write intercept into the guest hypervisor. The problem was that the vmexit is emulated in the instruction emulation which later commits the registers right after the write-cr0 instruction. So the l1 guest will continue to run with the l2 rip, rsp and rax resulting in unpredictable behavior. This patch is not the final word, it is just an easy patch to fix the issue. The real fix will be done when the instruction emulator is made aware of nested virtualization. Until this is done this patch fixes the issue and provides an easy way to fix this in -stable too. Cc: [email protected] Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2010-10-24 | KVM: MMU: Fix 32 bit legacy paging with NPT | Joerg Roedel | 1 | -2/+6
This patch fixes 32 bit legacy paging with NPT enabled. The mmu_check_root call on the top-level of the loop causes root_gfn to take values (in the tdp_enabled path) which are outside of guest memory. So the mmu_check_root call fails at some point in the loop iteration, causing the guest to triple-fault. This patch moves the mmu_check_root calls to the places where they are really necessary. As a side-effect it introduces a check for the root of a pae page table too. Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2010-10-24 | KVM: PPC: Move of include to __KERNEL__ section | Alexander Graf | 1 | -1/+2
We have to protect the include for linux/of.h by __KERNEL__ so it doesn't accidentally get referenced outside. This patch fixes this and makes the tree compile again. Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Add documentation for magic page enhancements | Alexander Graf | 1 | -0/+14
This documents how to detect additional features inside the magic page when a guest maps it. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Fix compile error in e500_tlb.c | Alexander Graf | 1 | -1/+2
The e500_tlb.c file didn't compile for me due to the following error: arch/powerpc/kvm/e500_tlb.c: In function ‘kvmppc_e500_shadow_map’: arch/powerpc/kvm/e500_tlb.c:300: error: format ‘%lx’ expects type ‘long unsigned int’, but argument 2 has type ‘gfn_t’ So let's explicitly cast the argument to make printk happy. Signed-off-by: Alexander Graf <[email protected]>
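The cast described above can be sketched in user space. This is only an illustration, not the kernel change itself: it assumes gfn_t is a 64-bit unsigned typedef, uses snprintf in place of printk, and format_gfn is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Assumption: a 64-bit unsigned typedef standing in for the kernel's gfn_t. */
typedef uint64_t gfn_t;

/*
 * Hypothetical helper: formats a gfn with "%lx". The explicit cast makes
 * the argument type match the format specifier, which is what silences
 * the compiler's format check.
 */
static int format_gfn(char *buf, size_t len, gfn_t gfn)
{
    return snprintf(buf, len, "gfn %lx", (unsigned long)gfn);
}
```

Note that on a target where unsigned long is 32 bits wide the cast truncates the value, which is acceptable only when the frame numbers are known to fit.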
2010-10-24 | KVM: PPC: e500_tlb: Fix a minor copy-paste tracing bug | Kyle Moffett | 1 | -2/+1
The kvmppc_e500_stlbe_invalidate() function was trying to pass too many parameters to trace_kvm_stlb_inval(). This appears to be a bad copy-paste from a call to trace_kvm_stlb_write(). Signed-off-by: Kyle Moffett <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Document KVM_INTERRUPT ioctl | Alexander Graf | 1 | -2/+31
This adds some documentation for the KVM_INTERRUPT special cases that PowerPC now implements. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Implement level interrupts for BookE | Alexander Graf | 2 | -3/+18
BookE also wants to support level based interrupts, so let's implement all the necessary logic there. We need to trick a bit here because the irqprios are 1:1 assigned to architecture defined values. But since there is some space left there, we can just pick a random one and move it later on - it's internal anyways. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Expose level based interrupt cap | Alexander Graf | 2 | -0/+2
Now that we have all the level interrupt magic in place, let's expose the capability to user space, so it can make use of it! Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Implement Level interrupts on Book3S | Alexander Graf | 3 | -4/+31
The current interrupt logic is just completely broken. We get a notification from user space, telling us that an interrupt is there. But then user space expects us to simply acknowledge an interrupt once we deliver it to the guest. This is not how real hardware works though. On real hardware, the interrupt controller pulls the external interrupt line until it gets notified that the interrupt was received. So in reality we have two events: pulling and letting go of the interrupt line. To maintain backwards compatibility, I added a new request for the pulling part. The letting go part was implemented earlier already. With this in place, we can now finally start guests that do not randomly stall or stop working. This patch implements the above logic for Book3S. Signed-off-by: Alexander Graf <[email protected]>
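The difference between the two models can be sketched with a toy state machine. Everything here (struct toy_irq and the toy_irq_* functions) is hypothetical and is not the KVM code; it only illustrates that a legacy request is acknowledged by delivery, while a level request stays pending until the line is explicitly released.

```c
#include <stdbool.h>

/* Hypothetical model of one external interrupt line. */
struct toy_irq {
    bool pending; /* an interrupt is waiting to be taken by the guest */
    bool level;   /* true if it was raised with the level-style request */
};

/* Legacy request: delivery itself acknowledges the interrupt. */
static void toy_irq_set(struct toy_irq *irq)
{
    irq->pending = true;
    irq->level = false;
}

/* New request: the line is pulled and stays pulled. */
static void toy_irq_set_level(struct toy_irq *irq)
{
    irq->pending = true;
    irq->level = true;
}

/* Letting go of the line. */
static void toy_irq_unset(struct toy_irq *irq)
{
    irq->pending = false;
    irq->level = false;
}

/* Called when the interrupt is delivered to the guest. */
static void toy_irq_deliver(struct toy_irq *irq)
{
    if (!irq->level)
        irq->pending = false; /* only the legacy mode auto-acknowledges */
}
```

Keeping the legacy request unchanged and adding a separate level request is what preserves backwards compatibility in this sketch: existing callers see no behavioral change.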
2010-10-24 | KVM: PPC: Enable napping only for Book3s_64 | Alexander Graf | 1 | -0/+2
Previously I incorrectly enabled napping for BookE as well, which would result in needless dcache flushes. Since we only need to force enable napping on Book3s_64 because it doesn't go into MSR_POW otherwise, we can just #ifdef that code to this particular platform. Reported-by: Scott Wood <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: allow ppc440gp to pass the compatibility check | Hollis Blanchard | 1 | -1/+2
Match only the first part of cur_cpu_spec->platform. 440GP (the first 440 processor) is identified by the string "ppc440gp", while all later 440 processors use simply "ppc440". Signed-off-by: Hollis Blanchard <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
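A prefix match of this kind can be sketched as follows. The function name is_ppc440_family is hypothetical and the platform strings are passed in directly rather than read from cur_cpu_spec; only the two strings quoted in the commit message come from the source.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical stand-in for the compatibility check: compare only the
 * first strlen("ppc440") characters, so both "ppc440" and "ppc440gp"
 * are accepted.
 */
static bool is_ppc440_family(const char *platform)
{
    return strncmp(platform, "ppc440", strlen("ppc440")) == 0;
}
```

A full strcmp would reject "ppc440gp", which is the failure the commit describes.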
2010-10-24 | KVM: PPC: fix compilation of "dump tlbs" debug function | Hollis Blanchard | 1 | -0/+1
Missing local variable. Signed-off-by: Hollis Blanchard <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: initialize IVORs in addition to IVPR | Hollis Blanchard | 1 | -2/+6
Developers can now tell at a glance the exact type of the premature interrupt, instead of just knowing that there was some premature interrupt. Signed-off-by: Hollis Blanchard <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Don't put MSR_POW in MSR | Alexander Graf | 1 | -1/+5
On Book3S a mtmsr with the MSR_POW bit set indicates that the OS is in idle and only needs to be woken up on the next interrupt. Now, unfortunately we let that bit slip into the stored MSR value which is not what the real CPU does, so that we ended up executing code like this: r = mfmsr(); /* r contains MSR_POW */ mtmsr(r | MSR_EE); This obviously breaks, as we're going into idle mode in code sections that don't expect to be idling. This patch masks MSR_POW out of the stored MSR value on wakeup, making guests happy again. Signed-off-by: Alexander Graf <[email protected]>
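The effect of the masking can be sketched with a toy model. The bit positions, struct toy_vcpu, and toy_mtmsr below are assumptions made for illustration and are not the KVM implementation; the idle period is modeled as a simple counter.

```c
/* Assumed bit values, for illustration only. */
#define TOY_MSR_EE  (1UL << 15)
#define TOY_MSR_POW (1UL << 18)

/* Hypothetical vcpu state: the stored MSR and a count of idle entries. */
struct toy_vcpu {
    unsigned long msr;
    int idle_entries;
};

/*
 * Hypothetical mtmsr emulation: act on the idle request, but clear
 * MSR_POW from the stored value once the vcpu has "woken up", so a later
 * mfmsr()/mtmsr() pair does not re-enter idle by accident.
 */
static void toy_mtmsr(struct toy_vcpu *v, unsigned long msr)
{
    v->msr = msr;
    if (msr & TOY_MSR_POW) {
        v->idle_entries++;        /* stand-in for sleeping until an interrupt */
        v->msr &= ~TOY_MSR_POW;   /* mask the bit out on wakeup */
    }
}
```

With the mask in place, reading the stored MSR back and writing it again with MSR_EE set no longer carries the idle request along.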
2010-10-24 | KVM: PPC: Implement correct SID mapping on Book3s_32 | Alexander Graf | 3 | -32/+48
Up until now we were doing segment mappings wrong on Book3s_32. For Book3s_64 we were using a trick where we know that a single mmu_context gives us 16 bits of context ids. The mm system on Book3s_32 instead uses a clever algorithm to distribute VSIDs across the available range, so a context id really only gives us 16 available VSIDs. To keep at least a few guest processes in the SID shadow, let's map a number of contexts that we can use as VSID pool. This makes the code be actually correct and shouldn't hurt performance too much. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Force enable nap on KVM | Alexander Graf | 1 | -0/+3
There are some heuristics in the PPC power management code that try to find out if the particular hardware we're running on supports proper power management or just hangs the machine when going into nap mode. Since we know that KVM is safe with nap, let's force enable it in the PV code once we're certain that we are on a KVM VM. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Make PV mtmsrd L=1 work with r30 and r31 | Alexander Graf | 2 | -5/+24
We had an arbitrary limitation in mtmsrd L=1 that kept us from using r30 and r31 as input registers. Let's get rid of that and get more potential speedups! Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Update int_pending also on dequeue | Alexander Graf | 1 | -0/+3
When having a decrementor interrupt pending, the dequeuing happens manually through an mtdec instruction. This instruction simply calls dequeue on that interrupt, so the int_pending hint doesn't get updated. This patch enables updating the int_pending hint also on dequeue, thus correctly enabling guests to stay in guest contexts more often. Signed-off-by: Alexander Graf <[email protected]>
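The idea can be sketched with a toy pending-interrupt bitmap. The struct and function names are hypothetical and the priority numbers are arbitrary; the point is only that the hint is recomputed on dequeue as well as on queue.

```c
#include <stdbool.h>

/* Hypothetical per-vcpu interrupt state. */
struct toy_irq_state {
    unsigned long pending; /* one bit per interrupt priority */
    bool int_pending;      /* hint visible to the guest */
};

/* Recompute the hint from the bitmap. */
static void toy_update_hint(struct toy_irq_state *s)
{
    s->int_pending = (s->pending != 0);
}

static void toy_queue(struct toy_irq_state *s, unsigned int prio)
{
    s->pending |= 1UL << prio;
    toy_update_hint(s);
}

/* Without the update here, a stale hint would survive the dequeue. */
static void toy_dequeue(struct toy_irq_state *s, unsigned int prio)
{
    s->pending &= ~(1UL << prio);
    toy_update_hint(s);
}
```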
2010-10-24 | KVM: PPC: Make PV mtmsr work with r30 and r31 | Alexander Graf | 2 | -16/+40
So far we've been restricting ourselves to r0-r29 as registers an mtmsr instruction could use. This was bad, as there are some code paths in Linux actually using r30. So let's instead handle all registers gracefully and get rid of that stupid limitation. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Add mtsrin PV code | Alexander Graf | 4 | -0/+114
This is the guest side of the mtsr acceleration. Using this a guest can now call mtsrin with almost no overhead as long as it ensures that it only uses it with (MSR_IR|MSR_DR) == 0. Linux does that, so we're good. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Put segment registers in shared page | Alexander Graf | 5 | -12/+11
Now that the actual mtsr doesn't do anything anymore, we can move the sr contents over to the shared page, so a guest can directly read and write its sr contents from guest context. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Interpret SR registers on demand | Alexander Graf | 3 | -48/+46
Right now we're examining the contents of Book3s_32's segment registers when the register is written and put the interpreted contents into a struct. There are two reasons this is bad. For starters, the struct has worse real-time performance, as it occupies more ram. But the more important part is that with segment registers being interpreted from their raw values, we can put them in the shared page, allowing guests to mess with them directly. This patch makes the internal representation of SRs be u32s. Signed-off-by: Alexander Graf <[email protected]>
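Interpreting a raw u32 on demand can be sketched with small accessor functions. The toy_sr_* names are hypothetical, and the bit positions are an assumption modeled on the 32-bit PowerPC segment register layout for ordinary (T=0) segments; they are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: T, Ks, Kp, N in the top four bits, VSID in the low 24. */
#define TOY_SR_T    0x80000000u
#define TOY_SR_KS   0x40000000u
#define TOY_SR_KP   0x20000000u
#define TOY_SR_N    0x10000000u
#define TOY_SR_VSID 0x00ffffffu

/* Each accessor decodes one field from the raw value when it is needed. */
static inline bool toy_sr_is_memory_segment(uint32_t sr) { return !(sr & TOY_SR_T); }
static inline bool toy_sr_ks(uint32_t sr)   { return (sr & TOY_SR_KS) != 0; }
static inline bool toy_sr_kp(uint32_t sr)   { return (sr & TOY_SR_KP) != 0; }
static inline bool toy_sr_nx(uint32_t sr)   { return (sr & TOY_SR_N) != 0; }
static inline uint32_t toy_sr_vsid(uint32_t sr) { return sr & TOY_SR_VSID; }
```

Because only the raw 32-bit value is stored, the same word can live in memory that the guest reads and writes directly, which is the property the commit is after.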
2010-10-24 | KVM: PPC: Move BAT handling code into spr handler | Alexander Graf | 1 | -32/+16
The current approach duplicates the spr->bat finding logic and makes it harder to reuse the actually used variables. So let's move everything down to the spr handler. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Add feature bitmap for magic page | Alexander Graf | 3 | -7/+21
We will soon add SR PV support to the shared page, so we need some infrastructure that allows the guest to query for features KVM exports. This patch adds a second return value to the magic mapping that indicates to the guest which features are available. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Remove unused define | Alexander Graf | 1 | -1/+0
The define VSID_ALL is unused. Let's remove it. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Revert "KVM: PPC: Use kernel hash function" | Alexander Graf | 2 | -4/+17
It turns out the in-kernel hash function is sub-optimal for our subtle hash inputs where every bit is significant. So let's revert to the original hash functions. This reverts commit 05340ab4f9a6626f7a2e8f9fe5397c61d494f445. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Move slb debugging to tracepoints | Alexander Graf | 2 | -17/+78
This patch moves debugging printks for shadow SLB debugging over to tracepoints. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Make invalidation code more reliable | Alexander Graf | 1 | -6/+8
There is a race condition in the pte invalidation code path where we can't be sure if a pte was invalidated already. So let's move the spin lock around to get rid of the race. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Don't flush PTEs on NX/RO hit | Alexander Graf | 1 | -2/+0
When hitting a no-execute or read-only data/inst storage interrupt we were flushing the respective PTE so we're sure it gets properly overwritten next. According to the spec, this is unnecessary though. The guest issues a tlbie anyways, so we're safe to just keep the PTE around and have it manually removed from the guest, saving us a flush. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Preload magic page when in kernel mode | Alexander Graf | 1 | -0/+10
When the guest jumps into kernel mode and has the magic page mapped, there's a very high chance that it will also use it. So let's detect that scenario and map the segment accordingly. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Add tracepoints for generic spte flushes | Alexander Graf | 2 | -15/+26
The different ways of flushing shadow ptes have their own debug prints which use stupid old printk. Let's move them to tracepoints, making them more easily available, faster, and possible to activate on demand. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Fix sid map search after flush | Alexander Graf | 1 | -2/+2
After a flush the sid map contained lots of entries with 0 for their gvsid and hvsid value. Unfortunately, 0 can be a real value the guest searches for when looking up a vsid, so it would incorrectly find the host's 0 hvsid mapping, which doesn't belong to our sid space. So let's also check for the valid bit that indicates that the sid we're looking at actually contains useful data. Signed-off-by: Alexander Graf <[email protected]>
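The lookup fix can be sketched as follows. The struct layout, the names, and the linear scan are all simplifying assumptions for illustration (the real map is not necessarily searched this way); the relevant part is the extra valid check next to the vsid comparison.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sid map entry; a flushed entry is all zeroes. */
struct toy_sid_map {
    uint64_t guest_vsid;
    uint64_t host_vsid;
    bool valid;
};

/*
 * Return the entry for gvsid, or NULL. Checking 'valid' first keeps a
 * zeroed (flushed) entry from matching a guest lookup for vsid 0.
 */
static struct toy_sid_map *toy_find_sid(struct toy_sid_map *map, size_t n,
                                        uint64_t gvsid)
{
    for (size_t i = 0; i < n; i++) {
        if (map[i].valid && map[i].guest_vsid == gvsid)
            return &map[i];
    }
    return NULL;
}
```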
2010-10-24 | KVM: PPC: Move pte invalidate debug code to tracepoint | Alexander Graf | 2 | -2/+30
This patch moves the SPTE flush debug printk over to tracepoints. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Add tracepoint for generic mmu map | Alexander Graf | 2 | -0/+32
This patch moves the generic mmu map debugging over to tracepoints. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Move book3s_64 mmu map debug print to trace point | Alexander Graf | 2 | -11/+36
This patch moves Book3s MMU debugging over to tracepoints. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: PPC: Move EXIT_DEBUG partially to tracepoints | Alexander Graf | 2 | -22/+55
We have a debug printk on every exit that is usually #ifdef'ed out. Using tracepoints makes a lot more sense here though, as they can be dynamically enabled. This patch converts the most commonly used debug printks of EXIT_DEBUG to tracepoints. Signed-off-by: Alexander Graf <[email protected]>
2010-10-24 | KVM: ia64: define kvm_lapic_enabled() to fix a compile error | Takuya Yoshikawa | 1 | -0/+1
Commit 57ce1659316f4ca298919649f9b1b55862ac3826 ("KVM: x86: In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's") ignored the fact that kvm_irq_delivery_to_apic() was also used by ia64. We define kvm_lapic_enabled() to fix a compile error caused by this. This will have the same effect as reverting the problematic patch for ia64. Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: MMU: lower the aduit frequency | Xiao Guangrong | 1 | -0/+7
The audit has very high overhead, so we need to lower its frequency to ensure the guest keeps running. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: MMU: improve spte audit | Xiao Guangrong | 1 | -79/+69
Both audit_mappings() and audit_sptes_have_rmaps() need to walk the vcpu's page table, so we can do both checks in a single spte walk. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: MMU: improve active sp audit | Xiao Guangrong | 1 | -36/+38
Both audit_rmap() and audit_write_protection() need to walk all active sps, so we can do both checks in a single sp walk. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: MMU: move audit to a separate file | Xiao Guangrong | 2 | -278/+298
Move the audit code from arch/x86/kvm/mmu.c to arch/x86/kvm/mmu_audit.c Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: MMU: support disable/enable mmu audit dynamicly | Xiao Guangrong | 4 | -20/+101
Add a r/w module parameter named 'mmu_audit' that enables or disables the audit: enable: echo 1 > /sys/module/kvm/parameters/mmu_audit disable: echo 0 > /sys/module/kvm/parameters/mmu_audit This patch does not change the logic. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: Fix guest kernel crash on MSR_K7_CLK_CTL | Jes Sorensen | 1 | -0/+22
MSR_K7_CLK_CTL is a no-longer-documented MSR that is only relevant on old AMD CPU models. This change returns the value the Linux kernel expects, so that it avoids writing back the MSR, and ignores all writes to the MSR. Signed-off-by: Jes Sorensen <[email protected]> Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: i8259: Make ICW1 conform to spec | Avi Kivity | 1 | -6/+10
ICW1 is not a full reset; instead it resets a limited number of registers in the PIC. Change ICW1 emulation to only reset those registers. Signed-off-by: Avi Kivity <[email protected]>
2010-10-24 | KVM: x86 emulator: clean up control flow in x86_emulate_insn() | Avi Kivity | 1 | -57/+7
x86_emulate_insn() is full of things like "if (rc != X86EMUL_CONTINUE) goto done; break;". Consolidate all of those at the end of the switch statement. Signed-off-by: Avi Kivity <[email protected]>
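The shape of this cleanup can be sketched with a toy dispatcher. All names here (toy_emulate, the toy_op_* handlers, the TOY_* codes) are hypothetical and the opcodes are arbitrary; the sketch only shows each case ending in a plain break, with one shared error check after the switch.

```c
enum { TOY_CONTINUE = 0, TOY_FAULT = 1 };

/* Hypothetical per-opcode handlers. */
static int toy_op_ok(void)    { return TOY_CONTINUE; }
static int toy_op_fault(void) { return TOY_FAULT; }

/* Returns 1 if the writeback step was reached, 0 if emulation bailed out. */
static int toy_emulate(int opcode)
{
    int rc = TOY_CONTINUE;
    int wrote_back = 0;

    switch (opcode) {
    case 0:
        rc = toy_op_ok();
        break;
    case 1:
        rc = toy_op_fault();
        break;
    default:
        break;
    }

    /* One check here replaces a per-case "if (rc != ...) goto done;". */
    if (rc != TOY_CONTINUE)
        goto done;

    wrote_back = 1; /* stand-in for committing results */
done:
    return wrote_back;
}
```

The trade-off is that every case must now leave its status in rc rather than jumping out on its own, which is what allows the per-case checks to be dropped.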