2014-09-03  KVM: mmio: cleanup kvm_set_mmio_spte_mask  (Tiejun Chen, 3 files, -6/+6)

Just reuse rsvd_bits() inside kvm_set_mmio_spte_mask() for slightly better code.

Signed-off-by: Tiejun Chen <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
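For reference, rsvd_bits() builds a contiguous bit mask; a minimal sketch of the helper as it appears in arch/x86/kvm/mmu.h:

    /* Return a mask with bits s..e (inclusive) set. */
    static inline u64 rsvd_bits(int s, int e)
    {
            return ((1ULL << (e - s + 1)) - 1) << s;
    }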
2014-09-03  kvm: x86: fix stale mmio cache bug  (David Matlack, 3 files, -6/+17)

The following events can lead to an incorrect KVM_EXIT_MMIO bubbling up to userspace:

(1) Guest accesses gpa X without a memory slot. The gfn is cached in struct kvm_vcpu_arch (mmio_gfn). On Intel EPT-enabled hosts, KVM sets the SPTE write-execute-noread so that future accesses cause EPT_MISCONFIGs.

(2) Host userspace creates a memory slot via KVM_SET_USER_MEMORY_REGION covering the page just accessed.

(3) Guest attempts to read or write to gpa X again. On Intel, this generates an EPT_MISCONFIG. The memory slot generation number that was incremented in (2) would normally take care of this, but we fast-path mmio faults through quickly_check_mmio_pf(), which only checks the per-vcpu mmio cache. Since we hit the cache, KVM passes a KVM_EXIT_MMIO up to userspace.

This patch fixes the issue by using the memslot generation number to validate the mmio cache.

Cc: [email protected]
Signed-off-by: David Matlack <[email protected]>
[xiaoguangrong: adjust the code to make it simpler for stable-tree fix.]
Signed-off-by: Xiao Guangrong <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Tested-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
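A sketch close to the actual fix (arch/x86/kvm/x86.h): the per-vcpu cache is stamped with the memslot generation when filled, and only trusted if the stamp still matches.

    static inline void vcpu_cache_mmio_info(struct kvm_vcpu *vcpu,
                                            gva_t gva, gfn_t gfn, unsigned access)
    {
            vcpu->arch.mmio_gva = gva & PAGE_MASK;
            vcpu->arch.access = access;
            vcpu->arch.mmio_gfn = gfn;
            /* Stamp the cache with the generation it was filled under. */
            vcpu->arch.mmio_gen = kvm_memslots(vcpu->kvm)->generation;
    }

    static inline bool vcpu_match_mmio_gen(struct kvm_vcpu *vcpu)
    {
            /* A stale stamp means the memslots changed; distrust the cache. */
            return vcpu->arch.mmio_gen == kvm_memslots(vcpu->kvm)->generation;
    }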
2014-09-03  kvm: fix potentially corrupt mmio cache  (David Matlack, 3 files, -15/+42)

vcpu exits and memslot mutations can run concurrently as long as the vcpu does not acquire the slots mutex. Thus it is theoretically possible for memslots to change underneath a vcpu that is handling an exit.

If we increment the memslot generation number again after synchronize_srcu_expedited(), vcpus can safely cache memslot generation without maintaining a single rcu_dereference through an entire vm exit. And much of the x86/kvm code does not maintain a single rcu_dereference of the current memslots during each exit.

We can prevent the following case:

   vcpu (CPU 0)                             | thread (CPU 1)
--------------------------------------------+--------------------------
1  vm exit                                  |
2  srcu_read_unlock(&kvm->srcu)             |
3  decide to cache something based on       |
     old memslots                           |
4                                           | change memslots
                                            | (increments generation)
5                                           | synchronize_srcu(&kvm->srcu);
6  retrieve generation # from new memslots  |
7  tag cache with new memslot generation    |
8  srcu_read_unlock(&kvm->srcu)             |
   ...                                      |
   <action based on cache occurs even       |
    though the caching decision was based   |
    on the old memslots>                    |
   ...                                      |
   <action *continues* to occur until next  |
    memslot generation change, which may    |
    be never>                               |

By incrementing the generation after synchronizing with kvm->srcu readers, we ensure that the generation retrieved in (6) will become invalid soon after (8).

Keeping the existing increment is not strictly necessary, but we do keep it and just move it for consistency from update_memslots to install_new_memslots. It invalidates old cached MMIOs immediately, instead of having to wait for the end of synchronize_srcu_expedited, which makes the code more clearly correct in case CPU 1 is preempted right after synchronize_srcu() returns.

To avoid halving the generation space in SPTEs, always presume that the low bit of the generation is zero when reconstructing a generation number out of an SPTE. This effectively disables MMIO caching in SPTEs during the call to synchronize_srcu_expedited. Using the low bit this way is somewhat like a seqcount -- where the protected thing is a cache, and instead of retrying we can simply punt if we observe the low bit to be 1.

Cc: [email protected]
Signed-off-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Reviewed-by: David Matlack <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
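A sketch close to the resulting install_new_memslots() in virt/kvm/kvm_main.c, showing the odd/even generation dance described above:

    static struct kvm_memslots *install_new_memslots(struct kvm *kvm,
                                                     struct kvm_memslots *slots)
    {
            struct kvm_memslots *old_memslots = kvm->memslots;

            /*
             * Set the low bit in the generation, which disables SPTE
             * caching until the end of synchronize_srcu_expedited.
             */
            slots->generation = old_memslots->generation + 1;

            rcu_assign_pointer(kvm->memslots, slots);
            synchronize_srcu_expedited(&kvm->srcu);

            /*
             * Increment the generation a second time: vm exits racing
             * with the update cannot cache a generation that would
             * otherwise stay valid forever.
             */
            slots->generation++;

            return old_memslots;
    }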
2014-09-03  KVM: do not bias the generation number in kvm_current_mmio_generation  (Paolo Bonzini, 2 files, -6/+8)

The next patch will give a meaning (a la seqcount) to the low bit of the generation number. Ensure that it matches between kvm->memslots->generation and kvm_current_mmio_generation().

Cc: [email protected]
Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Xiao Guangrong <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: x86: use guest maxphyaddr to check MTRR values  (Paolo Bonzini, 1 file, -3/+2)

The check introduced in commit d7a2a246a1b5 (KVM: x86: #GP when attempts to write reserved bits of Variable Range MTRRs, 2014-08-19) will break if the guest maxphyaddr is higher than the host's (which sometimes happens depending on your hardware and how QEMU is configured). To fix this, use cpuid_maxphyaddr, as is already done for the APIC_BASE MSR.

Reported-by: Jan Kiszka <[email protected]>
Tested-by: Jan Kiszka <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
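A minimal sketch of the idea; the helper name is hypothetical, while cpuid_maxphyaddr() is the real function that returns the guest's physical-address width from its CPUID:

    /* Hypothetical helper: reject MTRR values that set bits above the
     * *guest's* MAXPHYADDR, rather than the host's. */
    static bool mtrr_value_in_range(struct kvm_vcpu *vcpu, u64 data)
    {
            u64 reserved = ~0ULL << cpuid_maxphyaddr(vcpu);

            return (data & reserved) == 0;  /* otherwise inject #GP */
    }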
2014-08-29  KVM: remove garbage arg to *hardware_{en,dis}able  (Radim Krčmář, 16 files, -27/+27)

In the beginning was on_each_cpu(), which required an unused argument to kvm_arch_ops.hardware_{en,dis}able, but this was soon forgotten. Remove unnecessary arguments that stem from this.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: static inline empty kvm_arch functions  (Radim Krčmář, 11 files, -163/+57)

Using static inline is going to save a few bytes and cycles. For example on powerpc, the difference is 700 B after stripping (5 kB before).

This patch also deals with two overlooked empty functions:
kvm_arch_flush_shadow was not removed from arch/mips/kvm/mips.c by
  2df72e9bc KVM: split kvm_arch_flush_shadow
and kvm_arch_sched_in never made it into arch/ia64/kvm/kvm-ia64.c with
  e790d9ef6 KVM: add kvm_arch_sched_in

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
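The transformation pattern, illustrated on one of the hooks (illustrative, not a specific hunk from the patch):

    /* Before: an empty out-of-line definition in every arch's .c file. */
    void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu)
    {
    }

    /* After: a single static inline in the header, which the compiler
     * can optimize away entirely at each call site. */
    static inline void kvm_arch_sched_in(struct kvm_vcpu *vcpu, int cpu) {}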
2014-08-29  KVM: forward declare structs in kvm_types.h  (Paolo Bonzini, 9 files, -34/+21)

Opaque KVM structs are useful for prototypes in asm/kvm_host.h, to avoid "'struct foo' declared inside parameter list" warnings (and consequent breakage due to conflicting types). Move them from individual files to a generic place in linux/kvm_types.h.

Signed-off-by: Paolo Bonzini <[email protected]>
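The centralized declarations look roughly like this (the exact list in include/linux/kvm_types.h may differ):

    struct kvm;
    struct kvm_async_pf;
    struct kvm_device_ops;
    struct kvm_interrupt;
    struct kvm_irq_routing_table;
    struct kvm_memory_slot;
    struct kvm_one_reg;
    struct kvm_run;
    struct kvm_userspace_memory_region;
    struct kvm_vcpu;
    struct kvm_vcpu_init;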
2014-08-29  KVM: x86: remove Aligned bit from movntps/movntpd  (Paolo Bonzini, 1 file, -3/+3)

These are not explicitly aligned, and do not require alignment on AVX.

Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: x86 emulator: emulate MOVNTDQ  (Alex Williamson, 1 file, -1/+6)

Windows 8.1 guest with NVIDIA driver and GPU fails to boot with an emulation failure. The KVM spew suggests the fault is with lack of movntdq emulation (courtesy of Paolo):

Code=02 00 00 b8 08 00 00 00 f3 0f 6f 44 0a f0 f3 0f 6f 4c 0a e0 <66> 0f e7 41 f0 66 0f e7 49 e0 48 83 e9 40 f3 0f 6f 44 0a 10 f3 0f 6f 0c 0a 66 0f e7 41 10

$ as -o a.out
        .section .text
        .byte 0x66, 0x0f, 0xe7, 0x41, 0xf0
        .byte 0x66, 0x0f, 0xe7, 0x49, 0xe0
$ objdump -d a.out
   0:   66 0f e7 41 f0          movntdq %xmm0,-0x10(%rcx)
   5:   66 0f e7 49 e0          movntdq %xmm1,-0x20(%rcx)

Add the necessary emulation.

Signed-off-by: Alex Williamson <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: vmx: VMXOFF emulation in vm86 should cause #UD  (Nadav Amit, 1 file, -6/+8)

Unlike VMCALL, the instructions VMXOFF, VMLAUNCH and VMRESUME should cause a #UD exception in real mode or vm86. However, the emulator considers all these instructions the same for the purposes of mode checks and of emulation upon exit due to #UD. As a result, the hypervisor behaves incorrectly in vm86 mode: VMXOFF, VMLAUNCH and VMRESUME in vm86 cause an exit due to #UD, and the hypervisor then emulates these instructions and injects #GP into the guest instead of #UD.

This patch creates a new group for these instructions and marks only VMCALL as an instruction that can be emulated.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: x86: fix some sparse warnings  (Paolo Bonzini, 2 files, -7/+7)

Sparse reports the following easily fixed warnings:

arch/x86/kvm/vmx.c:8795:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:2138:5: sparse: symbol vmx_read_l1_tsc was not declared. Should it be static?
arch/x86/kvm/vmx.c:6151:48: sparse: Using plain integer as NULL pointer
arch/x86/kvm/vmx.c:8851:6: sparse: symbol vmx_sched_in was not declared. Should it be static?
arch/x86/kvm/svm.c:2162:5: sparse: symbol svm_read_l1_tsc was not declared. Should it be static?

Cc: Radim Krčmář <[email protected]>
Signed-off-by: Fengguang Wu <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
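Both classes of warning have mechanical fixes; an illustrative sketch (not the exact hunks from the patch):

    /* Use NULL rather than a plain 0 for pointer assignments... */
    vmx->nested.apic_access_page = NULL;    /* was: = 0 */

    /* ...and mark file-local symbols static so each definition has a
     * visible declaration. */
    static u64 vmx_read_l1_tsc(struct kvm_vcpu *vcpu, u64 host_tsc);
    static void vmx_sched_in(struct kvm_vcpu *vcpu, int cpu);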
2014-08-29  KVM: nVMX: nested TPR shadow/threshold emulation  (Wanpeng Li, 1 file, -3/+51)

This patch fixes bug https://bugzilla.kernel.org/show_bug.cgi?id=61411

The TPR shadow/threshold feature is important to speed up the Windows guest, and it is a required feature for certain VMMs. We map the virtual APIC page address and TPR threshold from the L1 VMCS. If a TPR_BELOW_THRESHOLD VM exit is triggered by the L2 guest and L1 is interested in it, we inject it into the L1 VMM for handling.

Reviewed-by: Paolo Bonzini <[email protected]>
Signed-off-by: Wanpeng Li <[email protected]>
[Add PAGE_ALIGNED check, do not write useless virtual APIC page address if TPR shadowing is disabled. - Paolo]
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: nVMX: introduce nested_get_vmcs12_pages  (Wanpeng Li, 1 file, -12/+25)

Introduce the function nested_get_vmcs12_pages() to check the validity of the nested APIC access page and virtual APIC page earlier.

Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: Unconditionally export KVM_CAP_USER_NMI  (Christoffer Dall, 1 file, -3/+1)

The idea behind capabilities and the KVM_CHECK_EXTENSION ioctl is that userspace can, at run-time, determine if a feature is supported or not. This allows KVM to begin supporting a new feature with a new kernel version without any need to update user space. Unfortunately, since the definition of KVM_CAP_USER_NMI was guarded by #ifdef __KVM_HAVE_USER_NMI, such discovery still required a user space update.

Therefore, unconditionally export KVM_CAP_USER_NMI and fix the typo in the comment for the IOCTL number definition as well.

Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-29  KVM: Unconditionally export KVM_CAP_READONLY_MEM  (Christoffer Dall, 2 files, -3/+1)

The idea behind capabilities and the KVM_CHECK_EXTENSION ioctl is that userspace can, at run-time, determine if a feature is supported or not. This allows KVM to begin supporting a new feature with a new kernel version without any need to update user space. Unfortunately, since the definition of KVM_CAP_READONLY_MEM was guarded by #ifdef __KVM_HAVE_READONLY_MEM, such discovery still required a user space update.

Therefore, unconditionally export KVM_CAP_READONLY_MEM and change the in-kernel conditional to rely on __KVM_HAVE_READONLY_MEM.

Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
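With the capability exported unconditionally, a userspace probe works against any kernel without rebuilding against updated headers. A minimal sketch:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <linux/kvm.h>

    int main(void)
    {
            int kvm = open("/dev/kvm", O_RDWR);
            if (kvm < 0)
                    return 1;
            /* > 0 means supported; 0 means the running kernel does not
             * support (or does not know about) the capability. */
            int ro = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_READONLY_MEM);
            printf("KVM_CAP_READONLY_MEM: %s\n",
                   ro > 0 ? "supported" : "not supported");
            return 0;
    }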
2014-08-29  KVM: s390/mm: fix up indentation of set_guest_storage_key  (Christian Borntraeger, 1 file, -6/+6)

Commit ab3f285f227f ("KVM: s390/mm: try a cow on read only pages for key ops") misaligned a code block. Let's fix up the indentation.

Reported-by: Ben Hutchings <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-26  Merge tag 'kvm-s390-next-20140825' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD  (Paolo Bonzini, 13 files, -585/+501)

KVM: s390: Fixes and features for 3.18 part 1

1. The usual cleanups: get rid of duplicate code, use defines, factor out the sync_reg handling, additional docs for sync_regs, better error handling on interrupt injection
2. We use KVM_REQ_TLB_FLUSH instead of open-coding tlb flushes
3. Additional registers for kvm_run sync regs. This is usually not needed in the fast path due to eventfd/irqfd, but kvm stat claims that we reduced the overhead of console output by ~50% on my system
4. A rework of the gmap infrastructure. This is the 2nd step towards host large page support (after getting rid of the storage key dependency). We introduce two radix trees to store the guest-to-host and host-to-guest translations. This gets rid of most of the page-table walks in the gmap code. Only one in __gmap_link is left; it is required to link the shadow page table to the process page table. Finally, this contains the plumbing to support gmap page tables with fewer than 5 levels.
2014-08-26  KVM: s390/mm: remove outdated gmap data structures  (Martin Schwidefsky, 1 file, -23/+0)

The radix tree rework removed all code that uses the gmap_rmap and gmap_pgtable data structures. Remove these outdated definitions.

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-26  KVM: s390/mm: support gmap page tables with less than 5 levels  (Martin Schwidefsky, 3 files, -33/+59)

Add an addressing limit to the gmap address spaces and only allocate the page table levels that are needed for the given limit. The limit is fixed and cannot be changed after a gmap has been created.

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-26  KVM: s390/mm: use radix trees for guest to host mappings  (Martin Schwidefsky, 7 files, -383/+308)

Store the target address for the gmap segments in a radix tree instead of using invalid segment table entries. gmap_translate becomes a simple radix_tree_lookup, and gmap_fault is split into the address translation with gmap_translate and the part that does the linking of the gmap shadow page table with the process page table.

A second radix tree is used to keep the pointers to the segment table entries for segments that are mapped in the guest address space. On unmap of a segment, the pointer is retrieved from the radix tree and is used to carry out the segment invalidation in the gmap shadow page table. As the radix tree can only store one pointer, each host segment may only be mapped to exactly one guest location.

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
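The simplified translation path, sketched close to the resulting __gmap_translate() (segment-granular: the tree is indexed by guest address >> PMD_SHIFT and stores the host segment address):

    unsigned long __gmap_translate(struct gmap *gmap, unsigned long gaddr)
    {
            unsigned long vmaddr;

            vmaddr = (unsigned long)
                    radix_tree_lookup(&gmap->guest_to_host,
                                      gaddr >> PMD_SHIFT);
            /* Reattach the within-segment offset, or fail if unmapped. */
            return vmaddr ? (vmaddr | (gaddr & ~PMD_MASK)) : -EFAULT;
    }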
2014-08-25  kvm: x86: fix tracing for 32-bit  (Paolo Bonzini, 1 file, -2/+2)

Fix commit 7b46268d29543e313e731606d845e65c17f232e4, which mistakenly included the new tracepoint under #ifdef CONFIG_X86_64.

Reported-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-25  Merge tag 'kvm-s390-20140825' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD  (Paolo Bonzini, 2 files, -13/+10)

Here are two fixes for s390 KVM code that prevent a malicious user from:
1. triggering a kernel BUG
2. changing the storage key of read-only pages
2014-08-25  KVM: s390/mm: cleanup gmap function arguments, variable names  (Martin Schwidefsky, 7 files, -70/+72)

Make the order of arguments for the gmap calls more consistent; if the gmap pointer is passed, it is always the first argument. In addition, distinguish between guest address and user address by naming the variables gaddr for a guest address and vmaddr for a user address.

Signed-off-by: Martin Schwidefsky <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390/mm: readd address parameter to gmap_do_ipte_notify  (Martin Schwidefsky, 2 files, -3/+4)

Revert git commit c3a23b9874c1 ("remove unnecessary parameter from gmap_do_ipte_notify").

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390/mm: readd address parameter to pgste_ipte_notify  (Martin Schwidefsky, 1 file, -8/+9)

Revert git commit 1b7fd6952063 ("remove unecessary parameter from pgste_ipte_notify").

Signed-off-by: Martin Schwidefsky <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: don't use kvm lock in interrupt injection code  (Jens Freimann, 1 file, -2/+0)

The kvm lock protects us against vcpus going away, but they only go away when the virtual machine is shut down. We don't need this mutex here, so let's get rid of it.

Signed-off-by: Jens Freimann <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: return -EFAULT if lowcore is not mapped during irq delivery  (Jens Freimann, 3 files, -25/+24)

Currently we just kill the userspace process and exit the thread immediately, without making sure that we don't hold any locks, etc. Improve this by making KVM_RUN return -EFAULT if the lowcore is not mapped during interrupt delivery. To achieve this, we need to pass the return code of the guest memory access routines used in interrupt delivery all the way back to the KVM_RUN ioctl.

Signed-off-by: Jens Freimann <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: implement KVM_REQ_TLB_FLUSH and make use of it  (David Hildenbrand, 2 files, -3/+9)

Use the KVM_REQ_TLB_FLUSH request in order to trigger tlb flushes instead of manipulating the SIE control block whenever we need it. Also trigger it for a control register sync directly instead of (ab)using kvm_s390_set_prefix().

Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
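The generic request API gives the same effect without poking the SIE control block from arbitrary contexts; a sketch of the pattern (the request calls are the real kvm API, the consumer helper name is hypothetical):

    /* Producer side: flag the vcpu instead of touching hardware state. */
    kvm_make_request(KVM_REQ_TLB_FLUSH, vcpu);

    /* Consumer side, run before (re)entering the guest: */
    if (kvm_check_request(KVM_REQ_TLB_FLUSH, vcpu))
            flush_guest_tlb(vcpu);  /* hypothetical arch helper */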
2014-08-25  KVM: s390: synchronize more registers with kvm_run  (David Hildenbrand, 2 files, -14/+56)

In order to reduce the number of syscalls when dropping to user space, this patch enables the synchronization of the following "registers" with kvm_run:

- ARCH0: CPU timer, clock comparator, TOD programmable register, guest breaking-event register, program parameter
- PFAULT: pfault parameters (token, select, compare)

The registers are grouped to reduce the overhead when syncing. As this grows the number of sync registers quite a bit, let's move the code synchronizing registers with kvm_run from kvm_arch_vcpu_ioctl_run() into separate helper routines.

Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: Christian Borntraeger <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: no special machine check delivery  (Christian Borntraeger, 3 files, -66/+0)

The load PSW handler does not have to inject pending machine checks. This can wait until the CPU runs the generic interrupt injection code.

Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
2014-08-25  KVM: s390: clear kvm_dirty_regs when dropping to user space  (David Hildenbrand, 1 file, -4/+2)

We should make sure that all kvm_dirty_regs bits are cleared before dropping to user space. Until now, some would remain pending.

Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: clarify the idea of kvm_dirty_regs  (David Hildenbrand, 1 file, -0/+4)

This patch clarifies that kvm_dirty_regs are just a hint to the kernel and that the kernel might just ignore some flags and sync the values anyway (as is done for acrs and gprs now).

Signed-off-by: David Hildenbrand <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: factor out get_ilc() function  (Jens Freimann, 1 file, -20/+21)

Let's make this a reusable function.

Signed-off-by: Jens Freimann <[email protected]>
Acked-by: Cornelia Huck <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390/mm: try a cow on read only pages for key ops  (Christian Borntraeger, 1 file, -0/+10)

The PFMF instruction handler blindly wrote the storage key even if the page was mapped R/O in the host. Let's try a COW before continuing and bail out in case of errors.

Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Dominik Dingel <[email protected]>
Cc: [email protected]
2014-08-25  KVM: s390: add defines for pfault init delivery code  (Jens Freimann, 1 file, -2/+4)

Get rid of open-coded values for pfault init.

Signed-off-by: Jens Freimann <[email protected]>
Acked-by: Cornelia Huck <[email protected]>
Signed-off-by: Christian Borntraeger <[email protected]>
2014-08-25  KVM: s390: Fix user triggerable bug in dead code  (Christian Borntraeger, 1 file, -13/+0)

In the early days, we had some special handling for the KVM_EXIT_S390_SIEIC exit, but this was gone in 2009 with commit d7b0b5eb3000 (KVM: s390: Make psw available on all exits, not just a subset). Now this switch statement is just a sanity check for userspace not messing with the kvm_run structure. Unfortunately, this allows userspace to trigger a kernel BUG. Let's just remove this switch statement.

Signed-off-by: Christian Borntraeger <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: [email protected]
2014-08-21  KVM: trace kvm_ple_window grow/shrink  (Radim Krčmář, 3 files, -0/+35)

Tracepoint for dynamic PLE window, fired on every potential change.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-21  KVM: VMX: dynamise PLE window  (Radim Krčmář, 1 file, -2/+93)

The window is increased on every PLE exit and decreased on every sched_in. The idea is that we don't want to PLE exit if there is no preemption going on. We do this with sched_in() because it does not hold the rq lock.

There are two new kernel parameters for changing the window: ple_window_grow and ple_window_shrink. ple_window_grow affects the window on PLE exit and ple_window_shrink does it on sched_in; depending on their value, the window is modified like this (ple_window is kvm_intel's global):

          ple_window_shrink/ |
    ple_window_grow          | PLE exit           | sched_in
    -------------------------+--------------------+---------------------
    < 1                      |  = ple_window      |  = ple_window
    < ple_window             | *= ple_window_grow | /= ple_window_shrink
    otherwise                | += ple_window_grow | -= ple_window_shrink

A third new parameter, ple_window_max, controls the maximal ple_window; it is internally rounded down to a closest multiple of ple_window_grow. A VCPU's PLE window is never allowed below ple_window.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
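A sketch of the update rule encoded in the table above (parameter names as in the commit; the helper name and defaults are illustrative):

    static int ple_window = 4096;           /* module parameters; */
    static int ple_window_grow = 2;         /* defaults illustrative */
    static int ple_window_shrink = 0;

    /* Hypothetical helper: apply one grow (PLE exit) or shrink
     * (sched_in) step to a vcpu's current window value. */
    static int update_ple_window(int val, int param, bool grow)
    {
            if (param < 1)
                    return ple_window;      /* reset to the global default */
            if (param < ple_window)         /* small values act as factors */
                    return grow ? val * param : val / param;
            return grow ? val + param : val - param;  /* else as deltas */
    }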
2014-08-21  KVM: VMX: make PLE window per-VCPU  (Radim Krčmář, 1 file, -1/+11)

Change the PLE window into a per-VCPU variable, seeded from the module parameter, to allow greater flexibility. Brings in a small overhead on every vmentry.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-21  KVM: x86: introduce sched_in to kvm_x86_ops  (Radim Krčmář, 4 files, -0/+15)

The sched_in preempt notifier is available for x86; allow its use in specific virtualization technologies as well.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-21  KVM: add kvm_arch_sched_in  (Radim Krčmář, 7 files, -0/+24)

Introduce preempt notifiers for architecture-specific code. The advantage over creating a new notifier in every arch is slightly simpler code and a guaranteed call order with respect to kvm_sched_in.

Signed-off-by: Radim Krčmář <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-21  KVM: x86: Replace X86_FEATURE_NX offset with the definition  (Nadav Amit, 1 file, -2/+2)

Replace the reference to X86_FEATURE_NX that uses a bit shift with the X86_FEATURE_NX define itself.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-21  KVM: avoid unnecessary synchronize_rcu  (Christian Borntraeger, 1 file, -1/+2)

We don't have to wait for a grace period if there is no oldpid that we are going to free. put_pid also checks for NULL, so this patch only fences synchronize_rcu.

Signed-off-by: Christian Borntraeger <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
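Roughly the resulting logic in the KVM_RUN path (a sketch; exact field access may differ):

    if (unlikely(vcpu->pid != current->pids[PIDTYPE_PID].pid)) {
            /* The thread running this VCPU changed. */
            struct pid *oldpid = vcpu->pid;
            struct pid *newpid = get_task_pid(current, PIDTYPE_PID);

            rcu_assign_pointer(vcpu->pid, newpid);
            if (oldpid)                     /* the new check: skip the grace */
                    synchronize_rcu();      /* period on the first KVM_RUN */
            put_pid(oldpid);                /* put_pid(NULL) is a no-op */
    }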
2014-08-20  KVM: emulate: warn on invalid or uninitialized exception numbers  (Paolo Bonzini, 2 files, -1/+5)

These were reported when running Jailhouse on AMD processors. Initialize ctxt->exception.vector with an invalid exception number, and warn if it remained invalid even though the emulator got an X86EMUL_PROPAGATE_FAULT return code.

Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-20  KVM: emulate: do not return X86EMUL_PROPAGATE_FAULT explicitly  (Paolo Bonzini, 1 file, -5/+3)

Always get it through emulate_exception or emulate_ts. This ensures that the ctxt->exception fields have been populated.

Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-20  KVM: x86: Clarify PMU related features bit manipulation  (Nadav Amit, 1 file, -10/+14)

kvm_pmu_cpuid_update makes a lot of bit manipulation operations, when in fact there are already unions that can be used instead. Change the bit manipulation to use the unions for clarity. This patch does not change the functionality.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-20  KVM: vmx: fix ept reserved bits for 1-GByte page  (Wanpeng Li, 1 file, -10/+12)

The EPT misconfig handler in kvm checks which reason led to the EPT misconfiguration after a vmexit. One of the reasons is that an EPT paging-structure entry is configured with settings reserved for future functionality. However, the handler can't identify a paging-structure entry whose 1-GByte-page reserved bits are configured, since a PDPTE that points to a 1-GByte page reserves bits 29:12, rather than bits 7:3, which are reserved for a PDPTE that references an EPT page directory. This patch fixes it by reserving bits 29:12 for 1-GByte pages.

Signed-off-by: Wanpeng Li <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-19  KVM: x86: recalculate_apic_map after enabling apic  (Nadav Amit, 1 file, -11/+14)

Currently, recalculate_apic_map ignores vcpus whose lapic is software-disabled through the spurious interrupt vector. However, once it is re-enabled, the map is not recalculated. Therefore, if the guest OS configured DFR while the lapic was software-disabled, the map may be incorrect. This patch recalculates the apic map after software-enabling the lapic.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
2014-08-19  KVM: x86: Clear apic tsc-deadline after deadline  (Nadav Amit, 1 file, -0/+5)

Intel SDM 10.5.4.1 says "When the timer generates an interrupt, it disarms itself and clears the IA32_TSC_DEADLINE MSR". This patch clears the MSR upon timer interrupt delivery when the timer is in deadline mode. Since the MSR may be reconfigured while an interrupt is pending, causing the new value to be overridden, pending timer interrupts are checked before setting a new deadline.

Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Paolo Bonzini <[email protected]>
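Approximately the resulting delivery path in arch/x86/kvm/lapic.c (a sketch):

    void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
    {
            struct kvm_lapic *apic = vcpu->arch.apic;

            if (atomic_read(&apic->lapic_timer.pending) > 0) {
                    kvm_apic_local_deliver(apic, APIC_LVTT);
                    /* Per SDM 10.5.4.1: a deadline-mode timer disarms
                     * itself when its interrupt is delivered. */
                    if (apic_lvtt_tscdeadline(apic))
                            apic->lapic_timer.tscdeadline = 0;
                    atomic_set(&apic->lapic_timer.pending, 0);
            }
    }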