Age | Commit message | Author | Files | Lines
2014-09-24kvm: x86: Unpin and remove kvm_arch->apic_access_pageTang Chen3-9/+24
In order to make the APIC access page migratable, stop pinning it in memory. And because the APIC access page is not pinned in memory, we can remove kvm_arch->apic_access_page. When we need to write its physical address into vmcs, we use gfn_to_page() to get its page struct, which is needed to call page_to_phys(); the page is then immediately unpinned. Suggested-by: Gleb Natapov <[email protected]> Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: vmx: Implement set_apic_access_page_addrTang Chen1-6/+41
Currently, the APIC access page is pinned by KVM for the entire life of the guest. We want to make it migratable in order to make memory hot-unplug available for machines that run KVM. This patch prepares to handle this for the case where there is no nested virtualization, or where the nested guest does not have an APIC page of its own. All accesses to kvm->arch.apic_access_page are changed to go through kvm_vcpu_reload_apic_access_page. If the APIC access page is invalidated when the host is running, we update the VMCS in the next guest entry. If it is invalidated when the guest is running, the MMU notifier will force an exit, after which we will handle everything as in the previous case. If it is invalidated when a nested guest is running, the request will update either the VMCS01 or the VMCS02. Updating the VMCS01 is done at the next L2->L1 exit, while updating the VMCS02 is done in prepare_vmcs02. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: x86: Add request bit to reload APIC access page addressTang Chen3-0/+17
Currently, the APIC access page is pinned by KVM for the entire life of the guest. We want to make it migratable in order to make memory hot-unplug available for machines that run KVM. This patch prepares to handle this in generic code, through a new request bit (that will be set by the MMU notifier) and a new hook that is called whenever the request bit is processed. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
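The request-bit pattern described above can be sketched roughly as follows. This is a minimal userspace model, not the kernel code: `struct vcpu_sketch`, `REQ_APIC_PAGE_RELOAD`, and all function names are hypothetical, and plain bit operations are used where the kernel would use atomic ones.

```c
#include <stdbool.h>

/* Hypothetical request number, chosen only for this sketch. */
#define REQ_APIC_PAGE_RELOAD 3

struct vcpu_sketch {
	unsigned long requests; /* one bit per pending request */
	int reload_calls;       /* how many times the reload hook ran */
};

/* Producer side: e.g. an MMU notifier marks the request as pending.
 * (The kernel would use an atomic set_bit; this sketch is single-threaded.) */
static void make_request(int req, struct vcpu_sketch *vcpu)
{
	vcpu->requests |= 1UL << req;
}

/* Consumer side: report whether the request was pending and clear it. */
static bool check_request(int req, struct vcpu_sketch *vcpu)
{
	if (vcpu->requests & (1UL << req)) {
		vcpu->requests &= ~(1UL << req);
		return true;
	}
	return false;
}

/* Stand-in for the architecture hook that reloads the page address. */
static void reload_apic_access_page(struct vcpu_sketch *vcpu)
{
	vcpu->reload_calls++;
}

/* Run before each guest entry: the hook fires once per pending request. */
static void process_requests(struct vcpu_sketch *vcpu)
{
	if (check_request(REQ_APIC_PAGE_RELOAD, vcpu))
		reload_apic_access_page(vcpu);
}
```

Setting the bit twice before the next guest entry still results in a single hook call, which is the point of using a request bit rather than calling the hook directly.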
2014-09-24kvm: Add arch specific mmu notifier for page invalidationTang Chen6-0/+25
This will be used to let the guest run while the APIC access page is not pinned. Because subsequent patches will fill in the function for x86, place the (still empty) x86 implementation in the x86.c file instead of adding an inline function in kvm_host.h. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: Rename make_all_cpus_request() to kvm_make_all_cpus_request() and make it non-staticTang Chen2-5/+6
Different architectures need different requests, and in fact we will use this function in architecture-specific code later. This will be outside kvm_main.c, so make it non-static and rename it to kvm_make_all_cpus_request(). Reviewed-by: Paolo Bonzini <[email protected]> Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: Fix page ageing bugsAndres Lagar-Cavilla16-41/+71
1. We were calling clear_flush_young_notify in unmap_one, but we are within an mmu notifier invalidate range scope. The spte exists no more (due to range_start) and the accessed bit info has already been propagated (due to kvm_pfn_set_accessed). Simply call clear_flush_young.
2. We clear_flush_young on a primary MMU PMD, but this may be mapped as a collection of PTEs by the secondary MMU (e.g. during log-dirty). This required expanding the interface of the clear_flush_young mmu notifier, so a lot of code has been trivially touched.
3. In the absence of shadow_accessed_mask (e.g. EPT A bit), we emulate the access bit by blowing the spte. This requires proper synchronizing with MMU notifier consumers, like every other removal of spte's does.
Signed-off-by: Andres Lagar-Cavilla <[email protected]> Acked-by: Rik van Riel <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm/x86/mmu: Pass gfn and level to rmapp callback.Andres Lagar-Cavilla2-20/+33
Callbacks don't have to do extra computation to learn what the caller (kvm_handle_hva_range()) knows very well. Useful for debugging/tracing/printk/future. Signed-off-by: Andres Lagar-Cavilla <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-onlyPaolo Bonzini3-2/+16
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA. In that case, KVM will fail to patch VMCALL instructions to VMMCALL as required on AMD processors. The failure mode is currently a divide-by-zero exception, which obviously is a KVM bug that has to be fixed. However, picking the right instruction between VMCALL and VMMCALL will be faster and will help if you cannot upgrade the hypervisor. Reported-by: Chris Webb <[email protected]> Tested-by: Chris Webb <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: [email protected] Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: x86: use macros to compute bank MSRsChen Yucong1-4/+4
Avoid open coded calculations for bank MSRs by using well-defined macros that hide the index of higher bank MSRs. No semantic changes. Signed-off-by: Chen Yucong <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
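To illustrate the difference between open-coded arithmetic and bank macros, here is a small sketch. It assumes the usual machine-check bank layout (four registers per bank, starting at 0x400); the macro names follow the familiar `MSR_IA32_MCx_*` pattern, but the helper functions are hypothetical and exist only for comparison.

```c
#include <stdint.h>

/* Base addresses of the bank 0 machine-check registers. */
#define MSR_IA32_MC0_CTL    0x00000400
#define MSR_IA32_MC0_STATUS 0x00000401
#define MSR_IA32_MC0_ADDR   0x00000402
#define MSR_IA32_MC0_MISC   0x00000403

/* Each bank occupies four consecutive MSRs, so bank x starts 4*x later. */
#define MSR_IA32_MCx_CTL(x)    (MSR_IA32_MC0_CTL + 4 * (x))
#define MSR_IA32_MCx_STATUS(x) (MSR_IA32_MC0_STATUS + 4 * (x))
#define MSR_IA32_MCx_ADDR(x)   (MSR_IA32_MC0_ADDR + 4 * (x))
#define MSR_IA32_MCx_MISC(x)   (MSR_IA32_MC0_MISC + 4 * (x))

/* Open-coded form: the reader has to know that "+ 1" means STATUS. */
static uint32_t bank_status_open_coded(unsigned int bank)
{
	return MSR_IA32_MC0_CTL + 4 * bank + 1;
}

/* Macro form: same value, but the intent is spelled out. */
static uint32_t bank_status_macro(unsigned int bank)
{
	return MSR_IA32_MCx_STATUS(bank);
}
```

Both helpers compute the same MSR index for any bank, which matches the "no semantic changes" note in the commit message.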
2014-09-24KVM: x86: Remove debug assertion of non-PAE reserved bitsNadav Amit1-2/+1
Commit 346874c9507a ("KVM: x86: Fix CR3 reserved bits") removed non-PAE reserved bits which were not according to Intel SDM. However, residue was left in a debug assertion (CR3_NONPAE_RESERVED_BITS). Remove it. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: don't take vcpu mutex for obviously invalid vcpu ioctlsDavid Matlack1-0/+4
vcpu ioctls can hang the calling thread if issued while a vcpu is running. However, invalid ioctls can happen when userspace tries to probe the kind of file descriptors (e.g. isatty() calls ioctl(TCGETS)); in that case, we know the ioctl is going to be rejected as invalid anyway and we can fail before trying to take the vcpu mutex. This patch does not change functionality, it just makes invalid ioctls fail faster. Cc: [email protected] Signed-off-by: David Matlack <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
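The early-exit idea can be sketched as below. This is a simplified model under stated assumptions: `IOC_TYPE` mimics the common Linux ioctl encoding (type byte in bits 8 to 15), `KVMIO` is 0xAE as in the KVM UAPI headers, and `struct vcpu_sketch` with its `lock_count` field is a hypothetical stand-in for the vcpu mutex.

```c
#include <errno.h>

/* Type ("magic") byte of an ioctl number in the common Linux encoding. */
#define IOC_TYPE(nr) (((nr) >> 8) & 0xff)

/* Magic byte shared by all KVM ioctls. */
#define KVMIO 0xAE

struct vcpu_sketch {
	int lock_count; /* stands in for taking the vcpu mutex */
};

static long vcpu_ioctl_sketch(struct vcpu_sketch *vcpu, unsigned int ioctl)
{
	/* Reject foreign ioctls before touching the (possibly held) mutex. */
	if (IOC_TYPE(ioctl) != KVMIO)
		return -EINVAL;

	vcpu->lock_count++; /* a real implementation could block here */
	/* ... dispatch on the ioctl number ... */
	return 0;
}
```

With this check, a probe such as `TCGETS` (0x5401, magic byte `'T'`) fails immediately, while a KVM ioctl such as `KVM_RUN` (0xAE80) proceeds to the locking path.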
2014-09-24kvm: Faults which trigger IO release the mmap_semAndres Lagar-Cavilla5-6/+63
When KVM handles a tdp fault it uses FOLL_NOWAIT. If the guest memory has been swapped out or is behind a filemap, this will trigger async readahead and return immediately. The rationale is that KVM will kick back the guest with an "async page fault" and allow for some other guest process to take over. If async PFs are enabled the fault is retried asap from an async workqueue. If not, it's retried immediately in the same code path. In either case the retry will not relinquish the mmap semaphore and will block on the IO. This is a bad thing, as other mmap semaphore users now stall as a function of swap or filemap latency. This patch ensures both the regular and async PF path re-enter the fault allowing for the mmap semaphore to be relinquished in the case of IO wait. Reviewed-by: Radim Krčmář <[email protected]> Signed-off-by: Andres Lagar-Cavilla <[email protected]> Acked-by: Andrew Morton <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: x86: fix two typos in commentTiejun Chen2-2/+2
s/drity/dirty and s/vmsc01/vmcs01 Signed-off-by: Tiejun Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: vmx: Inject #GP on invalid PAT CRNadav Amit3-2/+7
A guest which sets the PAT CR to an invalid value should get a #GP. Currently, if vmx supports loading PAT CR during entry, then the value is not checked. This patch makes the required check in that case. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
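A validity check of this kind might look like the sketch below. It relies on the architectural PAT layout (eight one-byte entries, with memory types 0, 1, 4, 5, 6 and 7 defined and everything else reserved); the function names are hypothetical and this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Defined PAT memory types: 0 UC, 1 WC, 4 WT, 5 WP, 6 WB, 7 UC-.
 * Types 2 and 3, and any value above 7, are reserved. */
static bool pat_entry_valid(uint8_t type)
{
	return type == 0 || type == 1 || (type >= 4 && type <= 7);
}

/* The PAT value holds eight entries, one per byte; all must be valid. */
static bool pat_valid(uint64_t pat)
{
	for (int i = 0; i < 8; i++) {
		if (!pat_entry_valid((uint8_t)(pat >> (i * 8))))
			return false;
	}
	return true;
}
```

A caller emulating the write would inject #GP when `pat_valid()` returns false, and accept the value otherwise.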
2014-09-24KVM: x86: emulating descriptor load misses long-mode caseNadav Amit1-0/+9
In 64-bit mode a #GP should be delivered to the guest "if the code segment descriptor pointed to by the selector in the 64-bit gate doesn't have the L-bit set and the D-bit clear." - Intel SDM "Interrupt 13—General Protection Exception (#GP)". This patch fixes the behavior of CS loading emulation code. Although the comment says that segment loading is not supported in long mode, this function is executed in long mode, so the fix is necessary. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: x86: directly use kvm_make_request againLiang Chen4-14/+7
A one-line wrapper around kvm_make_request is not particularly useful. Replace kvm_mmu_flush_tlb() with kvm_make_request(). Signed-off-by: Liang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: x86: count actual tlb flushesRadim Krčmář2-2/+7
- we count KVM_REQ_TLB_FLUSH requests, not actual flushes (KVM can have multiple requests for one flush)
- flushes from kvm_flush_remote_tlbs aren't counted
- it's easy to make a direct request by mistake
Solve these by postponing the counting to kvm_check_request().
Signed-off-by: Radim Krčmář <[email protected]> Signed-off-by: Liang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
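The difference between counting requests and counting flushes can be shown with a small model. All names here (`struct flush_stats`, `request_tlb_flush`, `enter_guest`) are hypothetical; the sketch only assumes that several requests may collapse into one pending flag that is consumed once before guest entry.

```c
#include <stdbool.h>

struct flush_stats {
	bool flush_pending;          /* collapsed request flag */
	unsigned long requests_made; /* what the old counter measured */
	unsigned long tlb_flush;     /* what the new counter measures */
};

/* Any number of callers may ask for a flush; only a flag is recorded. */
static void request_tlb_flush(struct flush_stats *s)
{
	s->flush_pending = true;
	s->requests_made++;
}

/* The flush is performed, and counted, where the request is consumed. */
static void enter_guest(struct flush_stats *s)
{
	if (s->flush_pending) {
		s->flush_pending = false;
		s->tlb_flush++; /* one actual flush, however many requests */
	}
}
```

Three requests followed by one guest entry leave `requests_made` at 3 but `tlb_flush` at 1, so counting at the consumption point reports actual flushes regardless of which code path made the request.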
2014-09-24KVM: nested VMX: disable perf cpuid reportingMarcelo Tosatti2-0/+8
Initialization of L2 guest with -cpu host, on L1 guest with -cpu host triggers: (qemu) KVM: entry failed, hardware error 0x7 ... nested_vmx_run: VMCS MSR_{LOAD,STORE} unsupported Nested VMX MSR load/store support is not sufficient to allow perf for L2 guest. Until properly fixed, trap CPUID and disable function 0xA. Signed-off-by: Marcelo Tosatti <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: x86: Don't report guest userspace emulation error to userspaceNadav Amit1-1/+1
Commit fc3a9157d314 ("KVM: X86: Don't report L2 emulation failures to user-space") disabled the reporting of L2 (nested guest) emulation failures to userspace due to a race condition between a vmexit and the instruction emulator. The same rationale also applies to userspace applications that are permitted by the guest OS to access MMIO area or perform PIO. This patch extends the current behavior - of injecting a #UD instead of reporting it to userspace - also for guest userspace code. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24kvm: Make init_rmode_tss() return 0 on success.Paolo Bonzini1-10/+3
In init_rmode_tss(), there are two variables indicating the return value, r and ret, and it returns 0 on error, 1 on success. The function is only called by vmx_set_tss_addr(), and ret is redundant. This patch removes the redundant variable, by making init_rmode_tss() return 0 on success, -errno on failure. Reviewed-by: Radim Krčmář <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: x86: Warn if guest virtual address space is not 48-bitsNadav Amit2-8/+15
The KVM emulator code assumes that the guest virtual address space (in 64-bit) is 48-bits wide. Fail the KVM_SET_CPUID and KVM_SET_CPUID2 ioctl if userspace tries to create a guest that does not obey this restriction. Signed-off-by: Nadav Amit <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
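A check along these lines could be sketched as follows. It uses the documented CPUID layout (leaf 0x80000008, EAX bits 15:8 report the linear address width); the function name is hypothetical, and treating a width of 0 as "leaf not provided" is an assumption of this sketch rather than something stated in the commit message.

```c
#include <errno.h>
#include <stdint.h>

/* Validate the virtual address width reported in CPUID.80000008H:EAX.
 * Returns 0 if acceptable, -EINVAL otherwise. */
static int check_vaddr_bits(uint32_t eax)
{
	uint32_t vaddr_bits = (eax >> 8) & 0xff; /* bits 15:8 */

	/* Assumption: 0 means the leaf was not filled in by userspace. */
	if (vaddr_bits != 48 && vaddr_bits != 0)
		return -EINVAL;
	return 0;
}
```

For example, an EAX value of 0x3028 (48 virtual bits, 40 physical bits) is accepted, while 0x3928 (57 virtual bits) is rejected.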
2014-09-24kvm-vfio: do not use module_initPaolo Bonzini3-2/+19
/me got confused between the kernel and QEMU. In the kernel, you can only have one module_init function, and it will prevent unloading the module unless you also have the corresponding module_exit function. So, commit 80ce1639727e (KVM: VFIO: register kvm_device_ops dynamically, 2014-09-02) broke unloading of the kvm module, by adding a module_init function and no module_exit. Repair it by making kvm_vfio_ops_init weak, and checking it in kvm_init. Cc: Will Deacon <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: Alex Williamson <[email protected]> Fixes: 80ce1639727e9d38729c34f162378508c307ca25 Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-24KVM: EVENTFD: Remove inclusion of irq.hChristoffer Dall1-1/+3
Commit c77dcac (KVM: Move more code under CONFIG_HAVE_KVM_IRQFD) added functionality that depends on definitions in ioapic.h when __KVM_HAVE_IOAPIC is defined. At the same time, kvm-arm commit 0ba0951 (KVM: EVENTFD: remove inclusion of irq.h) removed the inclusion of irq.h, an architecture-specific header that is not present on ARM but which happened to include ioapic.h on x86. Include ioapic.h directly in eventfd.c if __KVM_HAVE_IOAPIC is defined. This fixes x86 and lets ARM use eventfd.c. Signed-off-by: Christoffer Dall <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17kvm: Make init_rmode_identity_map() return 0 on success.Tang Chen1-10/+8
In init_rmode_identity_map(), there are two variables indicating the return value, r and ret, and it returns 0 on error, 1 on success. The function is only called by vmx_create_vcpu(), and ret is redundant. This patch removes the redundant variable, and makes init_rmode_identity_map() return 0 on success, -errno on failure. Signed-off-by: Tang Chen <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17kvm: Remove ept_identity_pagetable from struct kvm_arch.Tang Chen3-28/+22
kvm_arch->ept_identity_pagetable holds the ept identity pagetable page. But it is never used to refer to the page at all. In vcpu initialization, it indicates two things: 1. indicates if ept page is allocated 2. indicates if a memory slot for identity page is initialized Actually, kvm_arch->ept_identity_pagetable_done is enough to tell if the ept identity pagetable is initialized. So we can remove ept_identity_pagetable. NOTE: In the original code, ept identity pagetable page is pinned in memory. As a result, it cannot be migrated/hot-removed. After this patch, since kvm_arch->ept_identity_pagetable is removed, ept identity pagetable page is no longer pinned in memory. And it can be migrated/hot-removed. Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17KVM: VFIO: register kvm_device_ops dynamicallyWill Deacon3-12/+15
Now that we have a dynamic means to register kvm_device_ops, use that for the VFIO kvm device, instead of relying on the static table. This is achieved by a module_init call to register the ops with KVM. Cc: Gleb Natapov <[email protected]> Cc: Paolo Bonzini <[email protected]> Acked-by: Alex Williamson <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17KVM: s390: register flic ops dynamicallyCornelia Huck4-6/+3
Using the new kvm_register_device_ops() interface lets us get rid of an #ifdef in common code. Cc: Gleb Natapov <[email protected]> Cc: Paolo Bonzini <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17KVM: ARM: vgic: register kvm_device_ops dynamicallyWill Deacon3-83/+79
Now that we have a dynamic means to register kvm_device_ops, use that for the ARM VGIC, instead of relying on the static table. Cc: Gleb Natapov <[email protected]> Cc: Paolo Bonzini <[email protected]> Acked-by: Marc Zyngier <[email protected]> Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-17KVM: device: add simple registration mechanism for kvm_device_opsWill Deacon3-33/+55
kvm_ioctl_create_device currently has knowledge of all the device types and their associated ops. This is fairly inflexible when adding support for new in-kernel device emulations, so move what we currently have out into a table, which can support dynamic registration of ops by new drivers for virtual hardware. Cc: Alex Williamson <[email protected]> Cc: Alex Graf <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Marc Zyngier <[email protected]> Acked-by: Cornelia Huck <[email protected]> Reviewed-by: Christoffer Dall <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
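A table-based registration scheme of the kind described could look roughly like this. It is a sketch under assumptions: the table size, the `struct device_ops_sketch` type, the function names, and the particular error codes are all chosen for illustration and are not taken from the patch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical upper bound on device type numbers. */
#define DEV_TYPE_MAX 8

struct device_ops_sketch {
	const char *name;
};

/* One slot per device type; NULL means nothing is registered. */
static const struct device_ops_sketch *ops_table[DEV_TYPE_MAX];

/* Called by a driver at init time to claim its device type. */
static int register_device_ops(const struct device_ops_sketch *ops,
			       unsigned int type)
{
	if (type >= DEV_TYPE_MAX)
		return -ENOSPC;  /* type does not fit in the table */
	if (ops_table[type] != NULL)
		return -EEXIST;  /* another driver already owns this type */
	ops_table[type] = ops;
	return 0;
}

/* Called by the create-device path instead of a hard-coded switch. */
static const struct device_ops_sketch *lookup_device_ops(unsigned int type)
{
	if (type >= DEV_TYPE_MAX)
		return NULL;
	return ops_table[type];
}
```

The generic create path then only needs `lookup_device_ops()`, so adding a new in-kernel device emulation does not require editing common code.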
2014-09-16kvm: ioapic: conditionally delay irq delivery during eoi broadcastZhang Haoyu3-2/+66
Currently, we call ioapic_service() immediately when we find the irq is still active during eoi broadcast. But for real hardware, there's some delay between the EOI writing and irq delivery. If we do not emulate this behavior, and re-inject the interrupt immediately after the guest sends an EOI and re-enables interrupts, a guest might spend all its time in the ISR if it has a broken handler for a level-triggered interrupt. Such livelock actually happens with Windows guests when resuming from hibernation. As there's no way to distinguish a broken handler from newly raised interrupts, this patch delays an interrupt if 10,000 consecutive EOIs found that the interrupt was still high. The guest can then make a little forward progress, until a proper IRQ handler is set or until some detection routine in the guest (such as Linux's note_interrupt()) recognizes the situation. Cc: Michael S. Tsirkin <[email protected]> Signed-off-by: Jason Wang <[email protected]> Signed-off-by: Zhang Haoyu <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
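The counting logic described above can be modeled as in the sketch below. The threshold of 10,000 comes from the commit message; `struct pin_sketch`, `EOI_STORM_LIMIT`, and `handle_eoi` are hypothetical names, and the two counters merely record which path was taken instead of actually injecting or scheduling anything.

```c
#include <stdbool.h>

/* Consecutive EOIs with the line still high before delivery is delayed. */
#define EOI_STORM_LIMIT 10000

struct pin_sketch {
	bool line_high;        /* level-triggered line still asserted */
	unsigned int eoi_run;  /* consecutive EOIs that found it asserted */
	unsigned int immediate; /* re-injections done right away */
	unsigned int delayed;   /* re-injections deferred to later */
};

static void handle_eoi(struct pin_sketch *pin)
{
	if (!pin->line_high) {
		pin->eoi_run = 0; /* the handler made progress; reset the run */
		return;
	}

	if (++pin->eoi_run == EOI_STORM_LIMIT) {
		pin->eoi_run = 0;
		pin->delayed++;   /* give the guest a chance to run */
	} else {
		pin->immediate++; /* normal case: deliver again at once */
	}
}
```

In this model, a well-behaved handler that deasserts the line never reaches the limit, while a broken one is slowed down once every 10,000 EOIs.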
2014-09-16KVM: x86: Use kvm_make_request when applicableGuo Hui Liu1-8/+7
This patch replaces the set_bit method with kvm_make_request to make the code more readable and consistent. Signed-off-by: Guo Hui Liu <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-11KVM: x86: make apic_accept_irq tracepoint more genericPaolo Bonzini2-9/+6
Initially the tracepoint was added only to the APIC_DM_FIXED case, also because it reported coalesced interrupts that only made sense for that case. However, the coalesced argument is not used anymore and tracing other delivery modes is useful, so hoist the call out of the switch statement. Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-11kvm: Use APIC_DEFAULT_PHYS_BASE macro as the apic access page address.Tang Chen2-4/+5
We have APIC_DEFAULT_PHYS_BASE defined as 0xfee00000, which is also the address of apic access page. So use this macro. Signed-off-by: Tang Chen <[email protected]> Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-11Merge tag 'kvm-s390-next-20140910' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-nextPaolo Bonzini6-20/+74
KVM: s390: Fixes and features for next (3.18)
1. Crypto/CPACF support: To enable the MSA4 instructions we have to provide a common control structure for each SIE control block
2. Two cleanups found by a static code checker: one redundant assignment and one useless if
3. Fix the page handling of the diag10 ballooning interface. If the guest freed the pages at absolute 0 some checks and frees were incorrect
4. Limit guests to 16TB
5. Add __must_check to interrupt injection code
2014-09-10KVM: s390/interrupt: remove double assignmentChristian Borntraeger1-1/+0
r is already initialized to 0. Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Thomas Huth <[email protected]>
2014-09-10KVM: s390/cmm: Fix prefix handling for diag 10 balloonChristian Borntraeger1-8/+18
The old handling of prefix pages was broken in the diag10 ballooner. We now rely on gmap_discard to check for start > end and do a slow path if the prefix swap pages are affected:
1. discard the pages from start to prefix
2. discard the absolute 0 pages
3. discard the pages after prefix swap to end
Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Thomas Huth <[email protected]>
2014-09-10KVM: s390: get rid of constant condition in ipte_unlock_simpleChristian Borntraeger1-2/+1
Due to the earlier check we know that ipte_lock_count must be 0. No need to add a useless if. Let's make clear that we are going to always wakeup when we execute that code. Signed-off-by: Christian Borntraeger <[email protected]> Acked-by: Heiko Carstens <[email protected]>
2014-09-10KVM: s390: unintended fallthrough for external callChristian Borntraeger1-0/+1
We must not fallthrough if the conditions for external call are not met. Signed-off-by: Christian Borntraeger <[email protected]> Reviewed-by: Thomas Huth <[email protected]> Cc: [email protected]
2014-09-10KVM: s390: Limit guest size to 16TBChristian Borntraeger1-1/+1
Currently we fill up a full 5 level page table to hold the guest mapping. Since commit "support gmap page tables with less than 5 levels" we can do better. Having more than 4 TB might be useful for some testing scenarios, so let's just limit ourselves to 16TB guest size. Having more than that is totally untested as I do not have enough swap space/memory. We continue to allow ucontrol the full size. Signed-off-by: Christian Borntraeger <[email protected]> Acked-by: Cornelia Huck <[email protected]> Cc: Martin Schwidefsky <[email protected]>
2014-09-10KVM: s390: add __must_check to interrupt deliver functionsChristian Borntraeger2-7/+7
We now propagate interrupt injection errors back to the ioctl. We should mark functions that might fail with __must_check. Signed-off-by: Christian Borntraeger <[email protected]> Acked-by: Jens Freimann <[email protected]>
2014-09-10KVM: CPACF: Enable MSA4 instructions for kvm guestTony Krowiak2-1/+46
We have to provide a per guest crypto block for the CPUs to enable MSA4 instructions. According to icainfo on z196 or later this enables CCM-AES-128, CMAC-AES-128, CMAC-AES-192 and CMAC-AES-256. Signed-off-by: Tony Krowiak <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Reviewed-by: Michael Mueller <[email protected]> Signed-off-by: Christian Borntraeger <[email protected]> [split MSA4/protected key into two patches]
2014-09-10KVM: fix api documentation of KVM_GET_EMULATED_CPUIDAlex Bennée1-70/+70
It looks like when this was initially merged it got accidentally included in the following section. I've just moved it back in the correct section and re-numbered it as other ioctls have been added since. Signed-off-by: Alex Bennée <[email protected]> Acked-by: Borislav Petkov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-10KVM: document KVM_SET_GUEST_DEBUG apiAlex Bennée1-0/+44
In preparation for working on the ARM implementation I noticed the debug interface was missing from the API document. I've pieced together the expected behaviour from the code and commit messages and written it up as best I can. Signed-off-by: Alex Bennée <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-05KVM: remove redundant assignments in __kvm_set_memory_regionChristian Borntraeger1-3/+0
__kvm_set_memory_region sets r to EINVAL very early. Doing it again is not necessary. The same is true later on, where r is assigned -ENOMEM twice. Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-05KVM: remove redundant assignment of return value in kvm_dev_ioctlChristian Borntraeger1-2/+0
The first statement of kvm_dev_ioctl is long r = -EINVAL; No need to reassign the same value. Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-05KVM: remove redundant check of in_spin_loopChristian Borntraeger1-2/+1
The expression `vcpu->spin_loop.in_spin_loop' is always true, because it is evaluated only when the condition `!vcpu->spin_loop.in_spin_loop' is false. Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
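The simplification follows from short-circuit evaluation and can be checked exhaustively. The sketch below assumes the condition had the shape `!a || (a && b)`; the function names and the `dy_eligible` parameter name are illustrative and not quoted from the patch.

```c
#include <stdbool.h>

/* Shape of the condition before the cleanup: the inner test of
 * in_spin_loop is only reached when !in_spin_loop was false. */
static bool eligible_before(bool in_spin_loop, bool dy_eligible)
{
	return !in_spin_loop || (in_spin_loop && dy_eligible);
}

/* Shape after the cleanup: the redundant inner test is dropped. */
static bool eligible_after(bool in_spin_loop, bool dy_eligible)
{
	return !in_spin_loop || dy_eligible;
}
```

Both functions agree on all four input combinations, so removing the inner check does not change behavior.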
2014-09-05KVM: x86: propagate exception from permission checks on the nested page faultPaolo Bonzini4-11/+16
Currently, if a permission error happens during the translation of the final GPA to HPA, walk_addr_generic returns 0 but does not fill in walker->fault. To avoid this, add an x86_exception* argument to the translate_gpa function, and let it fill in walker->fault. The nested_page_fault field will be true, since the walk_mmu is the nested_mmu and translate_gpa instead operates on the "outer" (NPT) instance. Reported-by: Valentine Sinitsyn <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-05KVM: x86: skip writeback on injection of nested exceptionPaolo Bonzini2-6/+10
If a nested page fault happens during emulation, we will inject a vmexit, not a page fault. However because writeback happens after the injection, we will write ctxt->eip from L2 into the L1 EIP. We do not write back if an instruction caused an interception vmexit---do the same for page faults. Suggested-by: Gleb Natapov <[email protected]> Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-03KVM: nSVM: propagate the NPF EXITINFO to the guestPaolo Bonzini2-4/+32
This is similar to what the EPT code does with the exit qualification. This allows the guest to see a valid value for bits 33:32. Signed-off-by: Paolo Bonzini <[email protected]>
2014-09-03KVM: x86: reserve bit 8 of non-leaf PDPEs and PML4Es in 64-bit mode on AMDPaolo Bonzini2-2/+19
Bit 8 would be the "global" bit, which does not quite make sense for non-leaf page table entries. Intel ignores it; AMD ignores it in PDEs, but reserves it in PDPEs and PML4Es. The SVM test is relying on this behavior, so enforce it. Signed-off-by: Paolo Bonzini <[email protected]>
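The vendor-dependent rule stated above can be expressed as a small mask helper. This is a sketch only: the level numbering (2 for PDE, 3 for PDPE, 4 for PML4E), the function names, and the `is_amd` flag are assumptions made for illustration, and other reserved bits are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical level numbers for non-leaf entries in 64-bit paging. */
#define LEVEL_PDE   2
#define LEVEL_PDPE  3
#define LEVEL_PML4E 4

/* Mask of bit 8 if it is reserved for a non-leaf entry at this level.
 * Intel ignores the bit everywhere; AMD ignores it in PDEs but
 * reserves it in PDPEs and PML4Es. */
static uint64_t nonleaf_bit8_rsvd_mask(int level, bool is_amd)
{
	if (is_amd && (level == LEVEL_PDPE || level == LEVEL_PML4E))
		return 1ULL << 8;
	return 0;
}

/* True if the entry would trigger a reserved-bit fault because of bit 8. */
static bool bit8_is_rsvd_violation(uint64_t entry, int level, bool is_amd)
{
	return (entry & nonleaf_bit8_rsvd_mask(level, is_amd)) != 0;
}
```

An emulated page walk would fold this mask into its per-level reserved-bit mask, so the same entry can be legal on one vendor and faulting on the other.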