Age  Commit message  Author  Files  Lines
2013-01-24KVM: VMX: rename fix_pmode_dataseg to fix_pmode_seg.Gleb Natapov1-7/+7
The function deals with code segment too. Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: don't clobber segment AR of unusable segments.Gleb Natapov1-2/+0
Usability is returned in the unusable field, so there is no need to clobber the entire AR. Callers already have to know how to deal with unusable segments, since AR is not zeroed if emulate_invalid_guest_state=true. Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: skip vmx->rmode.vm86_active check on cr0 write if unrestricted guest is enabledGleb Natapov1-8/+6
vmx->rmode.vm86_active is never true if unrestricted guest is enabled. Make it more explicit that neither enter_pmode() nor enter_rmode() is called in this case. Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: remove hack that disables emulation on vcpu reset/initGleb Natapov1-3/+0
There is no reason for it. If state is suitable for vmentry it will be detected during guest entry and no emulation will happen. Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: if unrestricted guest is enabled vcpu state is always valid.Gleb Natapov1-0/+3
Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: reset CPL only on CS register write.Gleb Natapov1-1/+2
Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-24KVM: VMX: remove special CPL cache access during transition to real mode.Gleb Natapov1-8/+4
Since vmx_get_cpl() always returns 0 when the VCPU is in real mode, it is no longer needed. Also reset the CPL cache to zero during the transition to protected mode, since the transition may happen while CS.selector & 3 != 0 even though the CPL is really 0. Signed-off-by: Gleb Natapov <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
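The CPL rule described in this commit can be illustrated with a minimal userspace sketch. It is not the kernel code: `struct vcpu_sketch`, `sketch_get_cpl()` and `sketch_enter_pmode()` are hypothetical names, and the cache layout is an assumption made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the vCPU state involved. */
struct vcpu_sketch {
    bool     real_mode;    /* true while the guest runs in real mode */
    uint16_t cs_selector;  /* current CS selector */
    int      cpl_cache;    /* -1 means "not cached yet" */
};

/* Real mode always reports CPL 0; otherwise the CPL is taken from the
 * low two bits of the CS selector and cached. */
static int sketch_get_cpl(struct vcpu_sketch *v)
{
    if (v->real_mode)
        return 0;
    if (v->cpl_cache < 0)
        v->cpl_cache = v->cs_selector & 3;
    return v->cpl_cache;
}

/* On the switch to protected mode the cache is forced to 0, because the
 * selector may still carry non-zero low bits left over from real mode. */
static void sketch_enter_pmode(struct vcpu_sketch *v)
{
    v->real_mode = false;
    v->cpl_cache = 0;
}
```

With a real-mode selector such as 0x1003, both the real-mode lookup and the first lookup after `sketch_enter_pmode()` report 0, which is the behaviour the commit message asks for.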
2013-01-23KVM: x86 emulator: convert a few freestanding emulations to fastopAvi Kivity1-3/+3
Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: rearrange fastop definitionsAvi Kivity1-35/+35
Make fastop opcodes usable in other emulations. Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: convert 2-operand IMUL to fastopAvi Kivity1-8/+6
Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: convert BT/BTS/BTR/BTC/BSF/BSR to fastopAvi Kivity1-50/+26
Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: convert INC/DEC to fastopAvi Kivity1-17/+7
Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: convert SETCC to fastopAvi Kivity1-31/+29
This is a bit of a special case since we don't have the usual byte/word/long/quad switch; instead we switch on the condition code embedded in the instruction. Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
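For readers unfamiliar with the encoding, the condition code that SETcc carries is the low four bits of the opcode, with the lowest bit inverting the test. The sketch below evaluates it with a plain C switch. It only illustrates what is being selected; it is not the fastop dispatch itself, and `sketch_test_cc()` and the `FLAG_*` names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* EFLAGS bit positions as defined by the x86 architecture. */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_ZF (1u << 6)
#define FLAG_SF (1u << 7)
#define FLAG_OF (1u << 11)

/* Evaluate the condition encoded in the low four bits of the opcode.
 * Bits 3..1 pick the base condition, bit 0 negates it. */
static bool sketch_test_cc(unsigned int opcode, uint32_t eflags)
{
    bool sf = eflags & FLAG_SF;
    bool of = eflags & FLAG_OF;
    bool rc;

    switch ((opcode & 0xf) >> 1) {
    case 0:  rc = eflags & FLAG_OF; break;                  /* O  / NO */
    case 1:  rc = eflags & FLAG_CF; break;                  /* B  / AE */
    case 2:  rc = eflags & FLAG_ZF; break;                  /* E  / NE */
    case 3:  rc = eflags & (FLAG_CF | FLAG_ZF); break;      /* BE / A  */
    case 4:  rc = eflags & FLAG_SF; break;                  /* S  / NS */
    case 5:  rc = eflags & FLAG_PF; break;                  /* P  / NP */
    case 6:  rc = sf != of; break;                          /* L  / GE */
    default: rc = (sf != of) || (eflags & FLAG_ZF); break;  /* LE / G  */
    }
    return rc ^ (opcode & 1);
}
```

For example, opcode byte 0x94 (SETE) tests ZF directly, while 0x95 (SETNE) uses the same base condition with the result inverted.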
2013-01-23KVM: x86 emulator: convert shift/rotate instructions to fastopAvi Kivity1-41/+31
SHL, SHR, ROL, ROR, RCL, RCR, SAR, SAL Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-23KVM: x86 emulator: Convert SHLD, SHRD to fastopAvi Kivity1-12/+21
Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-21KVM: x86: improve reexecute_instructionXiao Guangrong3-11/+45
The current reexecute_instruction cannot reliably detect failed instruction emulation: it lets the guest retry every instruction unless the access hits an error pfn. One case it misses is nested write protection, where the page we want to write is used as a PDE but chains to itself. In this case we should stop the emulation and report it to userspace. Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-21KVM: x86: let reexecute_instruction work for tdpXiao Guangrong1-18/+43
Currently, reexecute_instruction refuses to retry any instruction if tdp is enabled. If nested npt is used, the emulation may be caused by a shadow page, and that can be fixed by dropping the shadow page. The only condition under which tdp cannot retry the instruction is an access fault on an error pfn. Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-21KVM: x86: clean up reexecute_instructionXiao Guangrong1-7/+6
Little cleanup for reexecute_instruction, also use gpa_to_gfn in retry_instruction Reviewed-by: Gleb Natapov <[email protected]> Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-17KVM: set_memory_region: Remove unnecessary variable memslotTakuya Yoshikawa1-6/+5
One such variable, slot, is enough for holding a pointer temporarily. We also remove another local variable named slot, which is limited in a block, since it is confusing to have the same name in this function. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-17KVM: set_memory_region: Don't check for overlaps unless we create or move a slotTakuya Yoshikawa1-8/+10
The check is not needed when deleting an existing slot or just modifying the flags. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
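The idea of this commit can be sketched outside the kernel as two small helpers: one decides whether a change needs an overlap check at all, the other performs the range test. All names here (`slot_sketch`, `change_sketch`, the `CHANGE_*` values) are hypothetical and chosen only for illustration; the commit itself does not define them.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical memory slot: a range of guest frame numbers. */
struct slot_sketch {
    uint64_t base_gfn;
    uint64_t npages;
};

/* Hypothetical classification of what a set_memory_region call does. */
enum change_sketch {
    CHANGE_CREATE,
    CHANGE_MOVE,
    CHANGE_DELETE,
    CHANGE_FLAGS_ONLY
};

/* Only creating or moving a slot can introduce a new overlap. */
static bool sketch_needs_overlap_check(enum change_sketch change)
{
    return change == CHANGE_CREATE || change == CHANGE_MOVE;
}

/* Half-open ranges [base, base + npages) overlap iff each one starts
 * before the other one ends. */
static bool sketch_slots_overlap(const struct slot_sketch *a,
                                 const struct slot_sketch *b)
{
    return a->base_gfn < b->base_gfn + b->npages &&
           b->base_gfn < a->base_gfn + a->npages;
}
```

A caller would run `sketch_slots_overlap()` against the existing slots only when `sketch_needs_overlap_check()` returns true, skipping the loop for deletions and flag-only updates.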
2013-01-17KVM: set_memory_region: Don't jump to out_free unnecessarilyTakuya Yoshikawa1-4/+3
This makes the separation between the sanity checks and the rest of the code a bit clearer. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-17KVM: s390: kvm/sigp.c: fix memory leakageCong Ding1-1/+3
The variable inti should be freed in the CPUSTAT_STOPPED branch. Signed-off-by: Cong Ding <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: MMU: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long timeTakuya Yoshikawa1-0/+5
If userspace starts dirty logging for a large slot, say 64GB of memory, kvm_mmu_slot_remove_write_access() needs to hold mmu_lock for a long time, on the order of tens of milliseconds. This patch controls the lock hold time by asking the scheduler if we need to reschedule for others. One penalty for this is that we need to flush TLBs before releasing mmu_lock. But since holding mmu_lock for a long time affects not only the guest (the vCPU threads, in other words) but also the host as a whole, we should pay for that. In practice, the cost will not be so high because we can protect a fair amount of memory before being rescheduled: in my test environment, cond_resched_lock() was called only once for protecting 12GB of memory even without THP. We can also revisit Avi's "unlocked TLB flush" work later for completely suppressing extra TLB flushes if needed. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
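The locking pattern described in this commit (walk the slot under the lock, and when the scheduler wants the CPU, flush and briefly drop the lock) can be modelled in userspace as below. This is only a sketch: `lock_sketch`, `sketch_write_protect_slot()` and the sample callback are hypothetical, and the "lock" is a plain flag with counters rather than a real spinlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping that stands in for mmu_lock and TLB flushes. */
struct lock_sketch {
    bool         held;
    unsigned int flushes;   /* flushes issued */
    unsigned int releases;  /* times the lock was dropped mid-walk */
};

/* Sample policy for illustration: ask for a reschedule every 1000 pages. */
static bool sketch_every_1000(size_t page)
{
    return page % 1000 == 999;
}

static void sketch_write_protect_slot(struct lock_sketch *lock, size_t npages,
                                      bool (*need_resched)(size_t page))
{
    assert(need_resched != NULL);

    lock->held = true;
    for (size_t i = 0; i < npages; i++) {
        /* ... write protect page i while holding the lock ... */
        if (need_resched(i)) {
            lock->flushes++;     /* flush before letting anyone else run */
            lock->held = false;  /* drop the lock so others can progress */
            lock->releases++;
            lock->held = true;   /* take it again and continue the walk */
        }
    }
    lock->flushes++;             /* final flush for the remaining pages */
    lock->held = false;
}
```

Each mid-walk release costs one extra flush, which is the penalty the commit message mentions; the walk still ends with a single final flush.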
2013-01-14KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itselfTakuya Yoshikawa2-4/+4
Better to place mmu_lock handling and TLB flushing code together since this is a self-contained function. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itselfTakuya Yoshikawa2-5/+8
No reason to make callers take mmu_lock since we do not need to protect kvm_mmu_change_mmu_pages() and kvm_mmu_slot_remove_write_access() together by mmu_lock in kvm_arch_commit_memory_region(): the former calls kvm_mmu_commit_zap_page() and flushes TLBs by itself. Note: we do not need to protect kvm->arch.n_requested_mmu_pages by mmu_lock as can be seen from the fact that it is read locklessly. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: Remove unused slot_bitmap from kvm_mmu_pageTakuya Yoshikawa3-22/+0
Not needed any more. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap basedTakuya Yoshikawa1-13/+15
This makes it possible to release mmu_lock and reschedule conditionally in a later patch. Although this may increase the time needed to protect the whole slot when we start dirty logging, the kernel should not allow the userspace to trigger something that will hold a spinlock for such a long time as tens of milliseconds: actually there is no limit since it is roughly proportional to the number of guest pages. Another point to note is that this patch removes the only user of slot_bitmap which will cause some problems when we increase the number of slots further. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: MMU: Remove unused parameter level from __rmap_write_protect()Takuya Yoshikawa1-3/+3
No longer need to care about the mapping level in this function. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14KVM: Write protect the updated slot only when dirty logging is enabledTakuya Yoshikawa2-2/+7
Calling kvm_mmu_slot_remove_write_access() for a deleted slot does nothing but search for non-existent mmu pages which have mappings to that deleted memory; this is safe but a waste of time. Since we want to make the function rmap based in a later patch, in a manner which makes it unsafe to be called for a deleted slot, we make the caller check that the slot is non-zero and being dirty logged. Reviewed-by: Marcelo Tosatti <[email protected]> Signed-off-by: Takuya Yoshikawa <[email protected]> Signed-off-by: Gleb Natapov <[email protected]>
2013-01-14Merge branch 'kvm-ppc-next' of https://github.com/agraf/linux-2.6 into queueGleb Natapov12-11/+153
2013-01-10KVM: trace: Fix exit decoding.Cornelia Huck1-1/+1
trace_kvm_userspace_exit has been missing the KVM_EXIT_WATCHDOG exit. CC: Bharat Bhushan <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-10KVM: MMU: fix infinite fault access retryXiao Guangrong2-10/+38
We have two issues in the current code:
- if the target gfn is used as its page table, the guest will refault and kvm will then use a small page size to map it. We need two #PFs to fix its shadow page table
- sometimes, say when an exception is triggered during a vm-exit caused by #PF (see handle_exception() in vmx.c), we remove all the shadow pages shadowed by the target gfn before going into the page fault path. This causes an infinite loop: delete shadow pages shadowed by the gfn -> try to use large page size to map the gfn -> retry the access -> ...
To fix these, we can adjust the page size early if the target gfn is used as a page table.
Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-10KVM: MMU: fix Dirty bit missed if CR0.WP = 0Xiao Guangrong2-39/+38
If the write-fault access is from supervisor mode and CR0.WP is not set on the vcpu, kvm fixes it by adjusting the pte access: it sets the W bit on the pte and clears the U bit. This is the one case where kvm can change the pte access from read-only to writable. Unfortunately, the pte access is also the access of the 'direct' shadow page table, meaning direct sp.role.access = pte_access, so we end up creating a writable spte entry on a read-only shadow page table. As a result, the Dirty bit is not tracked when two guest ptes point to the same large page. Note that there is no impact other than the Dirty bit, since cr0.wp is encoded into sp.role. It can be fixed by adjusting the pte access before establishing the shadow page table. After that, no mmu-specific code remains in the common function, so drop two parameters from set_spte. Signed-off-by: Xiao Guangrong <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
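The access adjustment this commit message describes (set W, clear U for a supervisor write fault with CR0.WP clear) is small enough to show as a standalone function. This is a sketch under assumptions: `sketch_adjust_pte_access()` and the `ACC_*` bit values are hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical access bits, chosen for illustration only. */
#define ACC_WRITE 0x2u
#define ACC_USER  0x4u

/* A supervisor write with CR0.WP clear may write through a read-only
 * pte: grant W, and drop U so user mode cannot use the writable mapping. */
static unsigned int sketch_adjust_pte_access(unsigned int pte_access,
                                             bool write_fault,
                                             bool user_fault,
                                             bool cr0_wp)
{
    if (write_fault && !user_fault && !cr0_wp) {
        pte_access |= ACC_WRITE;
        pte_access &= ~ACC_USER;
    }
    return pte_access;
}
```

The point of the fix is ordering: the adjusted value has to be computed before the shadow page table is established, so that the access recorded for the shadow page and the access of the spte agree.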
2013-01-10KVM: PPC: BookE: Add EPR ONE_REG syncAlexander Graf3-1/+27
We need to be able to read and write the contents of the EPR register from user space. This patch implements that logic through the ONE_REG API and declares its (never implemented) SREGS counterpart as deprecated. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: BookE: Implement EPR exitAlexander Graf7-3/+79
The External Proxy Facility in FSL BookE chips allows the interrupt controller to automatically acknowledge an interrupt as soon as a core gets its pending external interrupt delivered. Today, user space implements the interrupt controller, so we need to check on it during such a cycle. This patch implements logic for user space to enable EPR exiting, disable EPR exiting and EPR exiting itself, so that user space can acknowledge an interrupt when an external interrupt has successfully been delivered into the guest vcpu. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: BookE: Emulate mfspr on EPRAlexander Graf1-0/+3
The EPR register is potentially valid for PR KVM as well, so we need to emulate accesses to it. It's only defined for reading, so only handle the mfspr case. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: BookE: Allow irq deliveries to inject requestsAlexander Graf1-0/+5
When injecting an interrupt into guest context, we usually don't need to check for requests anymore. At least not until today. With the introduction of EPR, we will have to create a request when the guest has successfully accepted an external interrupt though. So we need to prepare the interrupt delivery to abort guest entry gracefully. Otherwise we'd delay the EPR request. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: Fix mfspr/mtspr MMUCFG emulationMihai Caraman2-5/+2
On the mfspr/mtspr emulation path, Book3E's MMUCFG SPR with value 1015 clashes with G4's MSSSR0 SPR. Move MSSSR0 emulation from the generic part to Book3S. MSSSR0 also clashes with Book3S's DABRX SPR. DABRX was not explicitly handled, so the Book3S execution flow will behave as before. Signed-off-by: Mihai Caraman <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: Book3S: PR: Enable alternative instruction for SC 1Alexander Graf3-0/+34
When running on top of pHyp, the hypercall instruction "sc 1" goes straight into pHyp without trapping in supervisor mode. So if we want to support PAPR guest in this configuration we need to add a second way of accessing PAPR hypercalls, preferably with the exact same semantics except for the instruction. So let's overlay an officially reserved instruction and emulate PAPR hypercalls whenever we hit that one. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: Only WARN on invalid emulationAlexander Graf1-1/+2
When we hit an emulation result that we didn't expect, that is an error, but it's nothing that warrants a BUG(), because it can be guest triggered. So instead, let's only WARN() the user that this happened. Signed-off-by: Alexander Graf <[email protected]>
2013-01-10KVM: PPC: Fix SREGS documentation referenceMihai Caraman1-1/+1
Reflect the uapi folder change in SREGS API documentation. Signed-off-by: Mihai Caraman <[email protected]> Reviewed-by: Amos Kong <[email protected]> Signed-off-by: Alexander Graf <[email protected]>
2013-01-09KVM: s390: Gracefully handle busy conditions on ccw_device_startChristian Borntraeger1-5/+8
In rare cases a virtio command might try to issue a ccw before a former ccw was answered with a tsch. This will cause CC=2 (busy). Let's just retry in that case. Signed-off-by: Christian Borntraeger <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
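The retry-on-busy idea can be sketched generically as below. Two assumptions are made here that the commit message does not state: the busy condition is modelled as a `-EBUSY` return value, and the number of attempts is bounded by a `max_tries` parameter. `sketch_start_with_retry()` and `sketch_busy_twice()` are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Sample "device" for illustration: reports busy on the first two calls,
 * then succeeds. ctx points at a call counter. */
static int sketch_busy_twice(void *ctx)
{
    int *calls = ctx;

    return ++*calls <= 2 ? -EBUSY : 0;
}

/* Call start() again while it reports busy, up to max_tries attempts.
 * Any other result, success or a different error, ends the loop. */
static int sketch_start_with_retry(int (*start)(void *ctx), void *ctx,
                                   int max_tries)
{
    int ret = -EBUSY;

    assert(start != NULL);
    for (int i = 0; i < max_tries && ret == -EBUSY; i++)
        ret = start(ctx);
    return ret;
}
```

Only the busy result is retried; other errors are passed back to the caller unchanged, so a genuine failure is not hidden by the loop.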
2013-01-09KVM: s390: Dynamic allocation of virtio-ccw I/O data.Cornelia Huck1-106/+174
Dynamically allocate any data structures like ccw used when doing channel I/O. Otherwise, we'd need to add extra serialization for the different callbacks using the same data structures. Reported-by: Christian Borntraeger <[email protected]> Signed-off-by: Cornelia Huck <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: convert basic ALU ops to fastopAvi Kivity1-78/+34
Opcodes: TEST CMP ADD ADC SUB SBB XOR OR AND Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: add macros for defining 2-operand fastop emulationAvi Kivity1-0/+12
Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: convert NOT, NEG to fastopAvi Kivity1-13/+4
Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: mark CMP, CMPS, SCAS, TEST as NoWriteAvi Kivity1-12/+8
Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: introduce NoWrite flagAvi Kivity1-0/+4
Instead of disabling writeback via OP_NONE, just specify NoWrite. Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: Support for declaring single operand fastopsAvi Kivity1-0/+25
Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
2013-01-09KVM: x86 emulator: framework for streamlining arithmetic opcodesAvi Kivity1-0/+41
We emulate arithmetic opcodes by executing a "similar" instruction (same operation, different operands) on the cpu. This ensures accurate emulation, esp. wrt. eflags. However, the prologue and epilogue around the opcode are fairly long, consisting of a switch (for the operand size) and code to load and save the operands. This is repeated for every opcode.

This patch introduces an alternative way to emulate arithmetic opcodes. Instead of the above, we have four (three on i386) functions consisting of just the opcode and a ret; one for each operand size. For example:

    .align 8
    em_notb:
        not %al
        ret

    .align 8
    em_notw:
        not %ax
        ret

    .align 8
    em_notl:
        not %eax
        ret

    .align 8
    em_notq:
        not %rax
        ret

The prologue and epilogue are shared across all opcodes. Note the functions use a special calling convention; notably eflags is an input/output parameter and is not clobbered.

Rather than dispatching the four functions through a jump table, the functions are declared as a constant size (8) so their address can be calculated.

Acked-by: Gleb Natapov <[email protected]> Signed-off-by: Avi Kivity <[email protected]> Signed-off-by: Marcelo Tosatti <[email protected]>
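The address calculation mentioned at the end of this commit message can be sketched as follows: since every stub occupies the same number of bytes and the stubs are laid out in order of operand size, the stub for a given size sits at a fixed multiple of that constant from the first one. `sketch_fastop_offset()` and `FASTOP_SIZE_SKETCH` are hypothetical names; only the constant 8 and the byte/word/long/quad ordering come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Each stub is padded to 8 bytes, as stated in the commit message. */
#define FASTOP_SIZE_SKETCH 8

/* Map an operand size in bytes (1, 2, 4 or 8) to the byte offset of its
 * stub relative to the first (byte-sized) stub of the group. */
static size_t sketch_fastop_offset(unsigned int operand_bytes)
{
    size_t index = 0;

    assert(operand_bytes == 1 || operand_bytes == 2 ||
           operand_bytes == 4 || operand_bytes == 8);

    /* index = log2(operand_bytes): 0 for byte, 1 for word, and so on. */
    while (operand_bytes > 1) {
        operand_bytes >>= 1;
        index++;
    }
    return index * FASTOP_SIZE_SKETCH;
}
```

In the example from the commit message, em_notb would be at offset 0, em_notw at 8, em_notl at 16 and em_notq at 24, so no jump table is needed to pick the right one.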