aboutsummaryrefslogtreecommitdiff
path: root/arch/powerpc/kvm
AgeCommit message (Collapse)AuthorFilesLines
2018-06-01KVM: PPC: Book3S PR: Enable kvmppc_get/set_one_reg_pr() for HTM registersSimon Guo1-0/+133
We need to migrate PR KVM during transaction and userspace will use kvmppc_get_one_reg_pr()/kvmppc_set_one_reg_pr() APIs to get/set transaction checkpoint state. This patch adds support for that. So far, QEMU on PR KVM doesn't fully function for migration but the savevm/loadvm can be done against a RHEL72 guest. During savevm/ loadvm procedure, the kvm ioctls will be invoked as well. Test has been performed to savevm/loadvm for a guest running a HTM test program: https://github.com/justdoitqd/publicFiles/blob/master/test-tm-mig.c Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S: Remove load/put vcpu for KVM_GET_REGS/KVM_SET_REGSSimon Guo1-6/+0
In both HV and PR KVM, the KVM_SET_REGS/KVM_GET_REGS ioctl should be able to perform without the vcpu loaded. Since the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_SET_REGS/KVM_GET_REGS don't need to do ioctl with the vcpu loaded anymore. This patch removes vcpu_load()/vcpu_put() from KVM_SET_REGS/KVM_GET_REGS ioctl. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Remove load/put vcpu for KVM_GET/SET_ONE_REG ioctlSimon Guo1-2/+0
Since the vcpu mutex locking/unlock has been moved out of vcpu_load() /vcpu_put(), KVM_GET_ONE_REG and KVM_SET_ONE_REG doesn't need to do ioctl with loading vcpu anymore. This patch removes vcpu_load()/vcpu_put() from KVM_GET_ONE_REG and KVM_SET_ONE_REG ioctl. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctlSimon Guo1-3/+6
Although we already have kvm_arch_vcpu_async_ioctl() which doesn't require ioctl to load vcpu, the sync ioctl code need to be cleaned up when CONFIG_HAVE_KVM_VCPU_ASYNC_IOCTL is not configured. This patch moves vcpu_load/vcpu_put down to each ioctl switch case so that each ioctl can decide to do vcpu_load/vcpu_put or not independently. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Enable HTM for PR KVM for KVM_CHECK_EXTENSION ioctlSimon Guo1-3/+2
With current patch set, PR KVM now supports HTM. So this patch turns it on for PR KVM. Tested with: https://github.com/justdoitqd/publicFiles/blob/master/test_kvm_htm_cap.c Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Support TAR handling for PR KVM HTMSimon Guo3-7/+36
Currently guest kernel doesn't handle TAR facility unavailable and it always runs with TAR bit on. PR KVM will lazily enable TAR. TAR is not a frequent-use register and it is not included in SVCPU struct. Due to the above, the checkpointed TAR val might be a bogus TAR val. To solve this issue, we will make vcpu->arch.fscr tar bit consistent with shadow_fscr when TM is enabled. At the end of emulating treclaim., the correct TAR val need to be loaded into the register if FSCR_TAR bit is on. At the beginning of emulating trechkpt., TAR needs to be flushed so that the right tar val can be copied into tar_tm. Tested with: tools/testing/selftests/powerpc/tm/tm-tar tools/testing/selftests/powerpc/ptrace/ptrace-tm-tar (remove DSCR/PPR related testing). Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add guard code to prevent returning to guest with PR=0 ↵Simon Guo3-2/+19
and Transactional state Currently PR KVM doesn't support transaction memory in guest privileged state. This patch adds a check at setting guest msr, so that we can never return to guest with PR=0 and TS=0b10. A tabort will be emulated to indicate this and fail transaction immediately. [[email protected] - don't change the TM_CAUSE_MISC definition, instead use TM_CAUSE_KVM_FAC_UNAV.] Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add emulation for tabort. in privileged stateSimon Guo1-0/+68
Currently privileged-state guest will be run with TM disabled. Although the privileged-state guest cannot initiate a new transaction, it can use tabort to terminate its problem state's transaction. So it is still necessary to emulate tabort. for privileged-state guest. Tested with: https://github.com/justdoitqd/publicFiles/blob/master/test_tabort.c Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add emulation for trechkpt.Simon Guo2-1/+62
This patch adds host emulation when guest PR KVM executes "trechkpt.", which is a privileged instruction and will trap into host. We firstly copy vcpu ongoing content into vcpu tm checkpoint content, then perform kvmppc_restore_tm_pr() to do trechkpt. with updated vcpu tm checkpoint values. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add emulation for treclaim.Simon Guo1-0/+76
This patch adds support for "treclaim." emulation when PR KVM guest executes treclaim. and traps to host. We will firstly do treclaim. and save TM checkpoint. Then it is necessary to update vcpu current reg content with checkpointed vals. When rfid into guest again, those vcpu current reg content (now the checkpoint vals) will be loaded into regs. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Restore NV regs after emulating mfspr from TM SPRsSimon Guo1-2/+13
Currently kvmppc_handle_fac() will not update NV GPRs and thus it can return with GUEST_RESUME. However PR KVM guest always disables MSR_TM bit in privileged state. If PR privileged-state guest is trying to read TM SPRs, it will trigger TM facility unavailable exception and fall into kvmppc_handle_fac(). Then the emulation will be done by kvmppc_core_emulate_mfspr_pr(). The mfspr instruction can include a RT with NV reg. So it is necessary to restore NV GPRs at this case, to reflect the update to NV RT. This patch make kvmppc_handle_fac() return GUEST_RESUME_NV for TM facility unavailable exceptions in guest privileged state. Signed-off-by: Simon Guo <[email protected]> Reviewed-by: Paul Mackerras <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Always fail transactions in guest privileged stateSimon Guo2-1/+50
Currently the kernel doesn't use transaction memory. And there is an issue for privileged state in the guest that: tbegin/tsuspend/tresume/tabort TM instructions can impact MSR TM bits without trapping into the PR host. So following code will lead to a false mfmsr result: tbegin <- MSR bits update to Transaction active. beq <- failover handler branch mfmsr <- still read MSR bits from magic page with transaction inactive. It is not an issue for non-privileged guest state since its mfmsr is not patched with magic page and will always trap into the PR host. This patch will always fail tbegin attempt for privileged state in the guest, so that the above issue is prevented. It is benign since currently (guest) kernel doesn't initiate a transaction. Test case: https://github.com/justdoitqd/publicFiles/blob/master/test_tbegin_pr.c Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Emulate mtspr/mfspr using active TM SPRsSimon Guo2-11/+49
The mfspr/mtspr on TM SPRs(TEXASR/TFIAR/TFHAR) are non-privileged instructions and can be executed by PR KVM guest in problem state without trapping into the host. We only emulate mtspr/mfspr texasr/tfiar/tfhar in guest PR=0 state. When we are emulating mtspr tm sprs in guest PR=0 state, the emulation result needs to be visible to guest PR=1 state. That is, the actual TM SPR val should be loaded into actual registers. We already flush TM SPRs into vcpu when switching out of CPU, and load TM SPRs when switching back. This patch corrects mfspr()/mtspr() emulation for TM SPRs to make the actual source/dest be the actual TM SPRs. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add math support for PR KVM HTMSimon Guo1-0/+35
The math registers will be saved into vcpu->arch.fp/vr and corresponding vcpu->arch.fp_tm/vr_tm area. We flush or giveup the math regs into vcpu->arch.fp/vr before saving transaction. After transaction is restored, the math regs will be loaded back into regs. If there is a FP/VEC/VSX unavailable exception during transaction active state, the math checkpoint content might be incorrect and we need to do treclaim./load the correct checkpoint val/trechkpt. sequence to retry the transaction. That will make our solution complicated. To solve this issue, we always make the hardware guest MSR math bits (shadow_msr) consistent with the MSR val which guest sees (kvmppc_get_msr()) when guest msr is with tm enabled. Then all FP/VEC/VSX unavailable exception can be delivered to guest and guest handles the exception by itself. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add transaction memory save/restore skeletonSimon Guo1-0/+27
The transaction memory checkpoint area save/restore behavior is triggered when VCPU qemu process is switching out/into CPU, i.e. at kvmppc_core_vcpu_put_pr() and kvmppc_core_vcpu_load_pr(). MSR TM active state is determined by TS bits: active: 10(transactional) or 01 (suspended) inactive: 00 (non-transactional) We don't "fake" TM functionality for guest. We "sync" guest virtual MSR TM active state(10 or 01) with shadow MSR. That is to say, we don't emulate a transactional guest with a TM inactive MSR. TM SPR support(TFIAR/TFAR/TEXASR) has already been supported by commit 9916d57e64a4 ("KVM: PPC: Book3S PR: Expose TM registers"). Math register support (FPR/VMX/VSX) will be done at subsequent patch. Whether TM context need to be saved/restored can be determined by kvmppc_get_msr() TM active state: * TM active - save/restore TM context * TM inactive - no need to do so and only save/restore TM SPRs. Signed-off-by: Simon Guo <[email protected]> Suggested-by: Paul Mackerras <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add kvmppc_save/restore_tm_sprs() APIsSimon Guo1-0/+22
This patch adds 2 new APIs, kvmppc_save_tm_sprs() and kvmppc_restore_tm_sprs(), for the purpose of TEXASR/TFIAR/TFHAR save/restore. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add new kvmppc_copyto/from_vcpu_tm APIsSimon Guo1-0/+41
This patch adds 2 new APIs: kvmppc_copyto_vcpu_tm() and kvmppc_copyfrom_vcpu_tm(). These 2 APIs will be used to copy from/to TM data between VCPU_TM/VCPU area. PR KVM will use these APIs for treclaim. or trechkpt. emulation. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Avoid changing TS bits when exiting guestSimon Guo1-0/+13
PR KVM host usually runs with TM enabled in its host MSR value, and with non-transactional TS value. When a guest with TM active traps into PR KVM host, the rfid at the tail of kvmppc_interrupt_pr() will try to switch TS bits from S0 (Suspended & TM disabled) to N1 (Non-transactional & TM enabled). That will leads to TM Bad Thing interrupt. This patch manually sets target TS bits unchanged to avoid this exception. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Implement RFID TM behavior to suppress change from S0 to N0Simon Guo1-2/+19
According to ISA specification for RFID, in MSR TM disabled and TS suspended state (S0), if the target MSR is TM disabled and TS state is inactive (N0), rfid should suppress this update. This patch makes the RFID emulation of PR KVM consistent with this. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Sync TM bits to shadow msr for problem state guestSimon Guo1-23/+50
MSR TS bits can be modified with non-privileged instruction such as tbegin./tend. That means guest can change MSR value "silently" without notifying host. It is necessary to sync the TM bits to host so that host can calculate shadow msr correctly. Note, privileged mode in the guest will always fail transactions so we only take care of problem state mode in the guest. The logic is put into kvmppc_copy_from_svcpu() so that kvmppc_handle_exit_pr() can use correct MSR TM bits even when preemption occurs. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Pass through MSR TM and TS bits to shadow_msrSimon Guo1-0/+5
PowerPC TM functionality needs MSR TM/TS bits support in hardware level. Guest TM functionality can not be emulated with "fake" MSR (msr in magic page) TS bits. This patch syncs TM/TS bits in shadow_msr with the MSR value in magic page, so that the MSR TS value which guest sees is consistent with actual MSR bits running in guest. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Transition to Suspended state when injecting interruptSimon Guo1-1/+10
This patch simulates interrupt behavior per Power ISA while injecting interrupt in PR KVM: - When interrupt happens, transactional state should be suspended. kvmppc_mmu_book3s_64_reset_msr() will be invoked when injecting an interrupt. This patch performs this ISA logic in kvmppc_mmu_book3s_64_reset_msr(). Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add C function wrapper for _kvmppc_save/restore_tm()Simon Guo2-5/+95
Currently __kvmppc_save/restore_tm() APIs can only be invoked from assembly function. This patch adds C function wrappers for them so that they can be safely called from C function. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Turn on FP/VSX/VMX MSR bits in kvmppc_save_tm()Simon Guo1-0/+2
kvmppc_save_tm() invokes store_fp_state/store_vr_state(). So it is mandatory to turn on FP/VSX/VMX MSR bits for its execution, just like what kvmppc_restore_tm() did. Previously HV KVM has turned the bits on outside of function kvmppc_save_tm(). Now we include this bit change in kvmppc_save_tm() so that the logic is cleaner. And PR KVM can reuse it later. Signed-off-by: Simon Guo <[email protected]> Reviewed-by: Paul Mackerras <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-06-01KVM: PPC: Book3S PR: Add guest MSR parameter for ↵Simon Guo2-47/+57
kvmppc_save_tm()/kvmppc_restore_tm() HV KVM and PR KVM need different MSR source to indicate whether treclaim. or trecheckpoint. is necessary. This patch add new parameter (guest MSR) for these kvmppc_save_tm/ kvmppc_restore_tm() APIs: - For HV KVM, it is VCPU_MSR - For PR KVM, it is current host MSR or VCPU_SHADOW_SRR1 This enhancement enables these 2 APIs to be reused by PR KVM later. And the patch keeps HV KVM logic unchanged. This patch also reworks kvmppc_save_tm()/kvmppc_restore_tm() to have a clean ABI: r3 for vcpu and r4 for guest_msr. During kvmppc_save_tm/kvmppc_restore_tm(), the R1 need to be saved or restored. Currently the R1 is saved into HSTATE_HOST_R1. In PR KVM, we are going to add a C function wrapper for kvmppc_save_tm/kvmppc_restore_tm() where the R1 will be incremented with added stackframe and save into HSTATE_HOST_R1. There are several places in HV KVM to load HSTATE_HOST_R1 as R1, and we don't want to bring risk or confusion by TM code. This patch will use HSTATE_SCRATCH2 to save/restore R1 in kvmppc_save_tm/kvmppc_restore_tm() to avoid future confusion, since the r1 is actually a temporary/scratch value to be saved/stored. [[email protected] - rebased on top of 7b0e827c6970 ("KVM: PPC: Book3S HV: Factor fake-suspend handling out of kvmppc_save/restore_tm", 2018-05-30)] Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-31KVM: PPC: Book3S PR: Move kvmppc_save_tm/kvmppc_restore_tm to separate fileSimon Guo3-231/+282
It is a simple patch just for moving kvmppc_save_tm/kvmppc_restore_tm() functionalities to tm.S. There is no logic change. The reconstruct of those APIs will be done in later patches to improve readability. It is for preparation of reusing those APIs on both HV/PR PPC KVM. Some slight change during move the functions includes: - surrounds some HV KVM specific code with CONFIG_KVM_BOOK3S_HV_POSSIBLE for compilation. - use _GLOBAL() to define kvmppc_save_tm/kvmppc_restore_tm() [[email protected] - rebased on top of 7b0e827c6970 ("KVM: PPC: Book3S HV: Factor fake-suspend handling out of kvmppc_save/restore_tm", 2018-05-30)] Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-31KVM: PPC: Book3S HV: Factor fake-suspend handling out of kvmppc_save/restore_tmPaul Mackerras1-69/+126
This splits out the handling of "fake suspend" mode, part of the hypervisor TM assist code for POWER9, and puts almost all of it in new kvmppc_save_tm_hv and kvmppc_restore_tm_hv functions. The new functions branch to kvmppc_save/restore_tm if the CPU does not require hypervisor TM assistance. With this, it will be more straightforward to move kvmppc_save_tm and kvmppc_restore_tm to another file and use them for transactional memory support in PR KVM. Additionally, it also makes the code a bit clearer and reduces the number of feature sections. Signed-off-by: Paul Mackerras <[email protected]>
2018-05-31KVM: PPC: Book3S PR: Allow KVM_PPC_CONFIGURE_V3_MMU to succeedPaul Mackerras1-0/+12
Currently, PR KVM does not implement the configure_mmu operation, and so the KVM_PPC_CONFIGURE_V3_MMU ioctl always fails with an EINVAL error. This causes recent kernels to fail to boot as a PR KVM guest on POWER9, since recent kernels booted in HPT mode do the H_REGISTER_PROC_TBL hypercall, which causes userspace (QEMU) to do KVM_PPC_CONFIGURE_V3_MMU, which fails. This implements a minimal configure_mmu operation for PR KVM. It succeeds only if the MMU is being configured for HPT mode and no process table is being registered. This is enough to get recent kernels to boot as a PR KVM guest. Reviewed-by: Greg Kurz <[email protected]> Tested-by: Greg Kurz <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Reimplement LOAD_VMX/STORE_VMX instruction mmio emulation with ↵Simon Guo2-88/+295
analyse_instr() input This patch reimplements LOAD_VMX/STORE_VMX MMIO emulation with analyse_instr() input. When emulating the store, the VMX reg will need to be flushed so that the right reg val can be retrieved before writing to IO MEM. This patch also adds support for lvebx/lvehx/lvewx/stvebx/stvehx/stvewx MMIO emulation. To meet the requirement of handling different element sizes, kvmppc_handle_load128_by2x64()/kvmppc_handle_store128_by2x64() were replaced with kvmppc_handle_vmx_load()/kvmppc_handle_vmx_store(). The framework used is similar to VSX instruction MMIO emulation. Suggested-by: Paul Mackerras <[email protected]> Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Expand mmio_vsx_copy_type to cover VMX load/store element typesSimon Guo2-12/+12
VSX MMIO emulation uses mmio_vsx_copy_type to represent VSX emulated element size/type, such as KVMPPC_VSX_COPY_DWORD_LOAD, etc. This patch expands mmio_vsx_copy_type to cover VMX copy type, such as KVMPPC_VMX_COPY_BYTE(stvebx/lvebx), etc. As a result, mmio_vsx_copy_type is also renamed to mmio_copy_type. It is a preparation for reimplementing VMX MMIO emulation. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Reimplement LOAD_VSX/STORE_VSX instruction mmio emulation with ↵Simon Guo1-136/+91
analyse_instr() input This patch reimplements LOAD_VSX/STORE_VSX instruction MMIO emulation with analyse_instr() input. It utilizes VSX_FPCONV/VSX_SPLAT/SIGNEXT exported by analyse_instr() and handle accordingly. When emulating VSX store, the VSX reg will need to be flushed so that the right reg val can be retrieved before writing to IO MEM. [[email protected] - mask the register number to 5 bits.] Suggested-by: Paul Mackerras <[email protected]> Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Reimplement LOAD_FP/STORE_FP instruction mmio emulation with ↵Simon Guo1-157/+44
analyse_instr() input This patch reimplements LOAD_FP/STORE_FP instruction MMIO emulation with analyse_instr() input. It utilizes the FPCONV/UPDATE properties exported by analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store() accordingly. For FP store MMIO emulation, the FP regs need to be flushed firstly so that the right FP reg vals can be read from vcpu->arch.fpr, which will be stored into MMIO data. Suggested-by: Paul Mackerras <[email protected]> Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Add giveup_ext() hook to PPC KVM opsSimon Guo2-0/+10
Currently HV will save math regs(FP/VEC/VSX) when trap into host. But PR KVM will only save math regs when qemu task switch out of CPU, or when returning from qemu code. To emulate FP/VEC/VSX mmio load, PR KVM need to make sure that math regs were flushed firstly and then be able to update saved VCPU FPR/VEC/VSX area reasonably. This patch adds giveup_ext() field to KVM ops. Only PR KVM has non-NULL giveup_ext() ops. kvmppc_complete_mmio_load() can invoke that hook (when not NULL) to flush math regs accordingly, before updating saved register vals. Math regs flush is also necessary for STORE, which will be covered in later patch within this patch series. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Reimplement non-SIMD LOAD/STORE instruction mmio emulation with ↵Simon Guo3-235/+48
analyse_instr() input This patch reimplements non-SIMD LOAD/STORE instruction MMIO emulation with analyse_instr() input. It utilizes the BYTEREV/UPDATE/SIGNEXT properties exported by analyse_instr() and invokes kvmppc_handle_load(s)/kvmppc_handle_store() accordingly. It also moves CACHEOP type handling into the skeleton. instruction_type within kvm_ppc.h is renamed to avoid conflict with sstep.h. Suggested-by: Paul Mackerras <[email protected]> Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-22KVM: PPC: Add KVMPPC_VSX_COPY_WORD_LOAD_DUMP type support for mmio emulationSimon Guo1-0/+23
Some VSX instructions like lxvwsx will splat word into VSR. This patch adds a new VSX copy type KVMPPC_VSX_COPY_WORD_LOAD_DUMP to support this. Signed-off-by: Simon Guo <[email protected]> Reviewed-by: Paul Mackerras <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S PR: Enable use on POWER9 inside HPT-mode guestsPaul Mackerras1-2/+9
This relaxes the restriction on using PR KVM on POWER9. The existing code does work inside a guest partition running in HPT mode, because hypercalls such as H_ENTER use the old HPTE format, not the new format used by POWER9, and so no change to PR KVM's HPT manipulation code is required. PR KVM will still refuse to run if the kernel is using radix translation or if it is running bare-metal. Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Send kvmppc_bad_interrupt NMIs to Linux handlersNicholas Piggin1-1/+14
It's possible to take a SRESET or MCE in these paths due to a bug in the host code or a NMI IPI, etc. A recent bug attempting to load a virtual address from real mode gave th complete but cryptic error, abridged: Oops: Bad interrupt in KVM entry/exit code, sig: 6 [#1] LE SMP NR_CPUS=2048 NUMA PowerNV CPU: 53 PID: 6582 Comm: qemu-system-ppc Not tainted NIP: c0000000000155ac LR: c0000000000c2430 CTR: c000000000015580 REGS: c000000fff76dd80 TRAP: 0200 Not tainted MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 48082222 XER: 00000000 CFAR: 0000000102900ef0 DAR: d00017fffd941a28 DSISR: 00000040 SOFTE: 3 NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 LR [c0000000000c2430] do_tlbies+0x230/0x2f0 Sending the NMIs through the Linux handlers gives a nicer output: Severe Machine check interrupt [Not recovered] NIP [c0000000000155ac]: perf_trace_tlbie+0x2c/0x1a0 Initiator: CPU Error type: Real address [Load (bad)] Effective address: d00017fffcc01a28 opal: Machine check interrupt unrecoverable: MSR(RI=0) opal: Hardware platform error: Unrecoverable Machine Check exception CPU: 0 PID: 6700 Comm: qemu-system-ppc Tainted: G M NIP: c0000000000155ac LR: c0000000000c23c0 CTR: c000000000015580 REGS: c000000fff9e9d80 TRAP: 0200 Tainted: G M MSR: 9000000000201001 <SF,HV,ME,LE> CR: 48082222 XER: 00000000 CFAR: 000000010cbc1a30 DAR: d00017fffcc01a28 DSISR: 00000040 SOFTE: 3 NIP [c0000000000155ac] perf_trace_tlbie+0x2c/0x1a0 LR [c0000000000c23c0] do_tlbies+0x1c0/0x280 Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Fix kvmppc_bad_host_intr for real mode interruptsNicholas Piggin1-0/+2
When CONFIG_RELOCATABLE=n, the Linux real mode interrupt handlers call into KVM using real address. This needs to be translated to the kernel linear effective address before the MMU is switched on. kvmppc_bad_host_intr misses adding these bits, so when it is used to handle a system reset interrupt (that always gets delivered in real mode), it results in an instruction access fault immediately after the MMU is turned on. Fix this by ensuring the top 2 address bits are set when the MMU is turned on. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: radix: Do not clear partition PTE when RC or write bits ↵Nicholas Piggin1-21/+47
do not match Adding the write bit and RC bits to pte permissions does not require a pte clear and flush. There should not be other bits changed here, because restricting access or changing the PFN must have already invalidated any existing ptes (otherwise the race is already lost). Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: radix: Refine IO region partition scope attributesNicholas Piggin1-3/+7
When the radix fault handler has no page from the process address space (e.g., for IO memory), it looks up the process pte and sets partition table pte using that to get attributes like CI and guarded. If the process table entry is to be writable, set _PAGE_DIRTY as well to avoid an RC update. If not, then ensure _PAGE_DIRTY does not come across. Set _PAGE_ACCESSED as well to avoid RC update. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Make radix handle process scoped LPID flush in C, with ↵Nicholas Piggin2-7/+32
relocation on The radix guest code can has fewer restrictions about what context it can run in, so move this flushing out of assembly and have it use the Linux TLB flush implementations introduced previously. This allows powerpc:tlbie trace events to be used. This changes the tlbiel sequence to only execute RIC=2 flush once on the first set flushed, then RIC=0 for the rest of the sets. The end result of the flush should be unchanged. This matches the local PID flush pattern that was introduced in a5998fcb92 ("powerpc/mm/radix: Optimise tlbiel flush all case"). Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Make radix use the Linux translation flush functions ↵Nicholas Piggin1-28/+8
for partition scope This has the advantage of consolidating TLB flush code in fewer places, and it also implements powerpc:tlbie trace events. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Recursively unmap all page table entries when unmappingNicholas Piggin1-54/+138
When partition scope mappings are unmapped with kvm_unmap_radix, the pte is cleared, but the page table structure is left in place. If the next page fault requests a different page table geometry (e.g., due to THP promotion or split), kvmppc_create_pte is responsible for changing the page tables. When a page table entry is to be converted to a large pte, the page table entry is cleared, the PWC flushed, then the page table it points to freed. This will cause pte page tables to leak when a 1GB page is to replace a pud entry points to a pmd table with pte tables under it: The pmd table will be freed, but its pte tables will be missed. Fix this by replacing the simple clear and free code with one that walks down the page tables and frees children. Care must be taken to clear the root entry being unmapped then flushing the PWC before freeing any page tables, as explained in comments. This requires PWC flush to logically become a flush-all-PWC (which it already is in hardware, but the KVM API needs to be changed to avoid confusion). This code also checks that no unexpected pte entries exist in any page table being freed, and unmaps those and emits a WARN. This is an expensive operation for the pte page level, but partition scope changes are rare, so it's unconditional for now to iron out bugs. It can be put under a CONFIG option or removed after some time. Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Use a helper to unmap ptes in the radix fault pathNicholas Piggin1-23/+23
Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Book3S HV: Lockless tlbie for HPT hcallsNicholas Piggin2-21/+3
tlbies to an LPAR do not have to be serialised since POWER4/PPC970, after which the MMU_FTR_LOCKLESS_TLBIE feature was introduced to avoid tlbie locking. Since commit c17b98cf6028 ("KVM: PPC: Book3S HV: Remove code for PPC970 processors"), KVM no longer supports processors that do not have this feature, so the tlbie locking can be removed completely. A sanity check for the feature is put in kvmppc_mmu_hv_init. Testing was done on a POWER9 system in HPT mode, with a -smp 32 guest in HPT mode. 32 instances of the powerpc fork benchmark from selftests were run with --fork, and the results measured. Without this patch, total throughput was about 13.5K/sec, and this is the top of the host profile: 74.52% [k] do_tlbies 2.95% [k] kvmppc_book3s_hv_page_fault 1.80% [k] calc_checksum 1.80% [k] kvmppc_vcpu_run_hv 1.49% [k] kvmppc_run_core After this patch, throughput was about 51K/sec, with this profile: 21.28% [k] do_tlbies 5.26% [k] kvmppc_run_core 4.88% [k] kvmppc_book3s_hv_page_fault 3.30% [k] _raw_spin_lock_irqsave 3.25% [k] gup_pgd_range Signed-off-by: Nicholas Piggin <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Fix a mmio_host_swabbed uninitialized usage issueSimon Guo2-1/+2
When KVM emulates VMX store, it will invoke kvmppc_get_vmx_data() to retrieve VMX reg val. kvmppc_get_vmx_data() will check mmio_host_swabbed to decide which double word of vr[] to be used. But the mmio_host_swabbed can be uninitialized during VMX store procedure: kvmppc_emulate_loadstore \- kvmppc_handle_store128_by2x64 \- kvmppc_get_vmx_data So vcpu->arch.mmio_host_swabbed is not meant to be used at all for emulation of store instructions, and this patch makes that true for VMX stores. This patch also initializes mmio_host_swabbed to avoid possible future problems. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Move nip/ctr/lr/xer registers to pt_regs in kvm_vcpu_archSimon Guo9-46/+49
This patch moves nip/ctr/lr/xer registers from scattered places in kvm_vcpu_arch to pt_regs structure. cr register is "unsigned long" in pt_regs and u32 in vcpu->arch. It will need more consideration and may move in later patches. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18KVM: PPC: Add pt_regs into kvm_vcpu_arch and move vcpu->arch.gpr[] into itSimon Guo7-44/+45
Current regs are scattered at kvm_vcpu_arch structure and it will be more neat to organize them into pt_regs structure. Also it will enable reimplementation of MMIO emulation code with analyse_instr() later. Signed-off-by: Simon Guo <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>
2018-05-18Merge remote-tracking branch 'remotes/powerpc/topic/ppc-kvm' into kvm-ppc-nextPaul Mackerras1-5/+31
This merges in the ppc-kvm topic branch of the powerpc repository to get some changes on which future patches will depend, in particular the definitions of various new TLB flushing functions. Signed-off-by: Paul Mackerras <[email protected]>
2018-05-17KVM: PPC: Book3S: Change return type to vm_fault_tSouptick Joarder1-1/+1
Use new return type vm_fault_t for fault handler in struct vm_operations_struct. For now, this is just documenting that the function returns a VM_FAULT value rather than an errno. Once all instances are converted, vm_fault_t will become a distinct type. commit 1c8f422059ae ("mm: change return type to vm_fault_t") Signed-off-by: Souptick Joarder <[email protected]> Signed-off-by: Paul Mackerras <[email protected]>