aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2021-11-08KVM: arm64: Fix host stage-2 finalizationQuentin Perret1-2/+12
We currently walk the hypervisor stage-1 page-table towards the end of hyp init in nVHE protected mode and adjust the host page ownership attributes in its stage-2 in order to get a consistent state from both point of views. The walk is done on the entire hyp VA space, and expects to only ever find page-level mappings. While this expectation is reasonable in the half of hyp VA space that maps memory with a fixed offset (see the loop in pkvm_create_mappings_locked()), it can be incorrect in the other half where nothing prevents the usage of block mappings. For instance, on systems where memory is physically aligned at an address that happens to maps to a PMD aligned VA in the hyp_vmemmap, kvm_pgtable_hyp_map() will install block mappings when backing the hyp_vmemmap, which will later cause finalize_host_mappings() to fail. Furthermore, it should be noted that all pages backing the hyp_vmemmap are also mapped in the 'fixed offset range' of the hypervisor, which implies that finalize_host_mappings() will walk both aliases and update the host stage-2 attributes twice. The order in which this happens is unpredictable, though, since the hyp VA layout is highly dependent on the position of the idmap page, hence resulting in a fragile mess at best. In order to fix all of this, let's restrict the finalization walk to only cover memory regions in the 'fixed-offset range' of the hyp VA space and nothing else. This not only fixes a correctness issue, but will also result in a slighlty faster hyp initialization overall. Fixes: 2c50166c62ba ("KVM: arm64: Mark host bss and rodata section as shared") Signed-off-by: Quentin Perret <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-08KVM: arm64: Change the return type of kvm_vcpu_preferred_target()YueHaibing3-11/+3
kvm_vcpu_preferred_target() always return 0 because kvm_target_cpu() never returns a negative error code. Signed-off-by: YueHaibing <[email protected]> Reviewed-by: Alexandru Elisei <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-08KVM: arm64: nvhe: Fix a non-kernel-doc commentRandy Dunlap1-1/+1
Do not use kernel-doc "/**" notation when the comment is not in kernel-doc format. Fixes this docs build warning: arch/arm64/kvm/hyp/nvhe/sys_regs.c:478: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * Handler for protected VM restricted exceptions. Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Fuad Tabba <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-11-08KVM: arm64: Extract ESR_ELx.EC onlyMark Rutland3-2/+3
Since ARMv8.0 the upper 32 bits of ESR_ELx have been RES0, and recently some of the upper bits gained a meaning and can be non-zero. For example, when FEAT_LS64 is implemented, ESR_ELx[36:32] contain ISS2, which for an ST64BV or ST64BV0 can be non-zero. This can be seen in ARM DDI 0487G.b, page D13-3145, section D13.2.37. Generally, we must not rely on RES0 bit remaining zero in future, and when extracting ESR_ELx.EC we must mask out all other bits. All C code uses the ESR_ELx_EC() macro, which masks out the irrelevant bits, and therefore no alterations are required to C code to avoid consuming irrelevant bits. In a couple of places the KVM assembly extracts ESR_ELx.EC using LSR on an X register, and so could in theory consume previously RES0 bits. In both cases this is for comparison with EC values ESR_ELx_EC_HVC32 and ESR_ELx_EC_HVC64, for which the upper bits of ESR_ELx must currently be zero, but this could change in future. This patch adjusts the KVM vectors to use UBFX rather than LSR to extract ESR_ELx.EC, ensuring these are robust to future additions to ESR_ELx. Cc: [email protected] Signed-off-by: Mark Rutland <[email protected]> Cc: Alexandru Elisei <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morse <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-21Merge branch kvm/selftests/memslot into kvmarm-master/nextMarc Zyngier2-22/+36
* kvm/selftests/memslot: : . : Enable KVM memslot selftests on arm64, making them less : x86 specific. : . KVM: selftests: Build the memslot tests for arm64 KVM: selftests: Make memslot_perf_test arch independent Signed-off-by: Marc Zyngier <[email protected]>
2021-10-21KVM: selftests: Build the memslot tests for arm64Ricardo Koller1-0/+2
Add memslot_perf_test and memslot_modification_stress_test to the list of aarch64 selftests. Signed-off-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-21KVM: selftests: Make memslot_perf_test arch independentRicardo Koller1-22/+34
memslot_perf_test uses ucalls for synchronization between guest and host. Ucalls API is architecture independent: tests do not need to know details like what kind of exit they generate on a specific arch. More specifically, there is no need to check whether an exit is KVM_EXIT_IO in x86 for the host to know that the exit is ucall related, as get_ucall() already makes that check. Change memslot_perf_test to not require specifying what exit does a ucall generate. Also add a missing ucall_init. Signed-off-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18Merge branch kvm-arm64/pkvm/fixed-features into kvmarm-master/nextMarc Zyngier18-155/+1200
* kvm-arm64/pkvm/fixed-features: (22 commits) : . : Add the pKVM fixed feature that allows a bunch of exceptions : to either be forbidden or be easily handled at EL2. : . KVM: arm64: pkvm: Give priority to standard traps over pvm handling KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array() KVM: arm64: pkvm: Move kvm_handle_pvm_restricted around KVM: arm64: pkvm: Consolidate include files KVM: arm64: pkvm: Preserve pending SError on exit from AArch32 KVM: arm64: pkvm: Handle GICv3 traps as required KVM: arm64: pkvm: Drop sysregs that should never be routed to the host KVM: arm64: pkvm: Drop AArch32-specific registers KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WI KVM: arm64: pkvm: Use a single function to expose all id-regs KVM: arm64: Fix early exit ptrauth handling KVM: arm64: Handle protected guests at 32 bits KVM: arm64: Trap access to pVM restricted features KVM: arm64: Move sanitized copies of CPU features KVM: arm64: Initialize trap registers for protected VMs KVM: arm64: Add handlers for protected VM System Registers KVM: arm64: Simplify masking out MTE in feature id reg KVM: arm64: Add missing field descriptor for MDCR_EL2 KVM: arm64: Pass struct kvm to per-EC handlers KVM: arm64: Move early handlers to per-EC handlers ... Signed-off-by: Marc Zyngier <[email protected]>
2021-10-18KVM: arm64: pkvm: Give priority to standard traps over pvm handlingMarc Zyngier1-4/+7
Checking for pvm handling first means that we cannot handle ptrauth traps or apply any of the workarounds (GICv3 or TX2 #219). Flip the order around. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Pass vpcu instead of kvm to kvm_get_exit_handler_array()Marc Zyngier3-5/+5
Passing a VM pointer around is odd, and results in extra work on VHE. Follow the rest of the design that uses the vcpu instead, and let the nVHE code look into the struct kvm as required. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Move kvm_handle_pvm_restricted aroundMarc Zyngier3-14/+14
Place kvm_handle_pvm_restricted() next to its little friends such as kvm_handle_pvm_sysreg(). This allows to make inject_undef64() static. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Consolidate include filesMarc Zyngier6-24/+9
kvm_fixed_config.h is pkvm specific, and would be better placed near its users. At the same time, include/nvhe/sys_regs.h is now almost empty. Merge the two into arch/arm64/kvm/hyp/include/nvhe/fixed_config.h. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Preserve pending SError on exit from AArch32Marc Zyngier1-1/+2
Don't drop a potential SError when a guest gets caught red-handed running AArch32 code. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Handle GICv3 traps as requiredMarc Zyngier1-1/+17
Forward accesses to the ICV_*SGI*_EL1 registers to EL1, and emulate ICV_SRE_EL1 by returning a fixed value. This should be enough to support GICv3 in a protected guest. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Drop sysregs that should never be routed to the hostMarc Zyngier1-50/+0
A bunch of system registers (most of them MM related) should never trap to the host under any circumstance. Keep them close to our chest. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Drop AArch32-specific registersMarc Zyngier1-4/+0
All the SYS_*32_EL2 registers are AArch32-specific. Since we forbid AArch32, there is no need to handle those in any way. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Make the ERR/ERX*_EL1 registers RAZ/WIMarc Zyngier1-11/+22
The ERR*/ERX* registers should be handled as RAZ/WI, and there should be no need to involve EL1 for that. Add a helper that handles such registers, and repaint the sysreg table to declare these registers as RAZ/WI. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: pkvm: Use a single function to expose all id-regsMarc Zyngier3-35/+26
Rather than exposing a whole set of helper functions to retrieve individual ID registers, use the existing decoding tree and expose a single helper instead. This allow a number of functions to be made static, and we now have a single entry point to maintain. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-18KVM: arm64: Fix early exit ptrauth handlingMarc Zyngier1-10/+4
The previous rework of the early exit code to provide an EC-based decoding tree missed the fact that we have two trap paths for ptrauth: the instructions (EC_PAC) and the sysregs (EC_SYS64). Rework the handlers to call the ptrauth handling code on both paths. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Tested-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17Merge branch kvm-arm64/memory-accounting into kvmarm-master/nextMarc Zyngier9-16/+18
* kvm-arm64/memory-accounting: : . : Sprinkle a bunch of GFP_KERNEL_ACCOUNT all over the code base : to better track memory allocation made on behalf of a VM. : . KVM: arm64: Add memcg accounting to KVM allocations KVM: arm64: vgic: Add memcg accounting to vgic allocations Signed-off-by: Marc Zyngier <[email protected]>
2021-10-17KVM: arm64: Add memcg accounting to KVM allocationsJia He4-5/+7
Inspired by commit 254272ce6505 ("kvm: x86: Add memcg accounting to KVM allocations"), it would be better to make arm64 KVM consistent with common kvm codes. The memory allocations of VM scope should be charged into VM process cgroup, hence change GFP_KERNEL to GFP_KERNEL_ACCOUNT. There remain a few cases since these allocations are global, not in VM scope. Signed-off-by: Jia He <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: vgic: Add memcg accounting to vgic allocationsJia He5-11/+11
Inspired by commit 254272ce6505 ("kvm: x86: Add memcg accounting to KVM allocations"), it would be better to make arm64 vgic consistent with common kvm codes. The memory allocations of VM scope should be charged into VM process cgroup, hence change GFP_KERNEL to GFP_KERNEL_ACCOUNT. There remain a few cases since these allocations are global, not in VM scope. Signed-off-by: Jia He <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17Merge branch kvm-arm64/selftest/timer into kvmarm-master/nextMarc Zyngier21-48/+2626
* kvm-arm64/selftest/timer: : . : Add a set of selftests for the KVM/arm64 timer emulation. : Comes with a minimal GICv3 infrastructure. : . KVM: arm64: selftests: arch_timer: Support vCPU migration KVM: arm64: selftests: Add arch_timer test KVM: arm64: selftests: Add host support for vGIC KVM: arm64: selftests: Add basic GICv3 support KVM: arm64: selftests: Add light-weight spinlock support KVM: arm64: selftests: Add guest support to get the vcpuid KVM: arm64: selftests: Maintain consistency for vcpuid type KVM: arm64: selftests: Add support to disable and enable local IRQs KVM: arm64: selftests: Add basic support to generate delays KVM: arm64: selftests: Add basic support for arch_timers KVM: arm64: selftests: Add support for cpu_relax KVM: arm64: selftests: Introduce ARM64_SYS_KVM_REG tools: arm64: Import sysreg.h KVM: arm64: selftests: Add MMIO readl/writel support Signed-off-by: Marc Zyngier <[email protected]>
2021-10-17KVM: arm64: selftests: arch_timer: Support vCPU migrationRaghavendra Rao Ananta1-1/+114
Since the timer stack (hardware and KVM) is per-CPU, there are potential chances for races to occur when the scheduler decides to migrate a vCPU thread to a different physical CPU. Hence, include an option to stress-test this part as well by forcing the vCPUs to migrate across physical CPUs in the system at a particular rate. Originally, the bug for the fix with commit 3134cc8beb69d0d ("KVM: arm64: vgic: Resample HW pending state on deactivation") was discovered using arch_timer test with vCPU migrations and can be easily reproduced. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add arch_timer testRaghavendra Rao Ananta3-0/+368
Add a KVM selftest to validate the arch_timer functionality. Primarily, the test sets up periodic timer interrupts and validates the basic architectural expectations upon its receipt. The test provides command-line options to configure the period of the timer, number of iterations, and number of vCPUs. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add host support for vGICRaghavendra Rao Ananta4-3/+92
Implement a simple library to perform vGIC-v3 setup from a host point of view. This includes creating a vGIC device, setting up distributor and redistributor attributes, and mapping the guest physical addresses. The definition of REDIST_REGION_ATTR_ADDR is taken from aarch64/vgic_init test. Hence, replace the definition by including vgic.h in the test file. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add basic GICv3 supportRaghavendra Rao Ananta6-1/+448
Add basic support for ARM Generic Interrupt Controller v3. The support provides guests to setup interrupts. The work is inspired from kvm-unit-tests and the kernel's GIC driver (drivers/irqchip/irq-gic-v3.c). Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add light-weight spinlock supportRaghavendra Rao Ananta3-1/+41
Add a simpler version of spinlock support for ARM64 for the guests to use. The implementation is loosely based on the spinlock implementation in kvm-unit-tests. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add guest support to get the vcpuidRaghavendra Rao Ananta2-0/+8
At times, such as when in the interrupt handler, the guest wants to get the vcpuid that it's running on to pull the per-cpu private data. As a result, introduce guest_get_vcpuid() that returns the vcpuid of the calling vcpu. The interface is architecture independent, but defined only for arm64 as of now. Suggested-by: Reiji Watanabe <[email protected]> Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Reviewed-by: Reiji Watanabe <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Maintain consistency for vcpuid typeRaghavendra Rao Ananta2-2/+2
The prototype of aarch64_vcpu_setup() accepts vcpuid as 'int', while the rest of the aarch64 (and struct vcpu) carries it as 'uint32_t'. Hence, change the prototype to make it consistent throughout the board. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add support to disable and enable local IRQsRaghavendra Rao Ananta1-0/+10
Add functions local_irq_enable() and local_irq_disable() to enable and disable the IRQs from the guest, respectively. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add basic support to generate delaysRaghavendra Rao Ananta1-0/+25
Add udelay() support to generate a delay in the guest. The routines are derived and simplified from kernel's arch/arm64/lib/delay.c. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add basic support for arch_timersRaghavendra Rao Ananta1-0/+142
Add a minimalistic library support to access the virtual timers, that can be used for simple timing functionalities, such as introducing delays in the guest. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add support for cpu_relaxRaghavendra Rao Ananta1-0/+5
Implement the guest helper routine, cpu_relax(), to yield the processor to other tasks. The function was derived from arch/arm64/include/asm/vdso/processor.h. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Introduce ARM64_SYS_KVM_REGRaghavendra Rao Ananta4-19/+21
With the inclusion of sysreg.h, that brings in system register encodings, it would be redundant to re-define register encodings again in processor.h to use it with ARM64_SYS_REG for the KVM functions such as set_reg() or get_reg(). Hence, add helper macro, ARM64_SYS_KVM_REG, that converts SYS_* definitions in sysreg.h into ARM64_SYS_REG definitions. Also replace all the users of ARM64_SYS_REG, relying on the encodings created in processor.h, with ARM64_SYS_KVM_REG and remove the definitions. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Ricardo Koller <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17tools: arm64: Import sysreg.hRaghavendra Rao Ananta3-26/+1311
Bring-in the kernel's arch/arm64/include/asm/sysreg.h into tools/ for arm64 to make use of all the standard register definitions in consistence with the kernel. Make use of the register read/write definitions from sysreg.h, instead of the existing definitions. A syntax correction is needed for the files that use write_sysreg() to make it compliant with the new (kernel's) syntax. Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Raghavendra Rao Ananta <[email protected]> [maz: squashed two commits in order to keep the series bisectable] Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: selftests: Add MMIO readl/writel supportRaghavendra Rao Ananta1-1/+45
Define the readl() and writel() functions for the guests to access (4-byte) the MMIO region. The routines, and their dependents, are inspired from the kernel's arch/arm64/include/asm/io.h and arch/arm64/include/asm/barrier.h. Signed-off-by: Raghavendra Rao Ananta <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17Merge branch kvm-arm64/vgic-fixes-5.16 into kvmarm-master/nextMarc Zyngier4-17/+34
* kvm-arm64/vgic-fixes-5.16: : . : Multiple updates to the GICv3 emulation in order to better support : the dreadful Apple M1 that only implements half of it, and in a : broken way... : . KVM: arm64: vgic-v3: Align emulated cpuif LPI state machine with the pseudocode KVM: arm64: vgic-v3: Don't advertise ICC_CTLR_EL1.SEIS KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possible KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrors KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3 Signed-off-by: Marc Zyngier <[email protected]>
2021-10-17KVM: arm64: vgic-v3: Align emulated cpuif LPI state machine with the pseudocodeMarc Zyngier1-12/+8
Having realised that a virtual LPI does transition through an active state that does not exist on bare metal, align the CPU interface emulation with the behaviour specified in the architecture pseudocode. The LPIs now transition to active on IAR read, and to inactive on EOI write. Special care is taken not to increment the EOIcount for an LPI that isn't present in the LRs. Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: vgic-v3: Don't advertise ICC_CTLR_EL1.SEISMarc Zyngier1-2/+0
Since we are trapping all sysreg accesses when ICH_VTR_EL2.SEIS is set, and that we never deliver an SError when emulating any of the GICv3 sysregs, don't advertise ICC_CTLR_EL1.SEIS. Reviewed-by: Alexandru Elisei <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: vgic-v3: Reduce common group trapping to ICV_DIR_EL1 when possibleMarc Zyngier2-4/+14
On systems that advertise ICH_VTR_EL2.SEIS, we trap all GICv3 sysreg accesses from the guest. From a performance perspective, this is OK as long as the guest doesn't hammer the GICv3 CPU interface. In most cases, this is fine, unless the guest actively uses priorities and switches PMR_EL1 very often. Which is exactly what happens when a Linux guest runs with irqchip.gicv3_pseudo_nmi=1. In these condition, the performance plumets as we hit PMR each time we mask/unmask interrupts. Not good. There is however an opportunity for improvement. Careful reading of the architecture specification indicates that the only GICv3 sysreg belonging to the common group (which contains the SGI registers, PMR, DIR, CTLR and RPR) that is allowed to generate a SError is DIR. Everything else is safe. It is thus possible to substitute the trapping of all the common group with just that of DIR if it supported by the implementation. Yes, that's yet another optional bit of the architecture. So let's just do that, as it leads to some impressive result on the M1: Without this change: bash-5.1# /host/home/maz/hackbench 100 process 1000 Running with 100*40 (== 4000) tasks. Time: 56.596 With this change: bash-5.1# /host/home/maz/hackbench 100 process 1000 Running with 100*40 (== 4000) tasks. Time: 8.649 which is a pretty convincing result. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Alexandru Elisei <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: vgic-v3: Work around GICv3 locally generated SErrorsMarc Zyngier1-0/+8
The infamous M1 has a feature nobody else ever implemented, in the form of the "GIC locally generated SError interrupts", also known as SEIS for short. These SErrors are generated when a guest does something that violates the GIC state machine. It would have been simpler to just *ignore* the damned thing, but that's not what this HW does. Oh well. This part of of the architecture is also amazingly under-specified. There is a whole 10 lines that describe the feature in a spec that is 930 pages long, and some of these lines are factually wrong. Oh, and it is deprecated, so the insentive to clarify it is low. Now, the spec says that this should be a *virtual* SError when HCR_EL2.AMO is set. As it turns out, that's not always the case on this CPU, and the SError sometimes fires on the host as a physical SError. Goodbye, cruel world. This clearly is a HW bug, and it means that a guest can easily take the host down, on demand. Thankfully, we have seen systems that were just as broken in the past, and we have the perfect vaccine for it. Apple M1, please meet the Cavium ThunderX workaround. All your GIC accesses will be trapped, sanitised, and emulated. Only the signalling aspect of the HW will be used. It won't be super speedy, but it will at least be safe. You're most welcome. Given that this has only ever been seen on this single implementation, that the spec is unclear at best and that we cannot trust it to ever be implemented correctly, gate the workaround solely on ICH_VTR_EL2.SEIS being set. Tested-by: Joey Gouly <[email protected]> Reviewed-by: Alexandru Elisei <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-17KVM: arm64: Force ID_AA64PFR0_EL1.GIC=1 when exposing a virtual GICv3Marc Zyngier1-0/+5
Until now, we always let ID_AA64PFR0_EL1.GIC reflect the value visible on the host, even if we were running a GICv2-enabled VM on a GICv3+compat host. That's fine, but we also now have the case of a host that does not expose ID_AA64PFR0_EL1.GIC==1 despite having a vGIC. Yes, this is confusing. Thank you M1. Let's go back to first principles and expose ID_AA64PFR0_EL1.GIC=1 when a GICv3 is exposed to the guest. This also hides a GICv4.1 CPU interface from the guest which has no business knowing about the v4.1 extension. Reviewed-by: Alexandru Elisei <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-12Merge branch kvm-arm64/misc-5.16 into kvmarm-master/nextMarc Zyngier1-1/+4
* kvm-arm64/misc-5.16: : . : - Allow KVM to be disabled from the command-line : - Clean up CONFIG_KVM vs CONFIG_HAVE_KVM : - Fix endianess evaluation on MMIO access from EL0 : . KVM: arm64: Fix reporting of endianess when the access originates at EL0 Signed-off-by: Marc Zyngier <[email protected]>
2021-10-12KVM: arm64: Fix reporting of endianess when the access originates at EL0Marc Zyngier1-1/+4
We currently check SCTLR_EL1.EE when computing the address of a faulting guest access. However, the fault could have occured at EL0, in which case the right bit to check would be SCTLR_EL1.E0E. This is pretty unlikely to cause any issue in practice: You'd have to have a guest with a LE EL1 and a BE EL0 (or the other way around), and have mapped a device into the EL0 page tables. Good luck with that! Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11Merge branch kvm-arm64/pkvm/restrict-hypercalls into kvmarm-master/nextMarc Zyngier1-1/+3
* kvm-arm64/pkvm/restrict-hypercalls: : . : Restrict the use of some hypercalls as well as kexec once : the protected KVM mode has been initialised. : . Documentation: admin-guide: Document side effects when pKVM is enabled Signed-off-by: Marc Zyngier <[email protected]>
2021-10-11Documentation: admin-guide: Document side effects when pKVM is enabledAlexandru Elisei1-1/+3
Recent changes to KVM for arm64 has made it impossible for the host to hibernate or use kexec when protected mode is enabled via the kernel command line. There are people who rely on kexec (for example, developers who use kexec as a quick way to test a new kernel), let's document this change in behaviour, so it doesn't catch them by surprise and we have a place to point people to if it does. Signed-off-by: Alexandru Elisei <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Handle protected guests at 32 bitsFuad Tabba1-0/+34
Protected KVM does not support protected AArch32 guests. However, it is possible for the guest to force run AArch32, potentially causing problems. Add an extra check so that if the hypervisor catches the guest doing that, it can prevent the guest from running again by resetting vcpu->arch.target and returning ARM_EXCEPTION_IL. If this were to happen, The VMM can try and fix it by re- initializing the vcpu with KVM_ARM_VCPU_INIT, however, this is likely not possible for protected VMs. Adapted from commit 22f553842b14 ("KVM: arm64: Handle Asymmetric AArch32 systems") Signed-off-by: Fuad Tabba <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Trap access to pVM restricted featuresFuad Tabba1-0/+57
Trap accesses to restricted features for VMs running in protected mode. Access to feature registers are emulated, and only supported features are exposed to protected VMs. Accesses to restricted registers as well as restricted instructions are trapped, and an undefined exception is injected into the protected guests, i.e., with EC = 0x0 (unknown reason). This EC is the one used, according to the Arm Architecture Reference Manual, for unallocated or undefined system registers or instructions. Only affects the functionality of protected VMs. Otherwise, should not affect non-protected VMs when KVM is running in protected mode. Signed-off-by: Fuad Tabba <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-11KVM: arm64: Move sanitized copies of CPU featuresFuad Tabba2-6/+2
Move the sanitized copies of the CPU feature registers to the recently created sys_regs.c. This consolidates all copies in a more relevant file. No functional change intended. Acked-by: Will Deacon <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]