aboutsummaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/hyp/include
AgeCommit message (Collapse)AuthorFilesLines
2024-10-01KVM: arm64: Constrain the host to the maximum shared SVE VL with pKVMMark Brown1-1/+1
When pKVM saves and restores the host floating point state on a SVE system, it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL for the PE is. But it uses a buffer allocated with kvm_host_sve_max_vl, the maximum VL shared by all PEs in the system. This means that if we run on a system where the maximum VLs are not consistent, we will overflow the buffer on PEs which support larger VLs. Since the host will not currently attempt to make use of non-shared VLs, fix this by explicitly setting the EL2 VL to be the maximum shared VL when we save and restore. This will enforce the limit on host VL usage. Should we wish to support asymmetric VLs, this code will need to be updated along with the required changes for the host: https://lore.kernel.org/r/[email protected] Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM") Signed-off-by: Mark Brown <[email protected]> Tested-by: Fuad Tabba <[email protected]> Reviewed-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected] [maz: added punctuation to the commit message] Signed-off-by: Marc Zyngier <[email protected]>
2024-09-16Merge tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds1-0/+3
Pull kvm updates from Paolo Bonzini: "These are the non-x86 changes (mostly ARM, as is usually the case). The generic and x86 changes will come later" ARM: - New Stage-2 page table dumper, reusing the main ptdump infrastructure - FP8 support - Nested virtualization now supports the address translation (FEAT_ATS1A) family of instructions - Add selftest checks for a bunch of timer emulation corner cases - Fix multiple cases where KVM/arm64 doesn't correctly handle the guest trying to use a GICv3 that wasn't advertised - Remove REG_HIDDEN_USER from the sysreg infrastructure, making things little simpler - Prevent MTE tags being restored by userspace if we are actively logging writes, as that's a recipe for disaster - Correct the refcount on a page that is not considered for MTE tag copying (such as a device) - When walking a page table to split block mappings, synchronize only at the end the walk rather than on every store - Fix boundary check when transfering memory using FFA - Fix pKVM TLB invalidation, only affecting currently out of tree code but worth addressing for peace of mind LoongArch: - Revert qspinlock to test-and-set simple lock on VM. - Add Loongson Binary Translation extension support. - Add PMU support for guest. - Enable paravirt feature control from VMM. - Implement function kvm_para_has_feature(). RISC-V: - Fix sbiret init before forwarding to userspace - Don't zero-out PMU snapshot area before freeing data - Allow legacy PMU access from guest - Fix to allow hpmcounter31 from the guest" * tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (64 commits) LoongArch: KVM: Implement function kvm_para_has_feature() LoongArch: KVM: Enable paravirt feature control from VMM LoongArch: KVM: Add PMU support for guest KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier KVM: arm64: Simplify visibility handling of AArch32 SPSR_* KVM: arm64: Simplify handling of CNTKCTL_EL12 LoongArch: KVM: Add vm migration support for LBT registers LoongArch: KVM: Add Binary Translation extension support LoongArch: KVM: Add VM feature detection function LoongArch: Revert qspinlock to test-and-set simple lock on VM KVM: arm64: Register ptdump with debugfs on guest creation arm64: ptdump: Don't override the level when operating on the stage-2 tables arm64: ptdump: Use the ptdump description from a local context arm64: ptdump: Expose the attribute parsing functionality KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer KVM: arm64: Move pagetable definitions to common header KVM: arm64: nv: Add support for FEAT_ATS1A KVM: arm64: nv: Plumb handling of AT S1* traps from EL2 KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3 KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration ...
2024-09-16Merge tag 'arm64-upstream' of ↵Linus Torvalds2-1/+31
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "The highlights are support for Arm's "Permission Overlay Extension" using memory protection keys, support for running as a protected guest on Android as well as perf support for a bunch of new interconnect PMUs. Summary: ACPI: - Enable PMCG erratum workaround for HiSilicon HIP10 and 11 platforms. - Ensure arm64-specific IORT header is covered by MAINTAINERS. CPU Errata: - Enable workaround for hardware access/dirty issue on Ampere-1A cores. Memory management: - Define PHYSMEM_END to fix a crash in the amdgpu driver. - Avoid tripping over invalid kernel mappings on the kexec() path. - Userspace support for the Permission Overlay Extension (POE) using protection keys. Perf and PMUs: - Add support for the "fixed instruction counter" extension in the CPU PMU architecture. - Extend and fix the event encodings for Apple's M1 CPU PMU. - Allow LSM hooks to decide on SPE permissions for physical profiling. - Add support for the CMN S3 and NI-700 PMUs. Confidential Computing: - Add support for booting an arm64 kernel as a protected guest under Android's "Protected KVM" (pKVM) hypervisor. Selftests: - Fix vector length issues in the SVE/SME sigreturn tests - Fix build warning in the ptrace tests. Timers: - Add support for PR_{G,S}ET_TSC so that 'rr' can deal with non-determinism arising from the architected counter. Miscellaneous: - Rework our IPI-based CPU stopping code to try NMIs if regular IPIs don't succeed. - Minor fixes and cleanups" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits) perf: arm-ni: Fix an NULL vs IS_ERR() bug arm64: hibernate: Fix warning for cast from restricted gfp_t arm64: esr: Define ESR_ELx_EC_* constants as UL arm64: pkeys: remove redundant WARN perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled MAINTAINERS: List Arm interconnect PMUs as supported perf: Add driver for Arm NI-700 interconnect PMU dt-bindings/perf: Add Arm NI-700 PMU perf/arm-cmn: Improve format attr printing perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check arm64/mm: use lm_alias() with addresses passed to memblock_free() mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags() arm64: Expose the end of the linear map in PHYSMEM_END arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec() arm64/mm: Delete __init region from memblock.reserved perf/arm-cmn: Support CMN S3 dt-bindings: perf: arm-cmn: Add CMN S3 perf/arm-cmn: Refactor DTC PMU register access perf/arm-cmn: Make cycle counts less surprising perf/arm-cmn: Improve build-time assertion ...
2024-09-12Merge branch kvm-arm64/nv-at-pan into kvmarm-master/nextMarc Zyngier1-1/+1
* kvm-arm64/nv-at-pan: : . : Add NV support for the AT family of instructions, which mostly results : in adding a page table walker that deals with most of the complexity : of the architecture. : : From the cover letter: : : "Another task that a hypervisor supporting NV on arm64 has to deal with : is to emulate the AT instruction, because we multiplex all the S1 : translations on a single set of registers, and the guest S2 is never : truly resident on the CPU. : : So given that we lie about page tables, we also have to lie about : translation instructions, hence the emulation. Things are made : complicated by the fact that guest S1 page tables can be swapped out, : and that our shadow S2 is likely to be incomplete. So while using AT : to emulate AT is tempting (and useful), it is not going to always : work, and we thus need a fallback in the shape of a SW S1 walker." : . KVM: arm64: nv: Add support for FEAT_ATS1A KVM: arm64: nv: Plumb handling of AT S1* traps from EL2 KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3 KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration KVM: arm64: nv: Add SW walker for AT S1 emulation KVM: arm64: nv: Make ps_to_output_size() generally available KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W} KVM: arm64: nv: Add basic emulation of AT S1E2{R,W} KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}P KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W} KVM: arm64: nv: Honor absence of FEAT_PAN2 KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set arm64: Add ESR_ELx_FSC_ADDRSZ_L() helper arm64: Add system register encoding for PSTATE.PAN arm64: Add PAR_EL1 field description arm64: Add missing APTable and TCR_ELx.HPD masks KVM: arm64: Make kvm_at() take an OP_AT_* Signed-off-by: Marc Zyngier <[email protected]> # Conflicts: # arch/arm64/kvm/nested.c
2024-09-12Merge branch kvm-arm64/fpmr into kvmarm-master/nextMarc Zyngier1-0/+3
* kvm-arm64/fpmr: : . : Add FP8 support to the KVM/arm64 floating point handling. : : This includes new ID registers (ID_AA64PFR2_EL1 ID_AA64FPFR0_EL1) : being made visible to guests, as well as a new confrol register : (FPMR) which gets context-switched. : . KVM: arm64: Expose ID_AA64PFR2_EL1 to userspace and guests KVM: arm64: Enable FP8 support when available and configured KVM: arm64: Expose ID_AA64FPFR0_EL1 as a writable ID reg KVM: arm64: Honor trap routing for FPMR KVM: arm64: Add save/restore support for FPMR KVM: arm64: Move FPMR into the sysreg array KVM: arm64: Add predicate for FPMR support in a VM KVM: arm64: Move SVCR into the sysreg array Signed-off-by: Marc Zyngier <[email protected]>
2024-09-04KVM: arm64: use `at s1e1a` for POEJoey Gouly1-1/+4
FEAT_ATS1E1A introduces a new instruction: `at s1e1a`. This is an address translation, without permission checks. POE allows read permissions to be removed from S1 by the guest. This means that an `at` instruction could fail, and not get the IPA. Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1 permissions. Signed-off-by: Joey Gouly <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-09-04KVM: arm64: Save/restore POE registersJoey Gouly1-0/+27
Define the new system registers that POE introduces and context switch them. Signed-off-by: Joey Gouly <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2024-08-30KVM: arm64: Make kvm_at() take an OP_AT_*Joey Gouly1-1/+1
To allow using newer instructions that current assemblers don't know about, replace the `at` instruction with the underlying SYS instruction. Signed-off-by: Joey Gouly <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Reviewed-by: Anshuman Khandual <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-08-27KVM: arm64: Add save/restore support for FPMRMarc Zyngier1-0/+3
Just like the rest of the FP/SIMD state, FPMR needs to be context switched. The only interesting thing here is that we need to treat the pKVM part a bit differently, as the host FP state is never written back to the vcpu thread, but instead stored locally and eagerly restored. Reviewed-by: Mark Brown <[email protected]> Tested-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-08-07KVM: arm64: Tidying up PAuth code in KVMFuad Tabba1-1/+0
Tidy up some of the PAuth trapping code to clear up some comments and avoid clang/checkpatch warnings. Also, don't bother setting PAuth HCR_EL2 bits in pKVM, since it's handled by the hypervisor. Signed-off-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-14/+52
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-07-14Merge branch kvm-arm64/nv-tcr2 into kvmarm/nextOliver Upton1-10/+25
* kvm-arm64/nv-tcr2: : Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier : : Series addresses a couple gaps that are present in KVM (from cover : letter): : : - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly : save/restore stuff. : : - trap bit description and routing: none, obviously, since we make a : point in not trapping. KVM: arm64: Honor trap routing for TCR2_EL1 KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features KVM: arm64: Get rid of HCRX_GUEST_FLAGS KVM: arm64: Correctly honor the presence of FEAT_TCRX Signed-off-by: Oliver Upton <[email protected]>
2024-07-14Merge branch kvm-arm64/nv-sve into kvmarm/nextOliver Upton1-1/+23
* kvm-arm64/nv-sve: : CPTR_EL2, FPSIMD/SVE support for nested : : This series brings support for honoring the guest hypervisor's CPTR_EL2 : trap configuration when running a nested guest, along with support for : FPSIMD/SVE usage at L1 and L2. KVM: arm64: Allow the use of SVE+NV KVM: arm64: nv: Add additional trap setup for CPTR_EL2 KVM: arm64: nv: Add trap description for CPTR_EL2 KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2 KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap KVM: arm64: nv: Handle CPACR_EL1 traps KVM: arm64: Spin off helper for programming CPTR traps KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context KVM: arm64: nv: Load guest hyp's ZCR into EL1 state KVM: arm64: nv: Handle ZCR_EL2 traps KVM: arm64: nv: Forward SVE traps to guest hypervisor KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor Signed-off-by: Oliver Upton <[email protected]>
2024-07-14Merge branch kvm-arm64/el2-kcfi into kvmarm/nextOliver Upton1-2/+3
* kvm-arm64/el2-kcfi: : kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi : : Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect : branches in the EL2 hypervisor. Unlike kernel support for the feature, : CFI failures at EL2 are always fatal. KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2 KVM: arm64: Introduce print_nvhe_hyp_panic helper arm64: Introduce esr_brk_comment, esr_is_cfi_brk KVM: arm64: VHE: Mark __hyp_call_panic __noreturn KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32 KVM: arm64: nVHE: Simplify invalid_host_el2_vect KVM: arm64: Fix __pkvm_init_switch_pgd call ABI KVM: arm64: Fix clobbered ELR in sync abort/SError Signed-off-by: Oliver Upton <[email protected]>
2024-07-04KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1Anshuman Khandual1-5/+5
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields from ID_AA64PFR0_EL1 sysreg definition. Cc: Marc Zyngier <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2024-06-27KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRXMarc Zyngier1-10/+14
As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX must be implemented as well. Take advantage of this to avoid checking for S1PIE when TCRX isn't implemented. Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-27KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM featuresMarc Zyngier1-2/+13
As for other registers, save/restore of TCR2_EL1 should be gated on the feature being actually present. In the case of a nVHE hypervisor, it is perfectly fine to leave the host value in the register, as HCRX_EL2.TCREn==0 imposes that TCR2_EL1 is treated as 0. Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: nv: Load guest FP state for ZCR_EL2 trapOliver Upton1-0/+4
Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path when the guest hypervisor tries to access SVE state. Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: nv: Use guest hypervisor's max VL when running nested guestOliver Upton1-0/+12
The max VL for nested guests is additionally constrained by the max VL selected by the guest hypervisor. Use that instead of KVM's max VL when running a nested guest. Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's max VL at the time of access, so there's no additional handling required at the time of use. Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: nv: Load guest hyp's ZCR into EL1 stateOliver Upton1-1/+2
Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor context. Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: nv: Forward SVE traps to guest hypervisorOliver Upton1-0/+2
Similar to FPSIMD traps, don't load SVE state if the guest hypervisor has SVE traps enabled and forward the trap instead. Note that ZCR_EL2 will require some special handling, as it takes a sysreg trap to EL2 when HCR_EL2.NV = 1. Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisorJintack Lim1-0/+3
Give precedence to the guest hypervisor's trap configuration when routing an FP/ASIMD trap taken to EL2. Take advantage of the infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format and base the trap decision solely on the VHE view of the register. The in-memory value of CPTR_EL2 will always be up to date for the guest hypervisor (more on that later), so just read it directly from memory. Bury all of this behind a macro keyed off of the CPTR bitfield in anticipation of supporting other traps (e.g. SVE). [maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with all the hard work actually being done by Chase Conklin] [ oliver: translate nVHE->VHE format for testing traps; macro for reuse in other CPTR_EL2.xEN fields ] Signed-off-by: Jintack Lim <[email protected]> Signed-off-by: Christoffer Dall <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-20KVM: arm64: Fix clobbered ELR in sync abort/SErrorPierre-Clément Tosi1-2/+3
When the hypervisor receives a SError or synchronous exception (EL2h) while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to an extable entry, it panics indirectly by overwriting ELR with the address of a panic handler in order for the asm routine it returns to to ERET into the handler. However, this clobbers ELR_EL2 for the handler itself. As a result, hyp_panic(), when retrieving what it believes to be the PC where the exception happened, actually ends up reading the address of the panic handler that called it! This results in an erroneous and confusing panic message where the source of any synchronous exception (e.g. BUG() or kCFI) appears to be __guest_exit_panic, making it hard to locate the actual BRK instruction. Therefore, store the original ELR_EL2 in the per-CPU kvm_hyp_ctxt and point the sysreg to a routine that first restores it to its previous value before running __guest_exit_panic. Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context") Signed-off-by: Pierre-Clément Tosi <[email protected]> Acked-by: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-14KVM: arm64: Update the identification range for the FF-A smcsSebastian Ene1-1/+1
The FF-A spec 1.2 reserves the following ranges for identifying FF-A calls: 0x84000060-0x840000FF: FF-A 32-bit calls 0xC4000060-0xC40000FF: FF-A 64-bit calls. Use the range identification according to the spec and allow calls that are currently out of the range(eg. FFA_MSG_SEND_DIRECT_REQ2) to be identified correctly. Acked-by: Will Deacon <[email protected]> Signed-off-by: Sebastian Ene <[email protected]> Reviewed-by: Sudeep Holla <[email protected]> Tested-by: Sudeep Holla <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-06-04KVM: arm64: Consolidate initializing the host data's fpsimd_state/sve in pKVMFuad Tabba1-1/+0
Now that we have introduced finalize_init_hyp_mode(), lets consolidate the initializing of the host_data fpsimd_state and sve state. Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-06-04KVM: arm64: Eagerly restore host fpsimd/sve state in pKVMFuad Tabba1-1/+12
When running in protected mode we don't want to leak protected guest state to the host, including whether a guest has used fpsimd/sve. Therefore, eagerly restore the host state on guest exit when running in protected mode, which happens only if the guest has used fpsimd/sve. Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-06-04KVM: arm64: Specialize handling of host fpsimd state on trapFuad Tabba1-1/+3
In subsequent patches, n/vhe will diverge on saving the host fpsimd/sve state when taking a guest fpsimd/sve trap. Add a specialized helper to handle it. No functional change intended. Reviewed-by: Mark Brown <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-06-04KVM: arm64: Abstract set/clear of CPTR_EL2 bits behind helperFuad Tabba1-14/+4
The same traps controlled by CPTR_EL2 or CPACR_EL1 need to be toggled in different parts of the code, but the exact bits and their polarity differ between these two formats and the mode (vhe/nvhe/hvhe). To reduce the amount of duplicated code and the chance of getting the wrong bit/polarity or missing a field, abstract the set/clear of CPTR_EL2 bits behind a helper. Since (h)VHE is the way of the future, use the CPACR_EL1 format, which is a subset of the VHE CPTR_EL2, as a reference. No functional change intended. Suggested-by: Oliver Upton <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-06-04KVM: arm64: Fix prototype for __sve_save_state/__sve_restore_stateFuad Tabba1-1/+2
Since the prototypes for __sve_save_state/__sve_restore_state at hyp were added, the underlying macro has acquired a third parameter for saving/restoring ffr. Fix the prototypes to account for the third parameter, and restore the ffr for the guest since it is saved. Suggested-by: Mark Brown <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-05-03Merge branch kvm-arm64/pkvm-6.10 into kvmarm-master/nextMarc Zyngier2-7/+7
* kvm-arm64/pkvm-6.10: (25 commits) : . : At last, a bunch of pKVM patches, courtesy of Fuad Tabba. : From the cover letter: : : "This series is a bit of a bombay-mix of patches we've been : carrying. There's no one overarching theme, but they do improve : the code by fixing existing bugs in pKVM, refactoring code to : make it more readable and easier to re-use for pKVM, or adding : functionality to the existing pKVM code upstream." : . KVM: arm64: Force injection of a data abort on NISV MMIO exit KVM: arm64: Restrict supported capabilities for protected VMs KVM: arm64: Refactor setting the return value in kvm_vm_ioctl_enable_cap() KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst KVM: arm64: Rename firmware pseudo-register documentation file KVM: arm64: Reformat/beautify PTP hypercall documentation KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit KVM: arm64: Introduce and use predicates that check for protected VMs KVM: arm64: Add is_pkvm_initialized() helper KVM: arm64: Simplify vgic-v3 hypercalls KVM: arm64: Move setting the page as dirty out of the critical section KVM: arm64: Change kvm_handle_mmio_return() return polarity KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() KVM: arm64: Prevent kmemleak from accessing .hyp.data KVM: arm64: Do not map the host fpsimd state to hyp in pKVM KVM: arm64: Rename __tlb_switch_to_{guest,host}() in VHE KVM: arm64: Support TLB invalidation in guest context KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE KVM: arm64: Check for PTE validity when checking for executable/cacheable KVM: arm64: Avoid BUG-ing from the host abort path ... Signed-off-by: Marc Zyngier <[email protected]>
2024-05-03Merge branch kvm-arm64/nv-eret-pauth into kvmarm-master/nextMarc Zyngier1-60/+2
* kvm-arm64/nv-eret-pauth: : . : Add NV support for the ERETAA/ERETAB instructions. From the cover letter: : : "Although the current upstream NV support has *some* support for : correctly emulating ERET, that support is only partial as it doesn't : support the ERETAA and ERETAB variants. : : Supporting these instructions was cast aside for a long time as it : involves implementing some form of PAuth emulation, something I wasn't : overly keen on. But I have reached a point where enough of the : infrastructure is there that it actually makes sense. So here it is!" : . KVM: arm64: nv: Work around lack of pauth support in old toolchains KVM: arm64: Drop trapping of PAuth instructions/keys KVM: arm64: nv: Advertise support for PAuth KVM: arm64: nv: Handle ERETA[AB] instructions KVM: arm64: nv: Add emulation for ERETAx instructions KVM: arm64: nv: Add kvm_has_pauth() helper KVM: arm64: nv: Reinject PAC exceptions caused by HCR_EL2.API==0 KVM: arm64: nv: Handle HCR_EL2.{API,APK} independently KVM: arm64: nv: Honor HFGITR_EL2.ERET being set KVM: arm64: nv: Fast-track 'InHost' exception returns KVM: arm64: nv: Add trap forwarding for ERET and SMC KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2 KVM: arm64: nv: Drop VCPU_HYP_CONTEXT flag KVM: arm64: Constraint PAuth support to consistent implementations KVM: arm64: Add helpers for ESR_ELx_ERET_ISS_ERET* KVM: arm64: Harden __ctxt_sys_reg() against out-of-range values Signed-off-by: Marc Zyngier <[email protected]>
2024-05-01KVM: arm64: Introduce and use predicates that check for protected VMsFuad Tabba1-0/+5
In order to determine whether or not a VM or vcpu are protected, introduce helpers to query this state. While at it, use the vcpu helper to check vcpus protected state instead of the kvm one. Co-authored-by: Marc Zyngier <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-05-01KVM: arm64: Refactor checks for FP state ownershipFuad Tabba1-1/+1
To avoid direct comparison against the fp_owner enum, add a new function that performs the check, host_owns_fp_regs(), to complement the existing guest_owns_fp_regs(). To check for fpsimd state ownership, use the helpers instead of directly using the enums. No functional change intended. Suggested-by: Marc Zyngier <[email protected]> Signed-off-by: Fuad Tabba <[email protected]> Reviewed-by: Mark Brown <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-05-01KVM: arm64: Move guest_owns_fp_regs() to increase its scopeFuad Tabba1-6/+0
guest_owns_fp_regs() will be used to check fpsimd state ownership across kvm/arm64. Therefore, move it to kvm_host.h to widen its scope. Moreover, the host state is not per-vcpu anymore, the vcpu parameter isn't used, so remove it as well. No functional change intended. Signed-off-by: Fuad Tabba <[email protected]> Reviewed-by: Mark Brown <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-05-01KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVMFuad Tabba1-0/+1
Since the host_fpsimd_state has been removed from kvm_vcpu_arch, it isn't pointing to the hyp's version of the host fp_regs in protected mode. Initialize the host_data fpsimd_state point to the host_data's context fp_regs on pKVM initialization. Fixes: 51e09b5572d6 ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch") Signed-off-by: Fuad Tabba <[email protected]> Acked-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-04-20KVM: arm64: Drop trapping of PAuth instructions/keysMarc Zyngier1-79/+1
We currently insist on disabling PAuth on vcpu_load(), and get to enable it on first guest use of an instruction or a key (ignoring the NV case for now). It isn't clear at all what this is trying to achieve: guests tend to use PAuth when available, and nothing forces you to expose it to the guest if you don't want to. This also isn't totally free: we take a full GPR save/restore between host and guest, only to write ten 64bit registers. The "value proposition" escapes me. So let's forget this stuff and enable PAuth eagerly if exposed to the guest. This results in much simpler code. Performance wise, that's not bad either (tested on M2 Pro running a fully automated Debian installer as the workload): - On a non-NV guest, I can see reduction of 0.24% in the number of cycles (measured with perf over 10 consecutive runs) - On a NV guest (L2), I see a 2% reduction in wall-clock time (measured with 'time', as M2 doesn't have a PMUv3 and NV doesn't support it either) So overall, a much reduced complexity and a (small) performance improvement. Reviewed-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-04-20KVM: arm64: nv: Handle HCR_EL2.{API,APK} independentlyMarc Zyngier1-5/+27
Although KVM couples API and APK for simplicity, the architecture makes no such requirement, and the two can be independently set or cleared. Check for which of the two possible reasons we have trapped here, and if the corresponding L1 control bit isn't set, delegate the handling for forwarding. Otherwise, set this exact bit in HCR_EL2 and resume the guest. Of course, in the non-NV case, we keep setting both bits and be done with it. Note that the entry core already saves/restores the keys should any of the two control bits be set. This results in a bit of rework, and the removal of the (trivial) vcpu_ptrauth_enable() helper. Reviewed-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-04-20KVM: arm64: nv: Configure HCR_EL2 for FEAT_NV2Marc Zyngier1-3/+1
Add the HCR_EL2 configuration for FEAT_NV2, adding the required bits for running a guest hypervisor, and overall merging the allowed bits provided by the guest. This heavily replies on unavaliable features being sanitised when the HCR_EL2 shadow register is accessed, and only a couple of bits must be explicitly disabled. Non-NV guests are completely unaffected by any of this. Reviewed-by: Joey Gouly <[email protected]> Reviewed-by: Oliver Upton <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Marc Zyngier <[email protected]>
2024-04-12KVM: arm64: Exclude FP ownership from kvm_vcpu_archMarc Zyngier1-3/+3
In retrospect, it is fairly obvious that the FP state ownership is only meaningful for a given CPU, and that locating this information in the vcpu was just a mistake. Move the ownership tracking into the host data structure, and rename it from fp_state to fp_owner, which is a better description (name suggested by Mark Brown). Reviewed-by: Mark Brown <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-04-12KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_archMarc Zyngier1-1/+1
As the name of the field indicates, host_fpsimd_state is strictly a host piece of data, and we reset this pointer on each PID change. So let's move it where it belongs, and set it at load-time. Although this is slightly more often, it is a well defined life-cycle which matches other pieces of data. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-04-12KVM: arm64: Exclude mdcr_el2_host from kvm_vcpu_archMarc Zyngier1-2/+2
As for the rest of the host debug state, the host copy of mdcr_el2 has little to do in the vcpu, and is better placed in the host_data structure. Reviewed-by : Suzuki K Poulose <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-04-12KVM: arm64: Exclude host_debug_data from vcpu_archMarc Zyngier1-2/+2
Keeping host_debug_state on a per-vcpu basis is completely pointless. The lifetime of this data is only that of the inner run-loop, which means it is never accessed outside of the core EL2 code. Move the structure into kvm_host_data, and save over 500 bytes per vcpu. Reviewed-by: Suzuki K Poulose <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-04-12KVM: arm64: Add accessor for per-CPU stateMarc Zyngier2-6/+6
In order to facilitate the introduction of new per-CPU state, add a new host_data_ptr() helped that hides some of the per-CPU verbosity, and make it easier to move that state around in the future. Reviewed-by: Suzuki K Poulose <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2024-02-19KVM: arm64: Make FEAT_MOPS UNDEF if not advertised to the guestMarc Zyngier1-1/+1
We unconditionally enable FEAT_MOPS, which is obviously wrong. So let's only do that when it is advertised to the guest. Which means we need to rely on a per-vcpu HCRX_EL2 shadow register. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Joey Gouly <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-02-19KVM: arm64: Make PIR{,E0}_EL1 UNDEF if S1PIE is not advertised to the guestMarc Zyngier1-3/+21
As part of the ongoing effort to honor the guest configuration, add the necessary checks to make PIR_EL1 and co UNDEF if not advertised to the guest, and avoid context switching them. Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Joey Gouly <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-02-19KVM: arm64: Streamline save/restore of HFG[RW]TR_EL2Marc Zyngier1-33/+9
The way we save/restore HFG[RW]TR_EL2 can now be simplified, and the Ampere erratum hack is the only thing that still stands out. Reviewed-by: Joey Gouly <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-02-19KVM: arm64: Move existing feature disabling over to FGU infrastructureMarc Zyngier1-14/+3
We already trap a bunch of existing features for the purpose of disabling them (MAIR2, POR, ACCDATA, SME...). Let's move them over to our brand new FGU infrastructure. Reviewed-by: Joey Gouly <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-02-19KVM: arm64: Propagate and handle Fine-Grained UNDEF bitsMarc Zyngier1-20/+61
In order to correctly honor our FGU bits, they must be converted into a set of FGT bits. They get merged as part of the existing FGT setting. Similarly, the UNDEF injection phase takes place when handling the trap. This results in a bit of rework in the FGT macros in order to help with the code generation, as burying per-CPU accesses in macros results in a lot of expansion, not to mention the vcpu->kvm access on nvhe (kern_hyp_va() is not optimisation-friendly). Reviewed-by: Joey Gouly <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Oliver Upton <[email protected]>
2024-01-17Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds3-36/+81
Pull kvm updates from Paolo Bonzini: "Generic: - Use memdup_array_user() to harden against overflow. - Unconditionally advertise KVM_CAP_DEVICE_CTRL for all architectures. - Clean up Kconfigs that all KVM architectures were selecting - New functionality around "guest_memfd", a new userspace API that creates an anonymous file and returns a file descriptor that refers to it. guest_memfd files are bound to their owning virtual machine, cannot be mapped, read, or written by userspace, and cannot be resized. guest_memfd files do however support PUNCH_HOLE, which can be used to switch a memory area between guest_memfd and regular anonymous memory. - New ioctl KVM_SET_MEMORY_ATTRIBUTES allowing userspace to specify per-page attributes for a given page of guest memory; right now the only attribute is whether the guest expects to access memory via guest_memfd or not, which in Confidential SVMs backed by SEV-SNP, TDX or ARM64 pKVM is checked by firmware or hypervisor that guarantees confidentiality (AMD PSP, Intel TDX module, or EL2 in the case of pKVM). x86: - Support for "software-protected VMs" that can use the new guest_memfd and page attributes infrastructure. This is mostly useful for testing, since there is no pKVM-like infrastructure to provide a meaningfully reduced TCB. - Fix a relatively benign off-by-one error when splitting huge pages during CLEAR_DIRTY_LOG. - Fix a bug where KVM could incorrectly test-and-clear dirty bits in non-leaf TDP MMU SPTEs if a racing thread replaces a huge SPTE with a non-huge SPTE. - Use more generic lockdep assertions in paths that don't actually care about whether the caller is a reader or a writer. - let Xen guests opt out of having PV clock reported as "based on a stable TSC", because some of them don't expect the "TSC stable" bit (added to the pvclock ABI by KVM, but never set by Xen) to be set. - Revert a bogus, made-up nested SVM consistency check for TLB_CONTROL. - Advertise flush-by-ASID support for nSVM unconditionally, as KVM always flushes on nested transitions, i.e. always satisfies flush requests. This allows running bleeding edge versions of VMware Workstation on top of KVM. - Sanity check that the CPU supports flush-by-ASID when enabling SEV support. - On AMD machines with vNMI, always rely on hardware instead of intercepting IRET in some cases to detect unmasking of NMIs - Support for virtualizing Linear Address Masking (LAM) - Fix a variety of vPMU bugs where KVM fail to stop/reset counters and other state prior to refreshing the vPMU model. - Fix a double-overflow PMU bug by tracking emulated counter events using a dedicated field instead of snapshotting the "previous" counter. If the hardware PMC count triggers overflow that is recognized in the same VM-Exit that KVM manually bumps an event count, KVM would pend PMIs for both the hardware-triggered overflow and for KVM-triggered overflow. - Turn off KVM_WERROR by default for all configs so that it's not inadvertantly enabled by non-KVM developers, which can be problematic for subsystems that require no regressions for W=1 builds. - Advertise all of the host-supported CPUID bits that enumerate IA32_SPEC_CTRL "features". - Don't force a masterclock update when a vCPU synchronizes to the current TSC generation, as updating the masterclock can cause kvmclock's time to "jump" unexpectedly, e.g. when userspace hotplugs a pre-created vCPU. - Use RIP-relative address to read kvm_rebooting in the VM-Enter fault paths, partly as a super minor optimization, but mostly to make KVM play nice with position independent executable builds. - Guard KVM-on-HyperV's range-based TLB flush hooks with an #ifdef on CONFIG_HYPERV as a minor optimization, and to self-document the code. - Add CONFIG_KVM_HYPERV to allow disabling KVM support for HyperV "emulation" at build time. ARM64: - LPA2 support, adding 52bit IPA/PA capability for 4kB and 16kB base granule sizes. Branch shared with the arm64 tree. - Large Fine-Grained Trap rework, bringing some sanity to the feature, although there is more to come. This comes with a prefix branch shared with the arm64 tree. - Some additional Nested Virtualization groundwork, mostly introducing the NV2 VNCR support and retargetting the NV support to that version of the architecture. - A small set of vgic fixes and associated cleanups. Loongarch: - Optimization for memslot hugepage checking - Cleanup and fix some HW/SW timer issues - Add LSX/LASX (128bit/256bit SIMD) support RISC-V: - KVM_GET_REG_LIST improvement for vector registers - Generate ISA extension reg_list using macros in get-reg-list selftest - Support for reporting steal time along with selftest s390: - Bugfixes Selftests: - Fix an annoying goof where the NX hugepage test prints out garbage instead of the magic token needed to run the test. - Fix build errors when a header is delete/moved due to a missing flag in the Makefile. - Detect if KVM bugged/killed a selftest's VM and print out a helpful message instead of complaining that a random ioctl() failed. - Annotate the guest printf/assert helpers with __printf(), and fix the various bugs that were lurking due to lack of said annotation" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (185 commits) x86/kvm: Do not try to disable kvmclock if it was not enabled KVM: x86: add missing "depends on KVM" KVM: fix direction of dependency on MMU notifiers KVM: introduce CONFIG_KVM_COMMON KVM: arm64: Add missing memory barriers when switching to pKVM's hyp pgd KVM: arm64: vgic-its: Avoid potential UAF in LPI translation cache RISC-V: KVM: selftests: Add get-reg-list test for STA registers RISC-V: KVM: selftests: Add steal_time test support RISC-V: KVM: selftests: Add guest_sbi_probe_extension RISC-V: KVM: selftests: Move sbi_ecall to processor.c RISC-V: KVM: Implement SBI STA extension RISC-V: KVM: Add support for SBI STA registers RISC-V: KVM: Add support for SBI extension registers RISC-V: KVM: Add SBI STA info to vcpu_arch RISC-V: KVM: Add steal-update vcpu request RISC-V: KVM: Add SBI STA extension skeleton RISC-V: paravirt: Implement steal-time support RISC-V: Add SBI STA extension definitions RISC-V: paravirt: Add skeleton for pv-time support RISC-V: KVM: Fix indentation in kvm_riscv_vcpu_set_reg_csr() ...
2024-01-08mm, treewide: introduce NR_PAGE_ORDERSKirill A. Shutemov1-1/+1
NR_PAGE_ORDERS defines the number of page orders supported by the page allocator, ranging from 0 to MAX_ORDER, MAX_ORDER + 1 in total. NR_PAGE_ORDERS assists in defining arrays of page orders and allows for more natural iteration over them. [[email protected]: fixup for kerneldoc warning] Link: https://lkml.kernel.org/r/20240101111512.7empzyifq7kxtzk3@box Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Linus Torvalds <[email protected]> Signed-off-by: Andrew Morton <[email protected]>