Age | Commit message (Collapse) | Author | Files | Lines |
|
Our idmap is becoming too big, to the point where it doesn't fit in
a 4kB page anymore.
There are some low-hanging fruits though, such as the el2_init_state
horror that is expanded 3 times in the kernel. Let's at least limit
ourselves to two copies, which makes the kernel link again.
At some point, we'll have to have a better way of doing this.
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/20241009204903.GA3353168@thelio-3990X
|
|
When pKVM saves and restores the host floating point state on a SVE system,
it programs the vector length in ZCR_EL2.LEN to be whatever the maximum VL
for the PE is. But it uses a buffer allocated with kvm_host_sve_max_vl, the
maximum VL shared by all PEs in the system. This means that if we run on a
system where the maximum VLs are not consistent, we will overflow the buffer
on PEs which support larger VLs.
Since the host will not currently attempt to make use of non-shared VLs, fix
this by explicitly setting the EL2 VL to be the maximum shared VL when we
save and restore. This will enforce the limit on host VL usage. Should we
wish to support asymmetric VLs, this code will need to be updated along with
the required changes for the host:
https://lore.kernel.org/r/[email protected]
Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Signed-off-by: Mark Brown <[email protected]>
Tested-by: Fuad Tabba <[email protected]>
Reviewed-by: Fuad Tabba <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[maz: added punctuation to the commit message]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
On an error, hyp_vcpu will be accessed while this memory has already
been relinquished to the host and unmapped from the hypervisor. Protect
the CPTR assignment with an early return.
Fixes: b5b9955617bc ("KVM: arm64: Eagerly restore host fpsimd/sve state in pKVM")
Reviewed-by: Oliver Upton <[email protected]>
Signed-off-by: Vincent Donnefort <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Pull kvm updates from Paolo Bonzini:
"These are the non-x86 changes (mostly ARM, as is usually the case).
The generic and x86 changes will come later"
ARM:
- New Stage-2 page table dumper, reusing the main ptdump
infrastructure
- FP8 support
- Nested virtualization now supports the address translation
(FEAT_ATS1A) family of instructions
- Add selftest checks for a bunch of timer emulation corner cases
- Fix multiple cases where KVM/arm64 doesn't correctly handle the
guest trying to use a GICv3 that wasn't advertised
- Remove REG_HIDDEN_USER from the sysreg infrastructure, making
things little simpler
- Prevent MTE tags being restored by userspace if we are actively
logging writes, as that's a recipe for disaster
- Correct the refcount on a page that is not considered for MTE tag
copying (such as a device)
- When walking a page table to split block mappings, synchronize only
at the end the walk rather than on every store
- Fix boundary check when transfering memory using FFA
- Fix pKVM TLB invalidation, only affecting currently out of tree
code but worth addressing for peace of mind
LoongArch:
- Revert qspinlock to test-and-set simple lock on VM.
- Add Loongson Binary Translation extension support.
- Add PMU support for guest.
- Enable paravirt feature control from VMM.
- Implement function kvm_para_has_feature().
RISC-V:
- Fix sbiret init before forwarding to userspace
- Don't zero-out PMU snapshot area before freeing data
- Allow legacy PMU access from guest
- Fix to allow hpmcounter31 from the guest"
* tag 'for-linus-non-x86' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (64 commits)
LoongArch: KVM: Implement function kvm_para_has_feature()
LoongArch: KVM: Enable paravirt feature control from VMM
LoongArch: KVM: Add PMU support for guest
KVM: arm64: Get rid of REG_HIDDEN_USER visibility qualifier
KVM: arm64: Simplify visibility handling of AArch32 SPSR_*
KVM: arm64: Simplify handling of CNTKCTL_EL12
LoongArch: KVM: Add vm migration support for LBT registers
LoongArch: KVM: Add Binary Translation extension support
LoongArch: KVM: Add VM feature detection function
LoongArch: Revert qspinlock to test-and-set simple lock on VM
KVM: arm64: Register ptdump with debugfs on guest creation
arm64: ptdump: Don't override the level when operating on the stage-2 tables
arm64: ptdump: Use the ptdump description from a local context
arm64: ptdump: Expose the attribute parsing functionality
KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
KVM: arm64: Move pagetable definitions to common header
KVM: arm64: nv: Add support for FEAT_ATS1A
KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Will Deacon:
"The highlights are support for Arm's "Permission Overlay Extension"
using memory protection keys, support for running as a protected guest
on Android as well as perf support for a bunch of new interconnect
PMUs.
Summary:
ACPI:
- Enable PMCG erratum workaround for HiSilicon HIP10 and 11
platforms.
- Ensure arm64-specific IORT header is covered by MAINTAINERS.
CPU Errata:
- Enable workaround for hardware access/dirty issue on Ampere-1A
cores.
Memory management:
- Define PHYSMEM_END to fix a crash in the amdgpu driver.
- Avoid tripping over invalid kernel mappings on the kexec() path.
- Userspace support for the Permission Overlay Extension (POE) using
protection keys.
Perf and PMUs:
- Add support for the "fixed instruction counter" extension in the
CPU PMU architecture.
- Extend and fix the event encodings for Apple's M1 CPU PMU.
- Allow LSM hooks to decide on SPE permissions for physical
profiling.
- Add support for the CMN S3 and NI-700 PMUs.
Confidential Computing:
- Add support for booting an arm64 kernel as a protected guest under
Android's "Protected KVM" (pKVM) hypervisor.
Selftests:
- Fix vector length issues in the SVE/SME sigreturn tests
- Fix build warning in the ptrace tests.
Timers:
- Add support for PR_{G,S}ET_TSC so that 'rr' can deal with
non-determinism arising from the architected counter.
Miscellaneous:
- Rework our IPI-based CPU stopping code to try NMIs if regular IPIs
don't succeed.
- Minor fixes and cleanups"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits)
perf: arm-ni: Fix an NULL vs IS_ERR() bug
arm64: hibernate: Fix warning for cast from restricted gfp_t
arm64: esr: Define ESR_ELx_EC_* constants as UL
arm64: pkeys: remove redundant WARN
perf: arm_pmuv3: Use BR_RETIRED for HW branch event if enabled
MAINTAINERS: List Arm interconnect PMUs as supported
perf: Add driver for Arm NI-700 interconnect PMU
dt-bindings/perf: Add Arm NI-700 PMU
perf/arm-cmn: Improve format attr printing
perf/arm-cmn: Clean up unnecessary NUMA_NO_NODE check
arm64/mm: use lm_alias() with addresses passed to memblock_free()
mm: arm64: document why pte is not advanced in contpte_ptep_set_access_flags()
arm64: Expose the end of the linear map in PHYSMEM_END
arm64: trans_pgd: mark PTEs entries as valid to avoid dead kexec()
arm64/mm: Delete __init region from memblock.reserved
perf/arm-cmn: Support CMN S3
dt-bindings: perf: arm-cmn: Add CMN S3
perf/arm-cmn: Refactor DTC PMU register access
perf/arm-cmn: Make cycle counts less surprising
perf/arm-cmn: Improve build-time assertion
...
|
|
* kvm-arm64/s2-ptdump:
: .
: Stage-2 page table dumper, reusing the main ptdump infrastructure,
: courtesy of Sebastian Ene. From the cover letter:
:
: "This series extends the ptdump support to allow dumping the guest
: stage-2 pagetables. When CONFIG_PTDUMP_STAGE2_DEBUGFS is enabled, ptdump
: registers the new following files under debugfs:
: - /sys/debug/kvm/<guest_id>/stage2_page_tables
: - /sys/debug/kvm/<guest_id>/stage2_levels
: - /sys/debug/kvm/<guest_id>/ipa_range
:
: This allows userspace tools (eg. cat) to dump the stage-2 pagetables by
: reading the 'stage2_page_tables' file.
: [...]"
: .
KVM: arm64: Register ptdump with debugfs on guest creation
arm64: ptdump: Don't override the level when operating on the stage-2 tables
arm64: ptdump: Use the ptdump description from a local context
arm64: ptdump: Expose the attribute parsing functionality
KVM: arm64: Move pagetable definitions to common header
Signed-off-by: Marc Zyngier <[email protected]>
|
|
* kvm-arm64/nv-at-pan:
: .
: Add NV support for the AT family of instructions, which mostly results
: in adding a page table walker that deals with most of the complexity
: of the architecture.
:
: From the cover letter:
:
: "Another task that a hypervisor supporting NV on arm64 has to deal with
: is to emulate the AT instruction, because we multiplex all the S1
: translations on a single set of registers, and the guest S2 is never
: truly resident on the CPU.
:
: So given that we lie about page tables, we also have to lie about
: translation instructions, hence the emulation. Things are made
: complicated by the fact that guest S1 page tables can be swapped out,
: and that our shadow S2 is likely to be incomplete. So while using AT
: to emulate AT is tempting (and useful), it is not going to always
: work, and we thus need a fallback in the shape of a SW S1 walker."
: .
KVM: arm64: nv: Add support for FEAT_ATS1A
KVM: arm64: nv: Plumb handling of AT S1* traps from EL2
KVM: arm64: nv: Make AT+PAN instructions aware of FEAT_PAN3
KVM: arm64: nv: Sanitise SCTLR_EL1.EPAN according to VM configuration
KVM: arm64: nv: Add SW walker for AT S1 emulation
KVM: arm64: nv: Make ps_to_output_size() generally available
KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}
KVM: arm64: nv: Add basic emulation of AT S1E2{R,W}
KVM: arm64: nv: Add basic emulation of AT S1E1{R,W}P
KVM: arm64: nv: Add basic emulation of AT S1E{0,1}{R,W}
KVM: arm64: nv: Honor absence of FEAT_PAN2
KVM: arm64: nv: Turn upper_attr for S2 walk into the full descriptor
KVM: arm64: nv: Enforce S2 alignment when contiguous bit is set
arm64: Add ESR_ELx_FSC_ADDRSZ_L() helper
arm64: Add system register encoding for PSTATE.PAN
arm64: Add PAR_EL1 field description
arm64: Add missing APTable and TCR_ELx.HPD masks
KVM: arm64: Make kvm_at() take an OP_AT_*
Signed-off-by: Marc Zyngier <[email protected]>
# Conflicts:
# arch/arm64/kvm/nested.c
|
|
* kvm-arm64/vgic-sre-traps:
: .
: Fix the multiple of cases where KVM/arm64 doesn't correctly
: handle the guest trying to use a GICv3 that isn't advertised.
:
: From the cover letter:
:
: "It recently appeared that, when running on a GICv3-equipped platform
: (which is what non-ancient arm64 HW has), *not* configuring a GICv3
: for the guest could result in less than desirable outcomes.
:
: We have multiple issues to fix:
:
: - for registers that *always* trap (the SGI registers) or that *may*
: trap (the SRE register), we need to check whether a GICv3 has been
: instantiated before acting upon the trap.
:
: - for registers that only conditionally trap, we must actively trap
: them even in the absence of a GICv3 being instantiated, and handle
: those traps accordingly.
:
: - finally, ID registers must reflect the absence of a GICv3, so that
: we are consistent.
:
: This series goes through all these requirements. The main complexity
: here is to apply a GICv3 configuration on the host in the absence of a
: GICv3 in the guest. This is pretty hackish, but I don't have a much
: better solution so far.
:
: As part of making wider use of of the trap bits, we fully define the
: trap routing as per the architecture, something that we eventually
: need for NV anyway."
: .
KVM: arm64: selftests: Cope with lack of GICv3 in set_id_regs
KVM: arm64: Add selftest checking how the absence of GICv3 is handled
KVM: arm64: Unify UNDEF injection helpers
KVM: arm64: Make most GICv3 accesses UNDEF if they trap
KVM: arm64: Honor guest requested traps in GICv3 emulation
KVM: arm64: Add trap routing information for ICH_HCR_EL2
KVM: arm64: Add ICH_HCR_EL2 to the vcpu state
KVM: arm64: Zero ID_AA64PFR0_EL1.GIC when no GICv3 is presented to the guest
KVM: arm64: Add helper for last ditch idreg adjustments
KVM: arm64: Force GICv3 trap activation when no irqchip is configured on VHE
KVM: arm64: Force SRE traps when SRE access is not enabled
KVM: arm64: Move GICv3 trap configuration to kvm_calculate_traps()
Signed-off-by: Marc Zyngier <[email protected]>
|
|
* kvm-arm64/fpmr:
: .
: Add FP8 support to the KVM/arm64 floating point handling.
:
: This includes new ID registers (ID_AA64PFR2_EL1 ID_AA64FPFR0_EL1)
: being made visible to guests, as well as a new confrol register
: (FPMR) which gets context-switched.
: .
KVM: arm64: Expose ID_AA64PFR2_EL1 to userspace and guests
KVM: arm64: Enable FP8 support when available and configured
KVM: arm64: Expose ID_AA64FPFR0_EL1 as a writable ID reg
KVM: arm64: Honor trap routing for FPMR
KVM: arm64: Add save/restore support for FPMR
KVM: arm64: Move FPMR into the sysreg array
KVM: arm64: Add predicate for FPMR support in a VM
KVM: arm64: Move SVCR into the sysreg array
Signed-off-by: Marc Zyngier <[email protected]>
|
|
* kvm-arm64/mmu-misc-6.12:
: .
: Various minor MMU improvements and bug-fixes:
:
: - Prevent MTE tags being restored by userspace if we are actively
: logging writes, as that's a recipe for disaster
:
: - Correct the refcount on a page that is not considered for MTE
: tag copying (such as a device)
:
: - When walking a page table to split blocks, keep the DSB at the end
: the walk, as there is no need to perform it on every store.
:
: - Fix boundary check when transfering memory using FFA
: .
KVM: arm64: Add memory length checks and remove inline in do_ffa_mem_xfer
KVM: arm64: Disallow copying MTE to guest memory while KVM is dirty logging
KVM: arm64: Release pfn, i.e. put page, if copying MTE tags hits ZONE_DEVICE
KVM: arm64: Move data barrier to end of split walk
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When we share memory through FF-A and the description of the buffers
exceeds the size of the mapped buffer, the fragmentation API is used.
The fragmentation API allows specifying chunks of descriptors in subsequent
FF-A fragment calls and no upper limit has been established for this.
The entire memory region transferred is identified by a handle which can be
used to reclaim the transferred memory.
To be able to reclaim the memory, the description of the buffers has to fit
in the ffa_desc_buf.
Add a bounds check on the FF-A sharing path to prevent the memory reclaim
from failing.
Also do_ffa_mem_xfer() does not need __always_inline, except for the
BUILD_BUG_ON() aspect, which gets moved to a macro.
[maz: fixed the BUILD_BUG_ON() breakage with LLVM, thanks to Wei-Lin Chang
for the timely report]
Fixes: 634d90cf0ac65 ("KVM: arm64: Handle FFA_MEM_LEND calls from the host")
Cc: [email protected]
Reviewed-by: Sebastian Ene <[email protected]>
Signed-off-by: Snehal Koukuntla <[email protected]>
Reviewed-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
In preparation for using the stage-2 definitions in ptdump, move some of
these macros in the common header.
Signed-off-by: Sebastian Ene <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
FEAT_ATS1E1A introduces a new instruction: `at s1e1a`.
This is an address translation, without permission checks.
POE allows read permissions to be removed from S1 by the guest. This means
that an `at` instruction could fail, and not get the IPA.
Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1
permissions.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
Define the new system registers that POE introduces and context switch them.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
|
|
To allow using newer instructions that current assemblers don't know about,
replace the `at` instruction with the underlying SYS instruction.
Signed-off-by: Joey Gouly <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
|
|
We don't expect to trap any GICv3 register for host handling,
apart from ICC_SRE_EL1 and the SGI registers. If they trap,
that's because the guest is playing with us despite being
told it doesn't have a GICv3.
If it does, UNDEF is what it will get.
Reviewed-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
On platforms that require emulation of the CPU interface, we still
need to honor the traps requested by the guest (ICH_HCR_EL2 as well
as the FGTs for ICC_IGRPEN{0,1}_EL1.
Check for these bits early and lail out if any trap applies.
Reviewed-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
We so far only write the ICH_HCR_EL2 config in two situations:
- when we need to emulate the GICv3 CPU interface due to HW bugs
- when we do direct injection, as the virtual CPU interface needs
to be enabled
This is all good. But it also means that we don't do anything special
when we emulate a GICv2, or that there is no GIC at all.
What happens in this case when the guest uses the GICv3 system
registers? The *guest* gets a trap for a sysreg access (EC=0x18)
while we'd really like it to get an UNDEF.
Fixing this is a bit involved:
- we need to set all the required trap bits (TC, TALL0, TALL1, TDIR)
- for these traps to take effect, we need to (counter-intuitively)
set ICC_SRE_EL1.SRE to 1 so that the above traps take priority.
Note that doesn't fully work when GICv2 emulation is enabled, as
we cannot set ICC_SRE_EL1.SRE to 1 (it breaks Group0 delivery as
IRQ).
Reviewed-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
* kvm-arm64/tlbi-fixes-6.12:
: .
: A couple of TLB invalidation fixes, only affecting pKVM
: out of tree, courtesy of Will Deacon.
: .
KVM: arm64: Ensure TLBI uses correct VMID after changing context
KVM: arm64: Invalidate EL1&0 TLB entries for all VMIDs in nvhe hyp init
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Just like the rest of the FP/SIMD state, FPMR needs to be context
switched.
The only interesting thing here is that we need to treat the pKVM
part a bit differently, as the host FP state is never written back
to the vcpu thread, but instead stored locally and eagerly restored.
Reviewed-by: Mark Brown <[email protected]>
Tested-by: Mark Brown <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When the target context passed to enter_vmid_context() matches the
current running context, the function returns early without manipulating
the registers of the stage-2 MMU. This can result in a stale VMID due to
the lack of an ISB instruction in exit_vmid_context() after writing the
VTTBR when ARM64_WORKAROUND_SPECULATIVE_AT is not enabled.
For example, with pKVM enabled:
// Initially running in host context
enter_vmid_context(guest);
-> __load_stage2(guest); isb // Writes VTCR & VTTBR
exit_vmid_context(guest);
-> __load_stage2(host); // Restores VTCR & VTTBR
enter_vmid_context(host);
-> Returns early as we're already in host context
tlbi vmalls12e1is // !!! Can use the stale VMID as we
// haven't performed context
// synchronisation since restoring
// VTTBR.VMID
Add an unconditional ISB instruction to exit_vmid_context() after
restoring the VTTBR. This already existed for the
ARM64_WORKAROUND_SPECULATIVE_AT path, so we can simply hoist that onto
the common path.
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Fuad Tabba <[email protected]>
Fixes: 58f3b0fc3b87 ("KVM: arm64: Support TLB invalidation in guest context")
Signed-off-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
When initialising the nVHE hypervisor, we invalidate potentially stale
TLB entries for the EL1&0 regime using a 'vmalls12e1' invalidation.
However, this invalidation operation applies only to the active VMID
and therefore we could proceed with stale TLB entries for other VMIDs.
Replace the operation with an 'alle1' which applies to all entries for
the EL1&0 regime, regardless of the VMID.
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Fixes: 1025c8c0c6ac ("KVM: arm64: Wrap the host with a stage 2")
Signed-off-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
This DSB guarantees page table updates have been made visible to the
hardware table walker. Moving the DSB from stage2_split_walker() to
after the walk is finished in kvm_pgtable_stage2_split() results in a
roughly 70% reduction in Clear Dirty Log Time in
dirty_log_perf_test (modified to use eager page splitting) when using
huge pages. This gain holds steady through a range of vcpus
used (tested 1-64) and memory used (tested 1-64GB).
This is safe to do because nothing else is using the page tables while
they are still being mapped and this is how other page table walkers
already function. None of them have a data barrier in the walker
itself because relative ordering of table PTEs to table contents comes
from the release semantics of stage2_make_pte().
Signed-off-by: Colton Lewis <[email protected]>
Acked-by: Oliver Upton <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Marc Zyngier <[email protected]>
|
|
Tidy up some of the PAuth trapping code to clear up some comments
and avoid clang/checkpatch warnings. Also, don't bother setting
PAuth HCR_EL2 bits in pKVM, since it's handled by the hypervisor.
Signed-off-by: Fuad Tabba <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Add -Wno-override-init to the build flags for sys_regs.c,
handle_exit.c, and switch.c to fix warnings like the following:
arch/arm64/kvm/hyp/vhe/switch.c:271:43: warning: initialized field overwritten [-Woverride-init]
271 | [ESR_ELx_EC_CP15_32] = kvm_hyp_handle_cp15_32,
|
Signed-off-by: Sebastian Ott <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Pull kvm updates from Paolo Bonzini:
"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under
KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch
RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available
s390:
- Assortment of tiny fixes which are not time critical
x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g.
EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant
tracepoint
- Clean up KVM's handling of vendor specific emulation to
consistently act on "compatible with Intel/AMD", versus checking
for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on
CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
it reads '0' and writes from userspace are ignored
- Misc cleanups
x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming
Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
that can't hold leafs SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables
for eager page splitting, to avoid stalling vCPUs when splitting
huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE
that is non-present or not-huge. KVM is guaranteed to end up in a
broken state because the callers fully expect a valid SPTE, it's
all but dangerous to let more MMU changes happen afterwards
x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into
an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a
new KVM_X86_SNP_VM type, encrypting/measure the initial image into
guest memory, and finalizing it before launching it. Internally,
there are some gmem/mmu hooks needed to prepare gmem-allocated
pages before mapping them into guest private memory ranges
This includes basic support for attestation guest requests, enough
to say that KVM supports the GHCB 2.0 specification
There is no support yet for loading into the firmware those signing
keys to be used for attestation requests, and therefore no need yet
for the host to provide certificate data for those keys.
To support fetching certificate data from userspace, a new KVM exit
type will be needed to handle fetching the certificate from
userspace.
An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
exit type to handle this was introduced in v1 of this patchset, but
is still being discussed by community, so for now this patchset
only implements a stub version of SNP Extended Guest Requests that
does not provide certificate data
x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested
pending posted interrupts as valid wake eents for a vCPU executing
HLT in L2 (with HLT-exiting disable by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch
emulation
Explicitly suppress userspace emulated MMIO exits that are
triggered when emulating a task switch as KVM doesn't support
userspace MMIO during complex (multi-step) emulation
Silently ignoring the exit request can result in the
WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
for some other reason prior to purging mmio_needed
See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write
exits if emulator detects exception") for more details on KVM's
limitations with respect to emulated MMIO during complex emulator
flows
Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to
AS_INACCESSIBLE, because the special casing needed by these pages
is not due to just unmovability (and in fact they are only
unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is
useful to mitigate KVM page faults during guest boot or after live
migration. The code will also be used by TDX, but (probably) not
through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a
clear win
- Setup empty IRQ routing when creating a VM to avoid having to
synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
a flag that arch code can use for hooking both sched_in() and
sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
truncating a bogus value from userspace, e.g. to help userspace
detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in
the KVM_RUN loop, e.g. to avoid marking it preempted and thus
writing guest memory when retrieving guest state during live
migration blackout
Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family
17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid
spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache
misses by doing CLFLUSH{OPT} in every loop iteration"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
crypto: ccp: Add the SNP_VLEK_LOAD command
KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
KVM: x86: Replace static_call_cond() with static_call()
KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
x86/sev: Move sev_guest.h into common SEV header
KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
KVM: x86: Suppress MMIO that is triggered during task switch emulation
KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
KVM: Document KVM_PRE_FAULT_MEMORY ioctl
mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
perf kvm: Add kvm-stat for loongarch64
LoongArch: KVM: Add PV steal time support in guest side
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 changes for 6.11
- Initial infrastructure for shadow stage-2 MMUs, as part of nested
virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling
(in part) migration of VMs between heterogenous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
the protocol
- FPSIMD/SVE support for nested, including merged trap configuration
and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
"The biggest part is the virtual CPU hotplug that touches ACPI,
irqchip. We also have some GICv3 optimisation for pseudo-NMIs that has
been queued via the arm64 tree. Otherwise the usual perf updates,
kselftest, various small cleanups.
Core:
- Virtual CPU hotplug support for arm64 ACPI systems
- cpufeature infrastructure cleanups and making the FEAT_ECBHB ID
bits visible to guests
- CPU errata: expand the speculative SSBS workaround to more CPUs
- GICv3, use compile-time PMR values: optimise the way regular IRQs
are masked/unmasked when GICv3 pseudo-NMIs are used, removing the
need for a static key in fast paths by using a priority value
chosen dynamically at boot time
ACPI:
- 'acpi=nospcr' option to disable SPCR as default console for arm64
- Move some ACPI code (cpuidle, FFH) to drivers/acpi/arm64/
Perf updates:
- Rework of the IMX PMU driver to enable support for I.MX95
- Enable support for tertiary match groups in the CMN PMU driver
- Initial refactoring of the CPU PMU code to prepare for the fixed
instruction counter introduced by Arm v9.4
- Add missing PMU driver MODULE_DESCRIPTION() strings
- Hook up DT compatibles for recent CPU PMUs
Kselftest updates:
- Kernel mode NEON fp-stress
- Cleanups, spelling mistakes
Miscellaneous:
- arm64 Documentation update with a minor clarification on TBI
- Fix missing IPI statistics
- Implement raw_smp_processor_id() using thread_info rather than a
per-CPU variable (better code generation)
- Make MTE checking of in-kernel asynchronous tag faults conditional
on KASAN being enabled
- Minor cleanups, typos"
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (69 commits)
selftests: arm64: tags: remove the result script
selftests: arm64: tags_test: conform test to TAP output
perf: add missing MODULE_DESCRIPTION() macros
arm64: smp: Fix missing IPI statistics
irqchip/gic-v3: Fix 'broken_rdists' unused warning when !SMP and !ACPI
ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
Documentation: arm64: Update memory.rst for TBI
arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
perf: arm_v6/7_pmu: Drop non-DT probe support
perf/arm: Move 32-bit PMU drivers to drivers/perf/
perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
arm64: Kconfig: Fix dependencies to enable ACPI_HOTPLUG_CPU
perf: imx_perf: add support for i.MX95 platform
perf: imx_perf: fix counter start and config sequence
perf: imx_perf: refactor driver for imx93
perf: imx_perf: let the driver manage the counter usage rather the user
perf: imx_perf: add macro definitions for parsing config attr
...
|
|
* kvm-arm64/nv-tcr2:
: Fixes to the handling of TCR_EL1, courtesy of Marc Zyngier
:
: Series addresses a couple gaps that are present in KVM (from cover
: letter):
:
: - VM configuration: HCRX_EL2.TCR2En is forced to 1, and we blindly
: save/restore stuff.
:
: - trap bit description and routing: none, obviously, since we make a
: point in not trapping.
KVM: arm64: Honor trap routing for TCR2_EL1
KVM: arm64: Make PIR{,E0}_EL1 save/restore conditional on FEAT_TCRX
KVM: arm64: Make TCR2_EL1 save/restore dependent on the VM features
KVM: arm64: Get rid of HCRX_GUEST_FLAGS
KVM: arm64: Correctly honor the presence of FEAT_TCRX
Signed-off-by: Oliver Upton <[email protected]>
|
|
* kvm-arm64/nv-sve:
: CPTR_EL2, FPSIMD/SVE support for nested
:
: This series brings support for honoring the guest hypervisor's CPTR_EL2
: trap configuration when running a nested guest, along with support for
: FPSIMD/SVE usage at L1 and L2.
KVM: arm64: Allow the use of SVE+NV
KVM: arm64: nv: Add additional trap setup for CPTR_EL2
KVM: arm64: nv: Add trap description for CPTR_EL2
KVM: arm64: nv: Add TCPAC/TTA to CPTR->CPACR conversion helper
KVM: arm64: nv: Honor guest hypervisor's FP/SVE traps in CPTR_EL2
KVM: arm64: nv: Load guest FP state for ZCR_EL2 trap
KVM: arm64: nv: Handle CPACR_EL1 traps
KVM: arm64: Spin off helper for programming CPTR traps
KVM: arm64: nv: Ensure correct VL is loaded before saving SVE state
KVM: arm64: nv: Use guest hypervisor's max VL when running nested guest
KVM: arm64: nv: Save guest's ZCR_EL2 when in hyp context
KVM: arm64: nv: Load guest hyp's ZCR into EL1 state
KVM: arm64: nv: Handle ZCR_EL2 traps
KVM: arm64: nv: Forward SVE traps to guest hypervisor
KVM: arm64: nv: Forward FP/ASIMD traps to guest hypervisor
Signed-off-by: Oliver Upton <[email protected]>
|
|
* kvm-arm64/el2-kcfi:
: kCFI support in the EL2 hypervisor, courtesy of Pierre-Clément Tosi
:
: Enable the usage fo CONFIG_CFI_CLANG (kCFI) for hardening indirect
: branches in the EL2 hypervisor. Unlike kernel support for the feature,
: CFI failures at EL2 are always fatal.
KVM: arm64: nVHE: Support CONFIG_CFI_CLANG at EL2
KVM: arm64: Introduce print_nvhe_hyp_panic helper
arm64: Introduce esr_brk_comment, esr_is_cfi_brk
KVM: arm64: VHE: Mark __hyp_call_panic __noreturn
KVM: arm64: nVHE: gen-hyprel: Skip R_AARCH64_ABS32
KVM: arm64: nVHE: Simplify invalid_host_el2_vect
KVM: arm64: Fix __pkvm_init_switch_pgd call ABI
KVM: arm64: Fix clobbered ELR in sync abort/SError
Signed-off-by: Oliver Upton <[email protected]>
|
|
* kvm-arm64/shadow-mmu:
: Shadow stage-2 MMU support for NV, courtesy of Marc Zyngier
:
: Initial implementation of shadow stage-2 page tables to support a guest
: hypervisor. In the author's words:
:
: So here's the 10000m (approximately 30000ft for those of you stuck
: with the wrong units) view of what this is doing:
:
: - for each {VMID,VTTBR,VTCR} tuple the guest uses, we use a
: separate shadow s2_mmu context. This context has its own "real"
: VMID and a set of page tables that are the combination of the
: guest's S2 and the host S2, built dynamically one fault at a time.
:
: - these shadow S2 contexts are ephemeral, and behave exactly as
: TLBs. For all intent and purposes, they *are* TLBs, and we discard
: them pretty often.
:
: - TLB invalidation takes three possible paths:
:
: * either this is an EL2 S1 invalidation, and we directly emulate
: it as early as possible
:
: * or this is an EL1 S1 invalidation, and we need to apply it to
: the shadow S2s (plural!) that match the VMID set by the L1 guest
:
: * or finally, this is affecting S2, and we need to teardown the
: corresponding part of the shadow S2s, which invalidates the TLBs
KVM: arm64: nv: Truely enable nXS TLBI operations
KVM: arm64: nv: Add handling of NXS-flavoured TLBI operations
KVM: arm64: nv: Add handling of range-based TLBI operations
KVM: arm64: nv: Add handling of outer-shareable TLBI operations
KVM: arm64: nv: Invalidate TLBs based on shadow S2 TTL-like information
KVM: arm64: nv: Tag shadow S2 entries with guest's leaf S2 level
KVM: arm64: nv: Handle FEAT_TTL hinted TLB operations
KVM: arm64: nv: Handle TLBI IPAS2E1{,IS} operations
KVM: arm64: nv: Handle TLBI ALLE1{,IS} operations
KVM: arm64: nv: Handle TLBI VMALLS12E1{,IS} operations
KVM: arm64: nv: Handle TLB invalidation targeting L2 stage-1
KVM: arm64: nv: Handle EL2 Stage-1 TLB invalidation
KVM: arm64: nv: Add Stage-1 EL2 invalidation primitives
KVM: arm64: nv: Unmap/flush shadow stage 2 page tables
KVM: arm64: nv: Handle shadow stage 2 page faults
KVM: arm64: nv: Implement nested Stage-2 page table walk logic
KVM: arm64: nv: Support multiple nested Stage-2 mmu structures
Signed-off-by: Oliver Upton <[email protected]>
|
|
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and
ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields
from ID_AA64PFR0_EL1 sysreg definition.
Cc: Marc Zyngier <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Anshuman Khandual <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
|
|
As per the architecture, if FEAT_S1PIE is implemented, then FEAT_TCRX
must be implemented as well.
Take advantage of this to avoid checking for S1PIE when TCRX isn't
implemented.
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
As for other registers, save/restore of TCR2_EL1 should be gated
on the feature being actually present.
In the case of a nVHE hypervisor, it is perfectly fine to leave
the host value in the register, as HCRX_EL2.TCREn==0 imposes that
TCR2_EL1 is treated as 0.
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
We need to teach KVM a couple of new tricks. CPTR_EL2 and its
VHE accessor CPACR_EL1 need to be handled specially:
- CPACR_EL1 is trapped on VHE so that we can track the TCPAC
and TTA bits
- CPTR_EL2.{TCPAC,E0POE} are propagated from L1 to L2
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Start folding the guest hypervisor's FP/SVE traps into the value
programmed in hardware. Note that as of writing this is dead code, since
KVM does a full put() / load() for every nested exception boundary which
saves + flushes the FP/SVE state.
However, this will become useful when we can keep the guest's FP/SVE
state alive across a nested exception boundary and the host no longer
needs to conservatively program traps.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Round out the ZCR_EL2 gymnastics by loading SVE state in the fast path
when the guest hypervisor tries to access SVE state.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Handle CPACR_EL1 accesses when running a VHE guest. In order to
limit the cost of the emulation, implement it ass a shallow exit.
In the other cases:
- this is a nVHE L1 which will write to memory, and we don't trap
- this is a L2 guest:
* the L1 has CPTR_EL2.TCPAC==0, and the L2 has direct register
access
* the L1 has CPTR_EL2.TCPAC==1, and the L2 will trap, but the
handling is defered to the general handling for forwarding
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
A subsequent change to KVM will add preliminary support for merging a
guest hypervisor's CPTR traps with that of KVM. Prepare by spinning off
a new helper for managing CPTR traps.
Avoid reading CPACR_EL1 for the baseline trap config, and start off with
the most restrictive set of traps that is subsequently relaxed.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
The max VL for nested guests is additionally constrained by the max VL
selected by the guest hypervisor. Use that instead of KVM's max VL when
running a nested guest.
Note that the guest hypervisor's ZCR_EL2 is sanitised against the VM's
max VL at the time of access, so there's no additional handling required
at the time of use.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Load the guest hypervisor's ZCR_EL2 into the corresponding EL1 register
when restoring SVE state, as ZCR_EL2 affects the VL in the hypervisor
context.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Similar to FPSIMD traps, don't load SVE state if the guest hypervisor
has SVE traps enabled and forward the trap instead. Note that ZCR_EL2
will require some special handling, as it takes a sysreg trap to EL2
when HCR_EL2.NV = 1.
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Give precedence to the guest hypervisor's trap configuration when
routing an FP/ASIMD trap taken to EL2. Take advantage of the
infrastructure for translating CPTR_EL2 into the VHE (i.e. EL1) format
and base the trap decision solely on the VHE view of the register. The
in-memory value of CPTR_EL2 will always be up to date for the guest
hypervisor (more on that later), so just read it directly from memory.
Bury all of this behind a macro keyed off of the CPTR bitfield in
anticipation of supporting other traps (e.g. SVE).
[maz: account for HCR_EL2.E2H when testing for TFP/FPEN, with
all the hard work actually being done by Chase Conklin]
[ oliver: translate nVHE->VHE format for testing traps; macro for reuse
in other CPTR_EL2.xEN fields ]
Signed-off-by: Jintack Lim <[email protected]>
Signed-off-by: Christoffer Dall <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Reviewed-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
The compiler implements kCFI by adding type information (u32) above
every function that might be indirectly called and, whenever a function
pointer is called, injects a read-and-compare of that u32 against the
value corresponding to the expected type. In case of a mismatch, a BRK
instruction gets executed. When the hypervisor triggers such an
exception in nVHE, it panics and triggers and exception return to EL1.
Therefore, teach nvhe_hyp_panic_handler() to detect kCFI errors from the
ESR and report them. If necessary, remind the user that EL2 kCFI is not
affected by CONFIG_CFI_PERMISSIVE.
Pass $(CC_FLAGS_CFI) to the compiler when building the nVHE hyp code.
Use SYM_TYPED_FUNC_START() for __pkvm_init_switch_pgd, as nVHE can't
call it directly and must use a PA function pointer from C (because it
is part of the idmap page), which would trigger a kCFI failure if the
type ID wasn't present.
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Given that the sole purpose of __hyp_call_panic() is to call panic(), a
__noreturn function, give it the __noreturn attribute, removing the need
for its caller to use unreachable().
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Ignore R_AARCH64_ABS32 relocations, instead of panicking, when emitting
the relocation table of the hypervisor. The toolchain might produce them
when generating function calls with kCFI to represent the 32-bit type ID
which can then be resolved across compilation units at link time. These
are NOT actual 32-bit addresses and are therefore not needed in the
final (runtime) relocation table (which is unlikely to use 32-bit
absolute addresses for arm64 anyway).
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
The invalid_host_el2_vect macro is used by EL2{t,h} handlers in nVHE
*host* context, which should never run with a guest context loaded.
Therefore, remove the superfluous vCPU context check and branch
unconditionally to hyp_panic.
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
Fix the mismatch between the (incorrect) C signature, C call site, and
asm implementation by aligning all three on an API passing the
parameters (pgd and SP) separately, instead of as a bundled struct.
Remove the now unnecessary memory accesses while the MMU is off from the
asm, which simplifies the C caller (as it does not need to convert a VA
struct pointer to PA) and makes the code slightly more robust by
offsetting the struct fields from C and properly expressing the call to
the C compiler (e.g. type checker and kCFI).
Fixes: f320bc742bc2 ("KVM: arm64: Prepare the creation of s1 mappings at EL2")
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|
|
When the hypervisor receives a SError or synchronous exception (EL2h)
while running with the __kvm_hyp_vector and if ELR_EL2 doesn't point to
an extable entry, it panics indirectly by overwriting ELR with the
address of a panic handler in order for the asm routine it returns to to
ERET into the handler.
However, this clobbers ELR_EL2 for the handler itself. As a result,
hyp_panic(), when retrieving what it believes to be the PC where the
exception happened, actually ends up reading the address of the panic
handler that called it! This results in an erroneous and confusing panic
message where the source of any synchronous exception (e.g. BUG() or
kCFI) appears to be __guest_exit_panic, making it hard to locate the
actual BRK instruction.
Therefore, store the original ELR_EL2 in the per-CPU kvm_hyp_ctxt and
point the sysreg to a routine that first restores it to its previous
value before running __guest_exit_panic.
Fixes: 7db21530479f ("KVM: arm64: Restore hyp when panicking in guest context")
Signed-off-by: Pierre-Clément Tosi <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Oliver Upton <[email protected]>
|