aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-12-09KVM: x86: ignore SIPIs that are received while not in wait-for-sipi stateMaxim Levitsky1-7/+8
In the commit 1c96dcceaeb3 ("KVM: x86: fix apic_accept_events vs check_nested_events"), we accidently started latching SIPIs that are received while the cpu is not waiting for them. This causes vCPUs to never enter a halted state. Fixes: 1c96dcceaeb3 ("KVM: x86: fix apic_accept_events vs check_nested_events") Signed-off-by: Maxim Levitsky <[email protected]> Message-Id: <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-12-09Merge remote-tracking branch 'origin/kvm-arm64/psci-relay' into ↵Marc Zyngier65-823/+1711
kvmarm-master/next Signed-off-by: Marc Zyngier <[email protected]>
2020-12-08KVM: arm64: Fix nVHE boot on VHE systemsMarc Zyngier1-1/+4
Conflict resolution gone astray results in the kernel not booting on VHE-capable HW when VHE support is disabled. Thankfully spotted by David. Reported-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]>
2020-12-04Merge remote-tracking branch 'origin/kvm-arm64/misc-5.11' into ↵Marc Zyngier8-11/+50
kvmarm-master/queue Signed-off-by: Marc Zyngier <[email protected]>
2020-12-04KVM: arm64: Fix EL2 mode availability checksDavid Brazdil2-3/+24
With protected nVHE hyp code interception host's PSCI SMCs, the host starts seeing new CPUs boot in EL1 instead of EL2. The kernel logic that keeps track of the boot mode needs to be adjusted. Add a static key enabled if KVM protected mode initialization is successful. When the key is enabled, is_hyp_mode_available continues to report `true` because its users either treat it as a check whether KVM will be / was initialized, or whether stub HVCs can be made (eg. hibernate). is_hyp_mode_mismatched is changed to report `false` when the key is enabled. That's because all cores' modes matched at the point of KVM init and KVM will not allow cores not present at init to boot. That said, the function is never used after KVM is initialized. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Trap host SMCs in protected modeDavid Brazdil3-1/+15
While protected KVM is installed, start trapping all host SMCs. For now these are simply forwarded to EL3, except PSCI CPU_ON/CPU_SUSPEND/SYSTEM_SUSPEND which are intercepted and the hypervisor installed on newly booted cores. Create new constant HCR_HOST_NVHE_PROTECTED_FLAGS with the new set of HCR flags to use while the nVHE vector is installed when the kernel was booted with the protected flag enabled. Switch back to the default HCR flags when switching back to the stub vector. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Keep nVHE EL2 vector installedDavid Brazdil1-4/+8
KVM by default keeps the stub vector installed and installs the nVHE vector only briefly for init and later on demand. Change this policy to install the vector at init and then never uninstall it if the kernel was given the protected KVM command line parameter. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Intercept host's SYSTEM_SUSPEND PSCI SMCsDavid Brazdil2-1/+27
Add a handler of SYSTEM_SUSPEND host PSCI SMCs. The semantics are equivalent to CPU_SUSPEND, typically called on the last online CPU. Reuse the same entry point and boot args struct as CPU_SUSPEND. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Intercept host's CPU_SUSPEND PSCI SMCsDavid Brazdil2-2/+52
Add a handler of CPU_SUSPEND host PSCI SMCs. The SMC can either enter a sleep state indistinguishable from a WFI or a deeper sleep state that behaves like a CPU_OFF+CPU_ON except that the core is still considered online while asleep. The handler saves r0,pc of the host and makes the same call to EL3 with the hyp CPU entry point. It either returns back to the handler and then back to the host, or wakes up into the entry point and initializes EL2 state before dropping back to EL1. No EL2 state needs to be saved/restored for this purpose. CPU_ON and CPU_SUSPEND are both implemented using struct psci_boot_args to store the state upon powerup, with each CPU having separate structs for CPU_ON and CPU_SUSPEND so that CPU_SUSPEND can operate locklessly and so that a CPU_ON call targeting a CPU cannot interfere with a concurrent CPU_SUSPEND call on that CPU. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Intercept host's CPU_ON SMCsDavid Brazdil2-0/+163
Add a handler of the CPU_ON PSCI call from host. When invoked, it looks up the logical CPU ID corresponding to the provided MPIDR and populates the state struct of the target CPU with the provided x0, pc. It then calls CPU_ON itself, with an entry point in hyp that initializes EL2 state before returning ERET to the provided PC in EL1. There is a simple atomic lock around the boot args struct. If it is already locked, CPU_ON will return PENDING_ON error code. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add function to enter host from KVM nVHE hyp codeDavid Brazdil1-0/+9
All nVHE hyp code is currently executed as handlers of host's HVCs. This will change as nVHE starts intercepting host's PSCI CPU_ON SMCs. The newly booted CPU will need to initialize EL2 state and then enter the host. Add __host_enter function that branches into the existing host state-restoring code after the trap handler would have returned. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Extract __do_hyp_init into a helper functionDavid Brazdil1-15/+32
In preparation for adding a CPU entry point in nVHE hyp code, extract most of __do_hyp_init hypervisor initialization code into a common helper function. This will be invoked by the entry point to install KVM on the newly booted CPU. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Forward safe PSCI SMCs coming from hostDavid Brazdil1-1/+41
Forward the following PSCI SMCs issued by host to EL3 as they do not require the hypervisor's intervention. This assumes that EL3 correctly implements the PSCI specification. Only function IDs implemented in Linux are included. Where both 32-bit and 64-bit variants exist, it is assumed that the host will always use the 64-bit variant. * SMCs that only return information about the system * PSCI_VERSION - PSCI version implemented by EL3 * PSCI_FEATURES - optional features supported by EL3 * AFFINITY_INFO - power state of core/cluster * MIGRATE_INFO_TYPE - whether Trusted OS can be migrated * MIGRATE_INFO_UP_CPU - resident core of Trusted OS * operations which do not affect the hypervisor * MIGRATE - migrate Trusted OS to a different core * SET_SUSPEND_MODE - toggle OS-initiated mode * system shutdown/reset * SYSTEM_OFF * SYSTEM_RESET * SYSTEM_RESET2 Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add offset for hyp VA <-> PA conversionDavid Brazdil2-3/+30
Add a host-initialized constant to KVM nVHE hyp code for converting between EL2 linear map virtual addresses and physical addresses. Also add `__hyp_pa` macro that performs the conversion. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Bootstrap PSCI SMC handler in nVHE EL2David Brazdil6-6/+151
Add a handler of PSCI SMCs in nVHE hyp code. The handler is initialized with the version used by the host's PSCI driver and the function IDs it was configured with. If the SMC function ID matches one of the configured PSCI calls (for v0.1) or falls into the PSCI function ID range (for v0.2+), the SMC is handled by the PSCI handler. For now, all SMCs return PSCI_RET_NOT_SUPPORTED. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add SMC handler in nVHE EL2David Brazdil2-3/+70
Add handler of host SMCs in KVM nVHE trap handler. Forward all SMCs to EL3 and propagate the result back to EL1. This is done in preparation for validating host SMCs in KVM protected mode. The implementation assumes that firmware uses SMCCC v1.2 or older. That means x0-x17 can be used both for arguments and results, other GPRs are preserved. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Create nVHE copy of cpu_logical_mapDavid Brazdil2-0/+35
When KVM starts validating host's PSCI requests, it will need to map MPIDR back to the CPU ID. To this end, copy cpu_logical_map into nVHE hyp memory when KVM is initialized. Only copy the information for CPUs that are online at the point of KVM initialization so that KVM rejects CPUs whose features were not checked against the finalized capabilities. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Support per_cpu_ptr in nVHE hyp codeDavid Brazdil4-1/+35
When compiling with __KVM_NVHE_HYPERVISOR__, redefine per_cpu_offset() to __hyp_per_cpu_offset() which looks up the base of the nVHE per-CPU region of the given cpu and computes its offset from the .hyp.data..percpu section. This enables use of per_cpu_ptr() helpers in nVHE hyp code. Until now only this_cpu_ptr() was supported by setting TPIDR_EL2. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add .hyp.data..ro_after_init ELF sectionDavid Brazdil4-0/+20
Add rules for renaming the .data..ro_after_init ELF section in KVM nVHE object files to .hyp.data..ro_after_init, linking it into the kernel and mapping it in hyp at runtime. The section is RW to the host, then mapped RO in hyp. The expectation is that the host populates the variables in the section and they are never changed by hyp afterwards. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Init MAIR/TCR_EL2 from params structDavid Brazdil5-33/+34
MAIR_EL2 and TCR_EL2 are currently initialized from their _EL1 values. This will not work once KVM starts intercepting PSCI ON/SUSPEND SMCs and initializing EL2 state before EL1 state. Obtain the EL1 values during KVM init and store them in the init params struct. The struct will stay in memory and can be used when booting new cores. Take the opportunity to move copying the T0SZ value from idmap_t0sz in KVM init rather than in .hyp.idmap.text. This avoids the need for the idmap_t0sz symbol alias. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Move hyp-init params to a per-CPU structDavid Brazdil6-20/+32
Once we start initializing KVM on newly booted cores before the rest of the kernel, parameters to __do_hyp_init will need to be provided by EL2 rather than EL1. At that point it will not be possible to pass its three arguments directly because PSCI_CPU_ON only supports one context argument. Refactor __do_hyp_init to accept its parameters in a struct. This prepares the code for KVM booting cores as well as removes any limits on the number of __do_hyp_init arguments. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Remove vector_ptr param of hyp-initDavid Brazdil4-8/+31
KVM precomputes the hyp VA of __kvm_hyp_host_vector, essentially a constant (minus ASLR), before passing it to __kvm_hyp_init. Now that we have alternatives for converting kimg VA to hyp VA, replace this with computing the constant inside __kvm_hyp_init, thus removing the need for an argument. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04arm64: Extract parts of el2_setup into a macroDavid Brazdil2-120/+199
When a CPU is booted in EL2, the kernel checks for VHE support and initializes the CPU core accordingly. For nVHE it also installs the stub vectors and drops down to EL1. Once KVM gains the ability to boot cores without going through the kernel entry point, it will need to initialize the CPU the same way. Extract the relevant bits of el2_setup into an init_el2_state macro with an argument specifying whether to initialize for VHE or nVHE. The following ifdefs are removed: * CONFIG_ARM_GIC_V3 - always selected on arm64 * CONFIG_COMPAT - hstr_el2 can be set even without 32-bit support No functional change intended. Size of el2_setup increased by 148 bytes due to duplication. Signed-off-by: David Brazdil <[email protected]> [maz: reworked to fit the new PSTATE initial setup code] Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04arm64: Make cpu_logical_map() take unsigned intDavid Brazdil2-3/+3
CPU index should never be negative. Change the signature of (set_)cpu_logical_map to take an unsigned int. This still works even if the users treat the CPU index as an int, and will allow the hypervisor's implementation to check that the index is valid with a single upper-bound check. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04psci: Add accessor for psci_0_1_function_idsDavid Brazdil2-7/+14
Make it possible to retrieve a copy of the psci_0_1_function_ids struct. This is useful for KVM if it is configured to intercept host's PSCI SMCs. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04psci: Replace psci_function_id array with a structDavid Brazdil1-15/+14
Small refactor that replaces array of v0.1 function IDs indexed by an enum of function-name constants with a struct of function IDs "indexed" by field names. This is done in preparation for exposing the IDs to other parts of the kernel. Exposing a struct avoids the need for bounds checking. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04psci: Split functions to v0.1 and v0.2+ variantsDavid Brazdil1-34/+60
Refactor implementation of v0.1+ functions (CPU_SUSPEND, CPU_OFF, CPU_ON, MIGRATE) to have two functions psci_0_1_foo / psci_0_2_foo that select the function ID and call a common helper __psci_foo. This is a small cleanup so that the function ID array is only used for v0.1 configurations. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04psci: Support psci_ops.get_version for v0.1David Brazdil1-3/+10
KVM's host PSCI SMC filter needs to be aware of the PSCI version of the system but currently it is impossible to distinguish between v0.1 and PSCI disabled because both have get_version == NULL. Populate get_version for v0.1 with a function that returns a constant. psci_opt.get_version is currently unused so this has no effect on existing functionality. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add ARM64_KVM_PROTECTED_MODE CPU capabilityDavid Brazdil5-2/+41
Expose the boolean value whether the system is running with KVM in protected mode (nVHE + kernel param). CPU capability was selected over a global variable to allow use in alternatives. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04KVM: arm64: Add kvm-arm.mode early kernel parameterDavid Brazdil3-0/+35
Add an early parameter that allows users to select the mode of operation for KVM/arm64. For now, the only supported value is "protected". By passing this flag users opt into the hypervisor placing additional restrictions on the host kernel. These allow the hypervisor to spawn guests whose state is kept private from the host. Restrictions will include stage-2 address translation to prevent host from accessing guest memory, filtering its SMC calls, etc. Without this parameter, the default behaviour remains selecting VHE/nVHE based on hardware support and CONFIG_ARM64_VHE. Signed-off-by: David Brazdil <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-04Merge remote-tracking branch 'arm64/for-next/uaccess' into HEADMarc Zyngier39-579/+564
Signed-off-by: Marc Zyngier <[email protected]>
2020-12-03Merge remote-tracking branch 'origin/kvm-arm64/csv3' into kvmarm-master/queueMarc Zyngier5-8/+37
Signed-off-by: Marc Zyngier <[email protected]>
2020-12-03KVM: arm64: Use kvm_write_guest_lock when init stolen timeKeqian Zhu1-5/+1
There is a lock version kvm_write_guest. Use it to simplify code. Signed-off-by: Keqian Zhu <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Steven Price <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-03KVM: arm64: Some fixes of PV-time interface documentKeqian Zhu1-2/+2
Rename PV_FEATURES to PV_TIME_FEATURES. Signed-off-by: Keqian Zhu <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Reviewed-by: Andrew Jones <[email protected]> Reviewed-by: Steven Price <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2020-12-03KVM: x86: adjust SEV for commit 7e8e6eed75ePaolo Bonzini1-1/+1
Since the ASID is now stored in svm->asid, pre_sev_run should also place it there and not directly in the VMCB control area. Reported-by: Ashish Kalra <[email protected]> Signed-off-by: Paolo Bonzini <[email protected]>
2020-12-03arm64: mark __system_matches_cap as __maybe_unusedMark Rutland1-1/+1
Now that the PAN toggling has been removed, the only user of __system_matches_cap() is has_generic_auth(), which is only built when CONFIG_ARM64_PTR_AUTH is selected, and Qian reports that this results in a build-time warning when CONFIG_ARM64_PTR_AUTH is not selected: | arch/arm64/kernel/cpufeature.c:2649:13: warning: '__system_matches_cap' defined but not used [-Wunused-function] | static bool __system_matches_cap(unsigned int n) | ^~~~~~~~~~~~~~~~~~~~ It's tricky to restructure things to prevent this, so let's mark __system_matches_cap() as __maybe_unused, as we used to do for the other user of __system_matches_cap() which we just removed. Reported-by: Qian Cai <[email protected]> Suggested-by: Qian Cai <[email protected]> Signed-off-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: remove vestigal UAO supportMark Rutland3-33/+0
Now that arm64 no longer uses UAO, remove the vestigal feature detection code and Kconfig text. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: remove redundant PAN togglingMark Rutland3-58/+19
Some code (e.g. futex) needs to make privileged accesses to userspace memory, and uses uaccess_{enable,disable}_privileged() in order to permit this. All other uaccess primitives use LDTR/STTR, and never need to toggle PAN. Remove the redundant PAN toggling. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: remove addr_limit_user_check()Mark Rutland2-7/+2
Now that set_fs() is gone, addr_limit_user_check() is redundant. Remove the checks and associated thread flag. To ensure that _TIF_WORK_MASK can be used as an immediate value in an AND instruction (as it is in `ret_to_user`), TIF_MTE_ASYNC_FAULT is renumbered to keep the constituent bits of _TIF_WORK_MASK contiguous. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: remove set_fs()Mark Rutland13-92/+13
Now that the uaccess primitives dont take addr_limit into account, we have no need to manipulate this via set_fs() and get_fs(). Remove support for these, along with some infrastructure this renders redundant. We no longer need to flip UAO to access kernel memory under KERNEL_DS, and head.S unconditionally clears UAO for all kernel configurations via an ERET in init_kernel_el. Thus, we don't need to dynamically flip UAO, nor do we need to context-switch it. However, we still need to adjust PAN during SDEI entry. Masking of __user pointers no longer needs to use the dynamic value of addr_limit, and can use a constant derived from the maximum possible userspace task size. A new TASK_SIZE_MAX constant is introduced for this, which is also used by core code. In configurations supporting 52-bit VAs, this may include a region of unusable VA space above a 48-bit TTBR0 limit, but never includes any portion of TTBR1. Note that TASK_SIZE_MAX is an exclusive limit, while USER_DS and KERNEL_DS were inclusive limits, and is converted to a mask by subtracting one. As the SDEI entry code repurposes the otherwise unnecessary pt_regs::orig_addr_limit field to store the TTBR1 of the interrupted context, for now we rename that to pt_regs::sdei_ttbr1. In future we can consider factoring that out. Signed-off-by: Mark Rutland <[email protected]> Acked-by: James Morse <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess cleanup macro namingMark Rutland6-26/+26
Now the uaccess primitives use LDTR/STTR unconditionally, the uao_{ldp,stp,user_alternative} asm macros are misnamed, and have a redundant argument. Let's remove the redundant argument and rename these to user_{ldp,stp,ldst} respectively to clean this up. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Robin Murohy <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: split user/kernel routinesMark Rutland2-65/+47
This patch separates arm64's user and kernel memory access primitives into distinct routines, adding new __{get,put}_kernel_nofault() helpers to access kernel memory, upon which core code builds larger copy routines. The kernel access routines (using LDR/STR) are not affected by PAN (when legitimately accessing kernel memory), nor are they affected by UAO. Switching to KERNEL_DS may set UAO, but this does not adversely affect the kernel access routines. The user access routines (using LDTR/STTR) are not affected by PAN (when legitimately accessing user memory), but are affected by UAO. As these are only legitimate to use under USER_DS with UAO clear, this should not be problematic. Routines performing atomics to user memory (futex and deprecated instruction emulation) still need to transiently clear PAN, and these are left as-is. These are never used on kernel memory. Subsequent patches will refactor the uaccess helpers to remove redundant code, and will also remove the redundant PAN/UAO manipulation. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: refactor __{get,put}_userMark Rutland1-17/+27
As a step towards implementing __{get,put}_kernel_nofault(), this patch splits most user-memory specific logic out of __{get,put}_user(), with the memory access and fault handling in new __{raw_get,put}_mem() helpers. For now the LDR/LDTR patching is left within the *get_mem() helpers, and will be removed in a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: simplify __copy_user_flushcache()Mark Rutland1-3/+1
Currently __copy_user_flushcache() open-codes raw_copy_from_user(), and doesn't use uaccess_mask_ptr() on the user address. Let's have it call raw_copy_from_user(), which is both a simplification and ensures that user pointers are masked under speculation. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: uaccess: rename privileged uaccess routinesMark Rutland3-8/+8
We currently have many uaccess_*{enable,disable}*() variants, which subsequent patches will cut down as part of removing set_fs() and friends. Once this simplification is made, most uaccess routines will only need to ensure that the user page tables are mapped in TTBR0, as is currently dealt with by uaccess_ttbr0_{enable,disable}(). The existing uaccess_{enable,disable}() routines ensure that user page tables are mapped in TTBR0, and also disable PAN protections, which is necessary to be able to use atomics on user memory, but also permit unrelated privileged accesses to access user memory. As preparatory step, let's rename uaccess_{enable,disable}() to uaccess_{enable,disable}_privileged(), highlighting this caveat and discouraging wider misuse. Subsequent patches can reuse the uaccess_{enable,disable}() naming for the common case of ensuring the user page tables are mapped in TTBR0. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: sdei: explicitly simulate PAN/UAO entryMark Rutland2-6/+39
In preparation for removing addr_limit and set_fs() we must decouple the SDEI PAN/UAO manipulation from the uaccess code, and explicitly reinitialize these as required. SDEI enters the kernel with a non-architectural exception, and prior to the most recent revision of the specification (ARM DEN 0054B), PSTATE bits (e.g. PAN, UAO) are not manipulated in the same way as for architectural exceptions. Notably, older versions of the spec can be read ambiguously as to whether PSTATE bits are inherited unchanged from the interrupted context or whether they are generated from scratch, with TF-A doing the latter. We have three cases to consider: 1) The existing TF-A implementation of SDEI will clear PAN and clear UAO (along with other bits in PSTATE) when delivering an SDEI exception. 2) In theory, implementations of SDEI prior to revision B could inherit PAN and UAO (along with other bits in PSTATE) unchanged from the interrupted context. However, in practice such implementations do not exist. 3) Going forward, new implementations of SDEI must clear UAO, and depending on SCTLR_ELx.SPAN must either inherit or set PAN. As we can ignore (2) we can assume that upon SDEI entry, UAO is always clear, though PAN may be clear, inherited, or set per SCTLR_ELx.SPAN. Therefore, we must explicitly initialize PAN, but do not need to do anything for UAO. Considering what we need to do: * When set_fs() is removed, force_uaccess_begin() will have no HW side-effects. As this only clears UAO, which we can assume has already been cleared upon entry, this is not a problem. We do not need to add code to manipulate UAO explicitly. * PAN may be cleared upon entry (in case 1 above), so where a kernel is built to use PAN and this is supported by all CPUs, the kernel must set PAN upon entry to ensure expected behaviour. * PAN may be inherited from the interrupted context (in case 3 above), and so where a kernel is not built to use PAN or where PAN support is not uniform across CPUs, the kernel must clear PAN to ensure expected behaviour. This patch reworks the SDEI code accordingly, explicitly setting PAN to the expected state in all cases. To cater for the cases where the kernel does not use PAN or this is not uniformly supported by hardware we add a new cpu_has_pan() helper which can be used regardless of whether the kernel is built to use PAN. The existing system_uses_ttbr0_pan() is redefined in terms of system_uses_hw_pan() both for clarity and as a minor optimization when HW PAN is not selected. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: James Morse <[email protected]> Cc: James Morse <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: sdei: move uaccess logic to arch/arm64/Mark Rutland2-20/+12
The SDEI support code is split across arch/arm64/ and drivers/firmware/, largley this is split so that the arch-specific portions are under arch/arm64, and the management logic is under drivers/firmware/. However, exception entry fixups are currently under drivers/firmware. Let's move the exception entry fixups under arch/arm64/. This de-clutters the management logic, and puts all the arch-specific portions in one place. Doing this also allows the fixups to be applied earlier, so things like PAN and UAO will be in a known good state before we run other logic. This will also make subsequent refactoring easier. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: James Morse <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: head.S: always initialize PSTATEMark Rutland2-11/+26
As with SCTLR_ELx and other control registers, some PSTATE bits are UNKNOWN out-of-reset, and we may not be able to rely on hardware or firmware to initialize them to our liking prior to entry to the kernel, e.g. in the primary/secondary boot paths and return from idle/suspend. It would be more robust (and easier to reason about) if we consistently initialized PSTATE to a default value, as we do with control registers. This will ensure that the kernel is not adversely affected by bits it is not aware of, e.g. when support for a feature such as PAN/UAO is disabled. This patch ensures that PSTATE is consistently initialized at boot time via an ERET. This is not intended to relax the existing requirements (e.g. DAIF bits must still be set prior to entering the kernel). For features detected dynamically (which may require system-wide support), it is still necessary to subsequently modify PSTATE. As ERET is not always a Context Synchronization Event, an ISB is placed before each exception return to ensure updates to control registers have taken effect. This handles the kernel being entered with SCTLR_ELx.EOS clear (or any future control bits being in an UNKNOWN state). Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: head.S: cleanup SCTLR_ELx initializationMark Rutland3-10/+16
Let's make SCTLR_ELx initialization a bit clearer by using meaningful names for the initialization values, following the same scheme for SCTLR_EL1 and SCTLR_EL2. These definitions will be used more widely in subsequent patches. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
2020-12-02arm64: head.S: rename el2_setup -> init_kernel_elMark Rutland2-8/+9
For a while now el2_setup has performed some basic initialization of EL1 even when the kernel is booted at EL1, so the name is a little misleading. Further, some comments are stale as with VHE it doesn't drop the CPU to EL1. To clarify things, rename el2_setup to init_kernel_el, and update comments to be clearer as to the function's purpose. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>