path: root/arch/arm64/include
2022-12-01  arm64/sysreg: Convert ID_MMFR4_EL1 to automatic generation  (James Morse, 1 file, -10/+0)
Convert ID_MMFR4_EL1 to be automatically generated as per DDI0487I.a, no functional changes. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Convert ID_MMFR3_EL1 to automatic generation  (James Morse, 1 file, -1/+0)
Convert ID_MMFR3_EL1 to be automatically generated as per DDI0487I.a, no functional changes. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Convert ID_MMFR2_EL1 to automatic generation  (James Morse, 1 file, -1/+0)
Convert ID_MMFR2_EL1 to be automatically generated as per DDI0487I.a, no functional changes. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Convert ID_MMFR1_EL1 to automatic generation  (James Morse, 1 file, -1/+0)
Convert ID_MMFR1_EL1 to be automatically generated as per DDI0487I.a, no functional changes. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Convert ID_MMFR0_EL1 to automatic generation  (James Morse, 1 file, -10/+0)
Convert ID_MMFR0_EL1 to be automatically generated as per DDI0487I.a, no functional changes. Reviewed-by: Mark Brown <[email protected]> Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for MVFR2_EL1  (James Morse, 1 file, -2/+2)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the MVFR2_EL1 register use lower-case for feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for MVFR1_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the MVFR1_EL1 register use lower-case for feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for MVFR0_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the MVFR0_EL1 register use lower-case for feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_DFR1_EL1  (James Morse, 1 file, -1/+1)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_DFR1_EL1 register have an _EL1 suffix. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_DFR0_EL1  (James Morse, 1 file, -13/+11)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_DFR0_EL1 register have an _EL1 suffix, and use lower-case for feature names where the arm-arm does the same. The arm-arm has feature names for some of the ID_DFR0_EL1.PerfMon encodings. Use these feature names in preference to the '8_4' indication of the architecture version they were introduced in. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_PFR2_EL1  (James Morse, 1 file, -2/+2)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_PFR2_EL1 register have an _EL1 suffix. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_PFR1_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_PFR1_EL1 register have an _EL1 suffix, and use lower case in feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_PFR0_EL1  (James Morse, 1 file, -6/+6)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_PFR0_EL1 register have an _EL1 suffix, and use lower case in feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_ISAR6_EL1  (James Morse, 1 file, -7/+7)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_ISAR6_EL1 register have an _EL1 suffix. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_ISAR5_EL1  (James Morse, 1 file, -6/+6)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_ISAR5_EL1 register have an _EL1 suffix. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_ISAR4_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_ISAR4_EL1 register have an _EL1 suffix, and use lower-case for feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_ISAR0_EL1  (James Morse, 1 file, -7/+7)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_ISAR0_EL1 register have an _EL1 suffix, and use lower-case for feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_MMFR5_EL1  (James Morse, 1 file, -1/+1)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_MMFR5_EL1 register have an _EL1 suffix. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_MMFR4_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. Ensure symbols for the ID_MMFR4_EL1 register have an _EL1 suffix, and use lower case in feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64/sysreg: Standardise naming for ID_MMFR0_EL1  (James Morse, 1 file, -8/+8)
To convert the 32bit id registers to use the sysreg generation, they must first have a regular pattern, to match the symbols the script generates. The scripts would like to follow exactly what is in the arm-arm, which uses lower case for some of these feature names. Ensure symbols for the ID_MMFR0_EL1 register have an _EL1 suffix, and use lower case in feature names where the arm-arm does the same. No functional change. Signed-off-by: James Morse <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-12-01  arm64: efi: Revert "Recover from synchronous exceptions ..."  (Ard Biesheuvel, 1 file, -8/+0)
This reverts commit 23715a26c8d81291, which introduced some code in assembler that manipulates both the ordinary and the shadow call stack pointer in a way that could potentially be taken advantage of. So let's revert it, and do a better job the next time around. Signed-off-by: Ard Biesheuvel <[email protected]>
2022-11-29  arm64/fp: Use a struct to pass data to fpsimd_bind_state_to_cpu()  (Mark Brown, 1 file, -5/+12)
For reasons that are unclear to this reader, fpsimd_bind_state_to_cpu() populates the struct fpsimd_last_state_struct that it uses to store the active floating point state for KVM guests by passing an argument for each member of the structure. As the richness of the architecture increases this is resulting in a function with a rather large number of arguments which isn't ideal. Simplify the interface by using the struct directly as the single argument for the function, renaming it as we lift the definition into the header. This could be built on further to reduce the work we do adding storage for new FP state in various places but for now it just simplifies this one interface. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
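A minimal sketch of the resulting interface (the struct is renamed and lifted into the header; the exact field list is abridged here and should be treated as illustrative):

    struct cpu_fp_state {
    	struct user_fpsimd_state *st;
    	void *sve_state;
    	u64 *svcr;
    	unsigned int sve_vl;
    	unsigned int sme_vl;
    	/* ... further FP/SME bookkeeping fields ... */
    };

    /* One struct argument instead of one argument per member: */
    void fpsimd_bind_state_to_cpu(struct cpu_fp_state *fp_state);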
2022-11-29  arm64/fpsimd: Have KVM explicitly say which FP registers to save  (Mark Brown, 2 files, -1/+3)
In order to avoid needlessly saving and restoring the guest registers KVM relies on the host FPSIMD code to save the guest registers when we context switch away from the guest. This is done by binding the KVM guest state to the CPU on top of the task state that was originally there, then carefully managing the TIF_SVE flag for the task to cause the host to save the full SVE state when needed regardless of the needs of the host task. This works well enough but isn't terribly direct about what is going on and makes it much more complicated to try to optimise what we're doing with the SVE register state. Let's instead have KVM pass in the register state it wants saving when it binds to the CPU. We introduce a new FP_STATE_CURRENT for use during normal task binding to indicate that we should base our decisions on the current task. This should not be used when actually saving. Ideally we might want to use a separate enum for the type to save but this enum and the enum values would then need to be named which has problems with clarity and ambiguity. In order to ease any future debugging that might be required this patch does not actually update any of the decision making about what to save, it merely starts tracking the new information and warns if the requested state is not what we would otherwise have decided to save. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-29  arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE  (Mark Brown, 3 files, -2/+18)
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
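In code, the tracking added across this patch and the KVM one above amounts to a small enum stored next to the saved registers — a sketch, with names as merged upstream:

    enum fp_type {
    	FP_STATE_CURRENT,	/* Decide based on current task state; never stored. */
    	FP_STATE_FPSIMD,	/* Saved state is in FPSIMD V-register format. */
    	FP_STATE_SVE,		/* Saved state is in SVE Z/P-register format. */
    };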
2022-11-29  KVM: arm64: Discard any SVE state when entering KVM guests  (Mark Brown, 1 file, -0/+1)
Since 8383741ab2e773a99 (KVM: arm64: Get rid of host SVE tracking/saving) KVM has not tracked the host SVE state, relying on the fact that we currently disable SVE whenever we perform a syscall. This may not be true in future since performance optimisation may result in us keeping SVE enabled in order to avoid needing to take access traps to reenable it. Handle this by clearing TIF_SVE and converting the stored task state to FPSIMD format when preparing to run the guest. This is done with a new call fpsimd_kvm_prepare() to keep the direct state manipulation functions internal to fpsimd.c. Signed-off-by: Mark Brown <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Reviewed-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
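A simplified sketch of the new hook; the helper names follow fpsimd.c, but treat the details as illustrative:

    void fpsimd_kvm_prepare(void)
    {
    	if (!system_supports_sve())
    		return;

    	get_cpu_fpsimd_context();

    	/*
    	 * Convert any stored SVE state back to FPSIMD format and clear
    	 * TIF_SVE, so nothing SVE-shaped is left for the host to track.
    	 */
    	if (test_and_clear_thread_flag(TIF_SVE))
    		sve_to_fpsimd(current);

    	put_cpu_fpsimd_context();
    }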
2022-11-29  arm64: mte: Lock a page for MTE tag initialisation  (Catalin Marinas, 2 files, -3/+36)
Initialising the tags and setting the PG_mte_tagged flag for a page can race between multiple set_pte_at() on shared pages or setting the stage 2 pte via user_mem_abort(). Introduce a new PG_mte_lock flag as PG_arch_3 and set it before attempting page initialisation. Given that PG_mte_tagged is never cleared for a page, consider setting this flag to mean page unlocked and wait on this bit with acquire semantics if the page is locked:

- try_page_mte_tagging() - lock the page for tagging, return true if it can be tagged, false if already tagged. No acquire semantics if it returns true (PG_mte_tagged not set) as there is no serialisation with a previous set_page_mte_tagged().
- set_page_mte_tagged() - set PG_mte_tagged with release semantics.

The two-bit locking is based on Peter Collingbourne's idea.

Signed-off-by: Catalin Marinas <[email protected]>
Signed-off-by: Peter Collingbourne <[email protected]>
Reviewed-by: Steven Price <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Peter Collingbourne <[email protected]>
Reviewed-by: Cornelia Huck <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
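Together with the set_page_mte_tagged() helper from the commit below, the scheme looks roughly like this simplified sketch:

    #define PG_mte_tagged	PG_arch_2	/* tags are valid */
    #define PG_mte_lock	PG_arch_3	/* tag initialisation claimed */

    static inline bool try_page_mte_tagging(struct page *page)
    {
    	if (!test_and_set_bit(PG_mte_lock, &page->flags))
    		return true;	/* caller owns tag initialisation */

    	/* Lost the race: wait for PG_mte_tagged with acquire semantics. */
    	smp_cond_load_acquire(&page->flags, VAL & (1UL << PG_mte_tagged));
    	return false;
    }

    static inline void set_page_mte_tagged(struct page *page)
    {
    	/* Release: make the written tags visible before the flag. */
    	smp_wmb();
    	set_bit(PG_mte_tagged, &page->flags);
    }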
2022-11-29  arm64: mte: Fix/clarify the PG_mte_tagged semantics  (Catalin Marinas, 2 files, -1/+31)
Currently the PG_mte_tagged page flag mostly means the page contains valid tags and it should be set after the tags have been cleared or restored. However, in mte_sync_tags() it is set before setting the tags to avoid, in theory, a race with concurrent mprotect(PROT_MTE) for shared pages. However, a concurrent mprotect(PROT_MTE) with a copy on write in another thread can cause the new page to have stale tags. Similarly, tag reading via ptrace() can read stale tags if the PG_mte_tagged flag is set before actually clearing/restoring the tags. Fix the PG_mte_tagged semantics so that it is only set after the tags have been cleared or restored. This is safe for swap restoring into a MAP_SHARED or CoW page since the core code takes the page lock. Add two functions to test and set the PG_mte_tagged flag with acquire and release semantics. The downside is that concurrent mprotect(PROT_MTE) on a MAP_SHARED page may cause tag loss. This is already the case for KVM guests if a VMM changes the page protection while the guest triggers a user_mem_abort(). Signed-off-by: Catalin Marinas <[email protected]> [[email protected]: fix build with CONFIG_ARM64_MTE disabled] Signed-off-by: Peter Collingbourne <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Reviewed-by: Steven Price <[email protected]> Cc: Will Deacon <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Peter Collingbourne <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-25  arm64/asm: Remove unused assembler DAIF save/restore macros  (Mark Brown, 1 file, -9/+0)
There are no longer any users of the assembler macros for saving and restoring DAIF so remove them to prevent further users being added, there are C equivalents available. Signed-off-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-25  arm64/kpti: Move DAIF masking to C code  (Mark Brown, 1 file, -0/+10)
We really don't want to take an exception while replacing TTBR1 so we mask DAIF during the actual update. Currently this is done in the assembly function idmap_cpu_replace_ttbr1() but it could equally be done in the only caller of that function, cpu_replace_ttbr1(). This simplifies the assembly code slightly and means that when working with the code around masking DAIF flags there is one less piece of assembly code which needs to be considered. While we're at it add a comment which makes explicit why we are masking DAIF in this code. There should be no functional effect. Signed-off-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
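The C side then looks roughly like the sketch below (simplified: CNP handling and the idmap install/uninstall around the call are omitted):

    static inline void cpu_replace_ttbr1(pgd_t *pgdp)
    {
    	phys_addr_t ttbr1 = phys_to_ttbr(virt_to_phys(pgdp));
    	unsigned long daif;

    	/*
    	 * We really don't want to take an exception while TTBR1 is
    	 * being replaced, so mask DAIF around the asm helper.
    	 */
    	daif = local_daif_save();
    	idmap_cpu_replace_ttbr1(ttbr1);
    	local_daif_restore(daif);
    }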
2022-11-22  KVM: arm64: Reject shared table walks in the hyp code  (Oliver Upton, 1 file, -2/+15)
Exclusive table walks are the only supported table walk in the hyp, as there is no construct like RCU available in the hypervisor code. Reject any attempt to do a shared table walk by returning an error and allowing the caller to clean up the mess. Suggested-by: Will Deacon <[email protected]> Signed-off-by: Oliver Upton <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
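A sketch of the hyp-side walker prologue after this change (simplified):

    static inline int kvm_pgtable_walk_begin(struct kvm_pgtable_walker *walker)
    {
    	/*
    	 * There is no RCU (or an equivalent protection scheme) at EL2,
    	 * so only exclusive table walks can be supported here.
    	 */
    	if (kvm_pgtable_walk_shared(walker))
    		return -EPERM;

    	return 0;
    }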
2022-11-22  KVM: arm64: Don't acquire RCU read lock for exclusive table walks  (Oliver Upton, 1 file, -6/+8)
Marek reported a BUG resulting from the recent parallel faults changes, as the hyp stage-1 map walker attempted to allocate table memory while holding the RCU read lock:

  BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
  in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
  preempt_count: 0, expected: 0
  RCU nest depth: 1, expected: 0
  2 locks held by swapper/0/1:
   #0: ffff80000a8a44d0 (kvm_hyp_pgd_mutex){+.+.}-{3:3}, at: __create_hyp_mappings+0x80/0xc4
   #1: ffff80000a927720 (rcu_read_lock){....}-{1:2}, at: kvm_pgtable_walk+0x0/0x1f4
  CPU: 2 PID: 1 Comm: swapper/0 Not tainted 6.1.0-rc3+ #5918
  Hardware name: Raspberry Pi 3 Model B (DT)
  Call trace:
   dump_backtrace.part.0+0xe4/0xf0
   show_stack+0x18/0x40
   dump_stack_lvl+0x8c/0xb8
   dump_stack+0x18/0x34
   __might_resched+0x178/0x220
   __might_sleep+0x48/0xa0
   prepare_alloc_pages+0x178/0x1a0
   __alloc_pages+0x9c/0x109c
   alloc_page_interleave+0x1c/0xc4
   alloc_pages+0xec/0x160
   get_zeroed_page+0x1c/0x44
   kvm_hyp_zalloc_page+0x14/0x20
   hyp_map_walker+0xd4/0x134
   kvm_pgtable_visitor_cb.isra.0+0x38/0x5c
   __kvm_pgtable_walk+0x1a4/0x220
   kvm_pgtable_walk+0x104/0x1f4
   kvm_pgtable_hyp_map+0x80/0xc4
   __create_hyp_mappings+0x9c/0xc4
   kvm_mmu_init+0x144/0x1cc
   kvm_arch_init+0xe4/0xef4
   kvm_init+0x3c/0x3d0
   arm_init+0x20/0x30
   do_one_initcall+0x74/0x400
   kernel_init_freeable+0x2e0/0x350
   kernel_init+0x24/0x130
   ret_from_fork+0x10/0x20

Since the hyp stage-1 table walkers are serialized by kvm_hyp_pgd_mutex, RCU protection really doesn't add anything. Don't acquire the RCU read lock for an exclusive walk.

Reported-by: Marek Szyprowski <[email protected]>
Signed-off-by: Oliver Upton <[email protected]>
Acked-by: Will Deacon <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
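For the kernel proper, the RCU read lock becomes conditional on the walk being shared — roughly (a sketch; the hyp build instead gets the -EPERM variant shown above):

    static inline void kvm_pgtable_walk_begin(struct kvm_pgtable_walker *walker)
    {
    	if (kvm_pgtable_walk_shared(walker))
    		rcu_read_lock();
    }

    static inline void kvm_pgtable_walk_end(struct kvm_pgtable_walker *walker)
    {
    	if (kvm_pgtable_walk_shared(walker))
    		rcu_read_unlock();
    }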
2022-11-22  KVM: arm64: Take a pointer to walker data in kvm_dereference_pteref()  (Oliver Upton, 1 file, -71/+73)
Rather than passing through the state of the KVM_PGTABLE_WALK_SHARED flag, just take a pointer to the whole walker structure instead. Move around struct kvm_pgtable and the RCU indirection such that the associated ifdeffery remains in one place while ensuring the walker + flags definitions precede their use. No functional change intended. Signed-off-by: Oliver Upton <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-19  KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation  (Marc Zyngier, 1 file, -0/+4)
As further patches will enable the selection of a PMU revision from userspace, sample the supported PMU revision at VM creation time, rather than building each time the ID_AA64DFR0_EL1 register is accessed. This shouldn't result in any change in behaviour. Reviewed-by: Reiji Watanabe <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-18  Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux  (Linus Torvalds, 1 file, -2/+2)
Pull arm64 fixes from Catalin Marinas:

 - Fix a build error with CONFIG_CFI_CLANG + CONFIG_FTRACE when CONFIG_FUNCTION_GRAPH_TRACER is not enabled.

 - Fix a BUG_ON triggered by the page table checker due to incorrect file_map_count for non-leaf pmd/pud (the arm64 pmd_user_accessible_page() not checking whether it's a leaf entry).

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud
  arm64: ftrace: Define ftrace_stub_graph only with FUNCTION_GRAPH_TRACER
2022-11-18  arm64/mm: fix incorrect file_map_count for non-leaf pmd/pud  (Liu Shixin, 1 file, -2/+2)
The page table check triggers BUG_ON() unexpectedly when collapsing a hugepage:

  ------------[ cut here ]------------
  kernel BUG at mm/page_table_check.c:82!
  Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
  Dumping ftrace buffer:
   (ftrace buffer empty)
  Modules linked in:
  CPU: 6 PID: 68 Comm: khugepaged Not tainted 6.1.0-rc3+ #750
  Hardware name: linux,dummy-virt (DT)
  pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
  pc : page_table_check_clear.isra.0+0x258/0x3f0
  lr : page_table_check_clear.isra.0+0x240/0x3f0
  [...]
  Call trace:
   page_table_check_clear.isra.0+0x258/0x3f0
   __page_table_check_pmd_clear+0xbc/0x108
   pmdp_collapse_flush+0xb0/0x160
   collapse_huge_page+0xa08/0x1080
   hpage_collapse_scan_pmd+0xf30/0x1590
   khugepaged_scan_mm_slot.constprop.0+0x52c/0xac8
   khugepaged+0x338/0x518
   kthread+0x278/0x2f8
   ret_from_fork+0x10/0x20
  [...]

Since pmd_user_accessible_page() doesn't check whether a pmd is a leaf entry, it decreases file_map_count for a non-leaf pmd coming from collapse_huge_page(), and so triggers the BUG_ON() unexpectedly. Fix this problem by using pmd_leaf() instead of pmd_present() in pmd_user_accessible_page(). Moreover, use pud_leaf() for pud_user_accessible_page() too.

Fixes: 42b2547137f5 ("arm64/mm: enable ARCH_SUPPORTS_PAGE_TABLE_CHECK")
Reported-by: Denys Vlasenko <[email protected]>
Signed-off-by: Liu Shixin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Pasha Tatashin <[email protected]>
Reviewed-by: Kefeng Wang <[email protected]>
Acked-by: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Catalin Marinas <[email protected]>
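The fix itself is small; a simplified sketch:

    static inline bool pmd_user_accessible_page(pmd_t pmd)
    {
    	/* Only leaf entries map user-accessible pages. */
    	return pmd_leaf(pmd) && (pmd_user(pmd) || pmd_user_exec(pmd));
    }

    static inline bool pud_user_accessible_page(pud_t pud)
    {
    	return pud_leaf(pud) && pud_user(pud);
    }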
2022-11-18  arm64: errata: Workaround possible Cortex-A715 [ESR|FAR]_ELx corruption  (Anshuman Khandual, 2 files, -0/+18)
If a Cortex-A715 cpu sees a page mapping permissions change from executable to non-executable, it may corrupt the ESR_ELx and FAR_ELx registers on the next instruction abort caused by a permission fault. Only user-space does executable to non-executable permission transitions, via the mprotect() system call, which calls the ptep_modify_prot_start() and ptep_modify_prot_commit() helpers while changing the page mapping. The platform code can override these helpers via __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION. Work around the problem by doing a break-before-make TLB invalidation for all executable user space mappings that go through the mprotect() system call. This overrides ptep_modify_prot_start() and ptep_modify_prot_commit(), via defining HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION on the platform, thus giving an opportunity to intercept user space exec mappings and do the necessary TLB invalidation. Similar interceptions are also implemented for HugeTLB. Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mark Rutland <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Reviewed-by: Catalin Marinas <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
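A sketch of the interception (hedged: the config/capability names follow erratum 2645198 as merged, but treat the exact condition as illustrative):

    pte_t ptep_modify_prot_start(struct vm_area_struct *vma,
    			     unsigned long addr, pte_t *ptep)
    {
    	/*
    	 * Break-before-make for executable user mappings: clear the
    	 * PTE and invalidate the TLB before new permissions land.
    	 */
    	if (IS_ENABLED(CONFIG_ARM64_ERRATUM_2645198) &&
    	    cpus_have_const_cap(ARM64_WORKAROUND_2645198) &&
    	    pte_user_exec(READ_ONCE(*ptep)))
    		return ptep_clear_flush(vma, addr, ptep);

    	return ptep_get_and_clear(vma->vm_mm, addr, ptep);
    }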
2022-11-18  arm64: Add Cortex-A715 CPU part definition  (Anshuman Khandual, 1 file, -0/+2)
Add the CPU part numbers for the new Arm designs. Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: James Morse <[email protected]> Cc: [email protected] Cc: [email protected] Acked-by: Catalin Marinas <[email protected]> Signed-off-by: Anshuman Khandual <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
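The addition amounts to a part-number define plus a MIDR helper — a sketch (0xD4D per the Cortex-A715 documentation; verify against the TRM):

    #define ARM_CPU_PART_CORTEX_A715	0xD4D

    #define MIDR_CORTEX_A715 \
    	MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A715)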
2022-11-18  arm64/mm: Drop unused restore_ttbr1  (Anshuman Khandual, 1 file, -11/+0)
restore_ttbr1 procedure is not used anywhere, hence just drop it. Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Acked-by: Mark Rutland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-18  arm64: move on_thread_stack() to <asm/stacktrace.h>  (Mark Rutland, 2 files, -2/+2)
Currently on_thread_stack() is defined in <asm/processor.h>, depending upon definitions from <asm/stacktrace.h> despite this header not being included. This ends up being fragile, and any user of on_thread_stack() must include both <asm/processor.h> and <asm/stacktrace.h>. We organised things this way due to header dependencies back in commit: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin") ... but now that we no longer use current_top_of_stack(), and given that stackleak includes <asm/stacktrace.h> via <linux/stackleak.h>, we no longer need the definition to live in <asm/processor.h>. Move on_thread_stack() to <asm/stacktrace.h>, where all its dependencies are guaranteed to be defined. This requires having arm64's irq.c explicitly include <asm/stacktrace.h>, and I've taken the opportunity to sort the includes, which were slightly out of order. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-18  arm64: remove current_top_of_stack()  (Mark Rutland, 1 file, -11/+0)
We no longer use current_top_of_stack() on arm64, so it can be removed. We introduced current_top_of_stack() for STACKLEAK in commit: 0b3e336601b82c6a ("arm64: Add support for STACKLEAK gcc plugin") ... then we figured out the intended semantics were unclear, and reworked it in commit: e85094c31ddb794a ("arm64: stackleak: fix current_top_of_stack()") ... then we removed the only user in commit: 0cfa2ccd285d98ad ("stackleak: rework stack high bound handling") Given that it's no longer used, and it's very easy to misuse, this patch removes current_top_of_stack(). For the moment, on_thread_stack() is left where it is as moving it will change some header dependencies. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Kees Cook <[email protected]> Cc: Will Deacon <[email protected]> Reviewed-by: Kees Cook <[email protected]> Reviewed-by: Mark Brown <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-18  arm64/mm: Drop idmap_pg_end[] declaration  (Anshuman Khandual, 1 file, -1/+0)
idmap_pg_end[] is not used anywhere, hence just drop its declaration. Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Anshuman Khandual <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-18  ftrace: arm64: move from REGS to ARGS  (Mark Rutland, 1 file, -6/+66)
This commit replaces arm64's support for FTRACE_WITH_REGS with support for FTRACE_WITH_ARGS. This removes some overhead and complexity, and removes some latent issues with inconsistent presentation of struct pt_regs (which can only be reliably saved/restored at exception boundaries).

FTRACE_WITH_REGS has been supported on arm64 since commit: 3b23e4991fb66f6d ("arm64: implement ftrace with regs"). As noted in the commit message, the major reasons for implementing FTRACE_WITH_REGS were:

(1) To make it possible to use the ftrace graph tracer with pointer authentication, where it's necessary to snapshot/manipulate the LR before it is signed by the instrumented function.

(2) To make it possible to implement LIVEPATCH in future, where we need to hook function entry before an instrumented function manipulates the stack or argument registers.

Practically speaking, we need to preserve the argument/return registers, PC, LR, and SP. Neither of these cases needs a struct pt_regs, and only requires the set of registers which are live at function call/return boundaries. Our calling convention is defined by "Procedure Call Standard for the Arm® 64-bit Architecture (AArch64)" (AKA "AAPCS64"), which can currently be found at: https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst

Per AAPCS64, all function call argument and return values are held in the following GPRs:

* X0 - X7 : parameter / result registers
* X8 : indirect result location register
* SP : stack pointer (AKA SP)

Additionally, at function call boundaries, the following GPRs hold context/return information:

* X29 : frame pointer (AKA FP)
* X30 : link register (AKA LR)

... and for ftrace we need to capture the instrumented address:

* PC : program counter

No other GPRs are relevant, as none of the other registers hold parameters or return values:

* X9 - X17 : temporaries, may be clobbered
* X18 : shadow call stack pointer (or temporary)
* X19 - X28 : callee saved

This patch implements FTRACE_WITH_ARGS for arm64, only saving/restoring the minimal set of registers necessary. This is always sufficient to manipulate control flow (e.g. for live-patching) or to manipulate function arguments and return values. This reduces the necessary stack usage from 336 bytes for pt_regs down to 112 bytes for ftrace_regs + 32 bytes for two frame records, freeing up 188 bytes. This could be reduced further with changes to the unwinder.

As there is no longer a need to save different sets of registers for different features, we no longer need distinct `ftrace_caller` and `ftrace_regs_caller` trampolines. This allows the trampoline assembly to be simpler, and simplifies code which previously had to handle the two trampolines.

I've tested this with the ftrace selftests, where there are no unexpected failures.

Co-developed-by: Florent Revest <[email protected]>
Signed-off-by: Mark Rutland <[email protected]>
Signed-off-by: Florent Revest <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Will Deacon <[email protected]>
Reviewed-by: Masami Hiramatsu (Google) <[email protected]>
Reviewed-by: Steven Rostedt (Google) <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
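The minimal register set maps onto a compact structure; a sketch consistent with the 112-byte figure above (14 x 8 bytes):

    struct ftrace_regs {
    	/* x0 - x8 */
    	unsigned long regs[9];
    	unsigned long __unused;

    	unsigned long fp;	/* x29 */
    	unsigned long lr;	/* x30 */

    	unsigned long sp;
    	unsigned long pc;
    };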
2022-11-18  Merge tag 'efi-zboot-direct-for-v6.2' into efi/next  (Ard Biesheuvel, 1 file, -3/+12)
2022-11-18  random: remove early archrandom abstraction  (Jason A. Donenfeld, 1 file, -38/+10)
The arch_get_random*_early() abstraction is not completely useful and adds complexity, because it's not a given that there will be no calls to arch_get_random*() between random_init_early(), which uses arch_get_random*_early(), and init_cpu_features(). During that gap, crng_reseed() might be called, which uses arch_get_random*(), since it's mostly not init code. Instead we can test whether we're in the early phase in arch_get_random*() itself, and in doing so avoid all ambiguity about where we are. Fortunately, the only architecture that currently implements arch_get_random*_early() also has an alternatives-based cpu feature system, one flag of which determines whether the other flags have been initialized. This makes it possible to do the early check with zero cost once the system is initialized. Reviewed-by: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Jean-Philippe Brucker <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
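The idea, roughly (a hedged sketch; the early-probe helper name here is an assumption for illustration):

    static inline bool __cpu_has_rng(void)
    {
    	/*
    	 * Before cpufeature init completes, probe the ID register
    	 * directly rather than trusting the (unpatched) alternative.
    	 */
    	if (unlikely(!system_capabilities_finalized() && !preemptible()))
    		return __early_cpu_has_rndr();	/* assumed helper */

    	/* Afterwards this is a zero-cost, alternatives-patched check. */
    	return cpus_have_const_cap(ARM64_HAS_RNG);
    }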
2022-11-18  stackprotector: actually use get_random_canary()  (Jason A. Donenfeld, 1 file, -8/+1)
The RNG always mixes in the Linux version extremely early in boot. It also always includes a cycle counter, not only during early boot, but each and every time it is invoked prior to being fully initialized. Together, this means that the use of additional xors inside of the various stackprotector.h files is superfluous and over-complicated. Instead, we can get exactly the same thing, but better, by just calling `get_random_canary()`. Acked-by: Guo Ren <[email protected]> # for csky Acked-by: Catalin Marinas <[email protected]> # for arm64 Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Jason A. Donenfeld <[email protected]>
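On arm64 the result is essentially a one-liner plus the per-task/global distinction — a sketch of the simplified helper:

    static __always_inline void boot_init_stack_canary(void)
    {
    	unsigned long canary = get_random_canary();

    	current->stack_canary = canary;
    	if (!IS_ENABLED(CONFIG_STACKPROTECTOR_PER_TASK))
    		__stack_chk_guard = current->stack_canary;
    }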
2022-11-17  arm64: Add ID_DFR0_EL1.PerfMon values for PMUv3p7 and IMP_DEF  (Marc Zyngier, 1 file, -0/+2)
Align the ID_DFR0_EL1.PerfMon values with ID_AA64DFR0_EL1.PMUver. Reviewed-by: Oliver Upton <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-11-15  arm64: armv8_deprecated: rework deprecated instruction handling  (Mark Rutland, 1 file, -10/+9)
Support for deprecated instructions can be enabled or disabled at runtime. To handle this, the code in armv8_deprecated.c registers and unregisters undef_hooks, and makes cross CPU calls to configure HW support. This is rather complicated, and the synchronization required to make this safe ends up serializing the handling of instructions which have been trapped. This patch simplifies the deprecated instruction handling by removing the dynamic registration and unregistration, and changing the trap handling code to determine whether a handler should be invoked. This removes the need for dynamic list management, and simplifies the locking requirements, making it possible to handle trapped instructions entirely in parallel. Where changing the emulation state requires a cross-call, this is serialized by locally disabling interrupts, ensuring that the CPU is not left in an inconsistent state. To simplify sysctl management, each insn_emulation is given a separate sysctl table, permitting these to be registered separately. The core sysctl code will iterate over all of these when walking sysfs. I've tested this with userspace programs which use each of the deprecated instructions, and I've concurrently modified the support level for each of the features back-and-forth between HW and emulated to check that there are no spurious SIGILLs sent to userspace when the support level is changed. Signed-off-by: Mark Rutland <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morse <[email protected]> Cc: Joey Gouly <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
2022-11-15  arm64: rework EL0 MRS emulation  (Mark Rutland, 1 file, -1/+2)
On CPUs without FEAT_IDST, ID register emulation is slower than it needs to be, as all threads contend for the same lock to perform the emulation. This patch reworks the emulation to avoid this unnecessary contention.

On CPUs with FEAT_IDST (which is mandatory from ARMv8.4 onwards), EL0 accesses to ID registers result in a SYS trap, and emulation of these is handled with a sys64_hook. These hooks are statically allocated, and no locking is required to iterate through the hooks and perform the emulation, allowing emulation to occur in parallel with no contention.

On CPUs without FEAT_IDST, EL0 accesses to ID registers result in an UNDEFINED exception, and emulation of these accesses is handled with an undef_hook. When an EL0 MRS instruction is trapped to EL1, the kernel finds the relevant handler by iterating through all of the undef_hooks, requiring undef_lock to be held during this lookup. This locking is only required to safely traverse the list of undef_hooks (as it can be concurrently modified), and the actual emulation of the MRS does not require any mutual exclusion. This locking is an unfortunate bottleneck, especially given that MRS emulation is enabled unconditionally and is never disabled.

This patch reworks the non-FEAT_IDST MRS emulation logic so that it can be invoked directly from do_el0_undef(). This removes the bottleneck, allowing MRS traps to be handled entirely in parallel, and is a stepping stone to making all of the undef_hooks lock-free.

I've tested this in a 64-vCPU VM on a 64-CPU ThunderX2 host, with a benchmark which spawns a number of threads which each try to read ID_AA64ISAR0_EL1 1000000 times. This is vastly more contention than will ever be seen in realistic usage, but clearly demonstrates the removal of the bottleneck:

  | Threads || Time (seconds)                    |
  |         ||      Before      ||     After     |
  |         ||  Real |  System  || Real | System |
  |---------++-------+----------++------+--------|
  |       1 ||  0.29 |     0.20 || 0.24 |   0.12 |
  |       2 ||  0.35 |     0.51 || 0.23 |   0.27 |
  |       4 ||  1.08 |     3.87 || 0.24 |   0.56 |
  |       8 ||  4.31 |    33.60 || 0.24 |   1.11 |
  |      16 ||  9.47 |   149.39 || 0.23 |   2.15 |
  |      32 || 19.07 |   605.27 || 0.24 |   4.38 |
  |      64 || 65.40 |  3609.09 || 0.33 |  11.27 |

Aside from the speedup, there should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: James Morse <[email protected]>
Cc: Joey Gouly <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>
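A sketch of the reworked EL0 path (hedged: AArch32 handling is elided, and the instruction-fetch helper is an assumed name):

    void do_el0_undef(struct pt_regs *regs, unsigned long esr)
    {
    	u32 insn;

    	if (user_insn_read(regs, &insn))	/* assumed fetch helper */
    		goto out_err;

    	/* Lock-free: emulating an EL0 MRS only mutates 'regs'. */
    	if (try_emulate_mrs(regs, insn))
    		return;

    	if (call_undef_hook(regs, insn) == 0)
    		return;

    out_err:
    	force_signal_inject(SIGILL, ILL_ILLOPC, regs->pc, 0);
    }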
2022-11-15  arm64: factor out EL1 SSBS emulation hook  (Mark Rutland, 1 file, -0/+2)
Currently call_undef_hook() is used to handle UNDEFINED exceptions from EL0 and EL1. As support for deprecated instructions may be enabled independently, the handlers for individual instructions are organised as a linked list of struct undef_hook which can be manipulated dynamically. As this can be manipulated dynamically, the list is protected with a raw_spinlock which must be acquired when handling UNDEFINED exceptions or when manipulating the list of handlers. This locking is unfortunate as it serialises handling of UNDEFINED exceptions, and requires RCU to be enabled for lockdep, requiring the use of RCU_NONIDLE() in the resume path of cpu_suspend() since commit: a2c42bbabbe260b7 ("arm64: spectre: Prevent lockdep splat on v4 mitigation enable path") The list of UNDEFINED handlers largely consists of handlers for exceptions taken from EL0, and the only handler for exceptions taken from EL1 handles `MSR SSBS, #imm` on CPUs which feature PSTATE.SSBS but lack the corresponding MSR (Immediate) instruction. Other than this we never expect to take an UNDEFINED exception from EL1 in normal operation. This patch reworks do_el0_undef() to invoke the EL1 SSBS handler directly, relegating call_undef_hook() to only handle EL0 UNDEFs. This removes redundant work to iterate the list for EL1 UNDEFs, and removes the need for locking, permitting EL1 UNDEFs to be handled in parallel without contention. The RCU_NONIDLE() call in cpu_suspend() will be removed in a subsequent patch, as there are other potential issues with the use of instrumentable code and RCU in the CPU suspend code. I've tested this by forcing the detection of SSBS on a CPU that doesn't have it, and verifying that the try_emulate_el1_ssbs() callback is invoked. Signed-off-by: Mark Rutland <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morse <[email protected]> Cc: Joey Gouly <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Will Deacon <[email protected]>
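A sketch of the direct EL1 hook (hedged: the MSR-immediate decode is reduced to an assumed helper):

    bool try_emulate_el1_ssbs(struct pt_regs *regs, u32 instr)
    {
    	if (!insn_is_msr_ssbs_imm(instr))	/* assumed decode helper */
    		return false;

    	if (instr & BIT(PSTATE_Imm_shift))
    		regs->pstate |= PSR_SSBS_BIT;
    	else
    		regs->pstate &= ~PSR_SSBS_BIT;

    	arm64_skip_faulting_instruction(regs, 4);
    	return true;
    }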
2022-11-15  arm64: split EL0/EL1 UNDEF handlers  (Mark Rutland, 1 file, -1/+2)
In general, exceptions taken from EL1 need to be handled separately from exceptions taken from EL0, as the logic to handle the two cases can be significantly divergent, and exceptions taken from EL1 typically have more stringent requirements on locking and instrumentation.

Subsequent patches will rework the way EL1 UNDEFs are handled in order to address longstanding soundness issues with instrumentation and RCU. In preparation for that rework, this patch splits the existing do_undefinstr() handler into separate do_el0_undef() and do_el1_undef() handlers.

Prior to this patch, do_undefinstr() was marked with NOKPROBE_SYMBOL(), preventing instrumentation via kprobes. However, do_undefinstr() invokes other code which can be instrumented, and:

* For UNDEFINED exceptions taken from EL0, there is no risk of recursion within kprobes. Therefore it is safe for do_el0_undef to be instrumented with kprobes, and it does not need to be marked with NOKPROBE_SYMBOL().

* For UNDEFINED exceptions taken from EL1, either:

  (a) The exception has been taken when manipulating SSBS; these cases are limited and do not occur within code that can be invoked recursively via kprobes. Hence, in these cases instrumentation with kprobes is benign.

  (b) The exception has been taken for an unknown reason, as other than manipulating SSBS we do not expect to take UNDEFINED exceptions from EL1. Any handling of these exceptions is best-effort.

  ... and in either case, marking do_el1_undef() with NOKPROBE_SYMBOL() isn't sufficient to prevent recursion via kprobes as functions it calls (including die()) are instrumentable via kprobes.

Hence, it's not worthwhile to mark do_el1_undef() with NOKPROBE_SYMBOL(). The same applies to do_el1_bti() and do_el1_fpac(), so their NOKPROBE_SYMBOL() annotations are also removed.

Aside from the new instrumentability, there should be no functional change as a result of this patch.

Signed-off-by: Mark Rutland <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: James Morse <[email protected]>
Cc: Joey Gouly <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Will Deacon <[email protected]>