aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2019-12-06arm64: Validate tagged addresses in access_ok() called from kernel threadsCatalin Marinas1-1/+6
__range_ok(), invoked from access_ok(), clears the tag of the user address only if CONFIG_ARM64_TAGGED_ADDR_ABI is enabled and the thread opted in to the relaxed ABI. The latter sets the TIF_TAGGED_ADDR thread flag. In the case of asynchronous I/O (e.g. io_submit()), the access_ok() may be called from a kernel thread. Since kernel threads don't have TIF_TAGGED_ADDR set, access_ok() will fail for valid tagged user addresses. Example from the ffs_user_copy_worker() thread: use_mm(io_data->mm); ret = ffs_copy_to_iter(io_data->buf, ret, &io_data->data); unuse_mm(io_data->mm); Relax the __range_ok() check to always untag the user address if called in the context of a kernel thread. The user pointers would have already been checked via aio_setup_rw() -> import_{single_range,iovec}() at the time of the asynchronous I/O request. Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI") Cc: <[email protected]> # 5.4.x- Cc: Will Deacon <[email protected]> Reported-by: Evgenii Stepanov <[email protected]> Tested-by: Evgenii Stepanov <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-12-04arm64: mm: Fix column alignment for UXN in kernel_page_tablesMark Brown1-0/+1
UXN is the only individual PTE bit other than the PTE_ATTRINDX_MASK ones which doesn't have both a set and a clear value provided, meaning that the columns in the table won't all be aligned. The PTE_ATTRINDX_MASK values are all both mutually exclusive and longer so are listed last to make a single final column for those values. Ensure everything is aligned by providing a clear value for UXN. Acked-by: Mark Rutland <[email protected]> Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-12-04arm64: insn: consistently handle exit textMark Rutland3-4/+22
A kernel built with KASAN && FTRACE_WITH_REGS && !MODULES, produces a boot-time splat in the bowels of ftrace: | [ 0.000000] ftrace: allocating 32281 entries in 127 pages | [ 0.000000] ------------[ cut here ]------------ | [ 0.000000] WARNING: CPU: 0 PID: 0 at kernel/trace/ftrace.c:2019 ftrace_bug+0x27c/0x328 | [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.4.0-rc3-00008-g7f08ae53a7e3 #13 | [ 0.000000] Hardware name: linux,dummy-virt (DT) | [ 0.000000] pstate: 60000085 (nZCv daIf -PAN -UAO) | [ 0.000000] pc : ftrace_bug+0x27c/0x328 | [ 0.000000] lr : ftrace_init+0x640/0x6cc | [ 0.000000] sp : ffffa000120e7e00 | [ 0.000000] x29: ffffa000120e7e00 x28: ffff00006ac01b10 | [ 0.000000] x27: ffff00006ac898c0 x26: dfffa00000000000 | [ 0.000000] x25: ffffa000120ef290 x24: ffffa0001216df40 | [ 0.000000] x23: 000000000000018d x22: ffffa0001244c700 | [ 0.000000] x21: ffffa00011bf393c x20: ffff00006ac898c0 | [ 0.000000] x19: 00000000ffffffff x18: 0000000000001584 | [ 0.000000] x17: 0000000000001540 x16: 0000000000000007 | [ 0.000000] x15: 0000000000000000 x14: ffffa00010432770 | [ 0.000000] x13: ffff940002483519 x12: 1ffff40002483518 | [ 0.000000] x11: 1ffff40002483518 x10: ffff940002483518 | [ 0.000000] x9 : dfffa00000000000 x8 : 0000000000000001 | [ 0.000000] x7 : ffff940002483519 x6 : ffffa0001241a8c0 | [ 0.000000] x5 : ffff940002483519 x4 : ffff940002483519 | [ 0.000000] x3 : ffffa00011780870 x2 : 0000000000000001 | [ 0.000000] x1 : 1fffe0000d591318 x0 : 0000000000000000 | [ 0.000000] Call trace: | [ 0.000000] ftrace_bug+0x27c/0x328 | [ 0.000000] ftrace_init+0x640/0x6cc | [ 0.000000] start_kernel+0x27c/0x654 | [ 0.000000] random: get_random_bytes called from print_oops_end_marker+0x30/0x60 with crng_init=0 | [ 0.000000] ---[ end trace 0000000000000000 ]--- | [ 0.000000] ftrace faulted on writing | [ 0.000000] [<ffffa00011bf393c>] _GLOBAL__sub_D_65535_0___tracepoint_initcall_level+0x4/0x28 | [ 0.000000] Initializing ftrace call sites | [ 0.000000] ftrace record flags: 0 | [ 0.000000] (0) | [ 0.000000] expected tramp: ffffa000100b3344 This is due to an unfortunate combination of several factors. Building with KASAN results in the compiler generating anonymous functions to register/unregister global variables against the shadow memory. These functions are placed in .text.startup/.text.exit, and given mangled names like _GLOBAL__sub_{I,D}_65535_0_$OTHER_SYMBOL. The kernel linker script places these in .init.text and .exit.text respectively, which are both discarded at runtime as part of initmem. Building with FTRACE_WITH_REGS uses -fpatchable-function-entry=2, which also instruments KASAN's anonymous functions. When these are discarded with the rest of initmem, ftrace removes dangling references to these call sites. Building without MODULES implicitly disables STRICT_MODULE_RWX, and causes arm64's patch_map() function to treat any !core_kernel_text() symbol as something that can be modified in-place. As core_kernel_text() is only true for .text and .init.text, with the latter depending on system_state < SYSTEM_RUNNING, we'll treat .exit.text as something that can be patched in-place. However, .exit.text is mapped read-only. Hence in this configuration the ftrace init code blows up while trying to patch one of the functions generated by KASAN. We could try to filter out the call sites in .exit.text rather than initializing them, but this would be inconsistent with how we handle .init.text, and requires hooking into core bits of ftrace. The behaviour of patch_map() is also inconsistent today, so instead let's clean that up and have it consistently handle .exit.text. This patch teaches patch_map() to handle .exit.text at init time, preventing the boot-time splat above. The flow of patch_map() is reworked to make the logic clearer and minimize redundant conditionality. Fixes: 3b23e4991fb66f6d ("arm64: implement ftrace with regs") Signed-off-by: Mark Rutland <[email protected]> Cc: Amit Daniel Kachhap <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Torsten Duwe <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-12-04arm64: mm: Fix initialisation of DMA zones on non-NUMA systemsWill Deacon1-14/+11
John reports that the recently merged commit 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") breaks the boot on his DB845C board: | Booting Linux on physical CPU 0x0000000000 [0x517f803c] | Linux version 5.4.0-mainline-10675-g957a03b9e38f | Machine model: Thundercomm Dragonboard 845c | [...] | Built 1 zonelists, mobility grouping on. Total pages: -188245 | Kernel command line: earlycon | firmware_class.path=/vendor/firmware/ androidboot.hardware=db845c | init=/init androidboot.boot_devices=soc/1d84000.ufshc | printk.devkmsg=on buildvariant=userdebug root=/dev/sda2 | androidboot.bootdevice=1d84000.ufshc androidboot.serialno=c4e1189c | androidboot.baseband=sda | msm_drm.dsi_display0=dsi_lt9611_1080_video_display: | androidboot.slot_suffix=_a skip_initramfs rootwait ro init=/init | | <hangs indefinitely here> This is because, when CONFIG_NUMA=n, zone_sizes_init() fails to handle memblocks that fall entirely within the ZONE_DMA region and erroneously ends up trying to add a negatively-sized region into the following ZONE_DMA32, which is later interpreted as a large unsigned region by the core MM code. Rework the non-NUMA implementation of zone_sizes_init() so that the start address of the memblock being processed is adjusted according to the end of the previous zone, which is then range-checked before updating the hole information of subsequent zones. Cc: Nicolas Saenz Julienne <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Bjorn Andersson <[email protected]> Link: https://lore.kernel.org/lkml/CALAqxLVVcsmFrDKLRGRq7GewcW405yTOxG=KR3csVzQ6bXutkA@mail.gmail.com Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") Reported-by: John Stultz <[email protected]> Tested-by: John Stultz <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-14arm64: Kconfig: add a choice for endiannessAnders Roxell1-1/+17
When building allmodconfig KCONFIG_ALLCONFIG=$(pwd)/arch/arm64/configs/defconfig CONFIG_CPU_BIG_ENDIAN gets enabled. Which tends not to be what most people want. Another concern that has come up is that ACPI isn't built for an allmodconfig kernel today since that also depends on !CPU_BIG_ENDIAN. Rework so that we introduce a 'choice' and default the choice to CPU_LITTLE_ENDIAN. That means that when we build an allmodconfig kernel it will default to CPU_LITTLE_ENDIAN that most people tends to want. Reviewed-by: John Garry <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Anders Roxell <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-11kselftest: arm64: fix spelling mistake "contiguos" -> "contiguous"Colin Ian King1-1/+1
There is a spelling mistake in an error message literal string. Fix it. Fixes: f96bf4340316 ("kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-11arm64: Kconfig: make CMDLINE_FORCE depend on CMDLINEAnders Roxell1-0/+1
When building allmodconfig KCONFIG_ALLCONFIG=$(pwd)/arch/arm64/configs/defconfig CONFIG_CMDLINE_FORCE gets enabled. Which forces the user to pass the full cmdline to CONFIG_CMDLINE="...". Rework so that CONFIG_CMDLINE_FORCE gets set only if CONFIG_CMDLINE is set to something except an empty string. Suggested-by: John Garry <[email protected]> Acked-by: Will Deacon <[email protected]> Signed-off-by: Anders Roxell <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-11MAINTAINERS: Add arm64 selftests to the ARM64 PORT entryCatalin Marinas1-0/+1
Since these are tests specific to the arm64 architecture, it makes sense for the arm64 maintainers to gatekeep the corresponding changes. Cc: Shuah Khan <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08Merge branches 'for-next/elf-hwcap-docs', 'for-next/smccc-conduit-cleanup', ↵Catalin Marinas75-240/+2083
'for-next/zone-dma', 'for-next/relax-icc_pmr_el1-sync', 'for-next/double-page-fault', 'for-next/misc', 'for-next/kselftest-arm64-signal' and 'for-next/kaslr-diagnostics' into for-next/core * for-next/elf-hwcap-docs: : Update the arm64 ELF HWCAP documentation docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s] docs/arm64: cpu-feature-registers: Documents missing visible fields docs/arm64: elf_hwcaps: Document HWCAP_SB docs/arm64: elf_hwcaps: sort the HWCAP{, 2} documentation by ascending value * for-next/smccc-conduit-cleanup: : SMC calling convention conduit clean-up firmware: arm_sdei: use common SMCCC_CONDUIT_* firmware/psci: use common SMCCC_CONDUIT_* arm: spectre-v2: use arm_smccc_1_1_get_conduit() arm64: errata: use arm_smccc_1_1_get_conduit() arm/arm64: smccc/psci: add arm_smccc_1_1_get_conduit() * for-next/zone-dma: : Reintroduction of ZONE_DMA for Raspberry Pi 4 support arm64: mm: reserve CMA and crashkernel in ZONE_DMA32 dma/direct: turn ARCH_ZONE_DMA_BITS into a variable arm64: Make arm64_dma32_phys_limit static arm64: mm: Fix unused variable warning in zone_sizes_init mm: refresh ZONE_DMA and ZONE_DMA32 comments in 'enum zone_type' arm64: use both ZONE_DMA and ZONE_DMA32 arm64: rename variables used to calculate ZONE_DMA32's size arm64: mm: use arm64_dma_phys_limit instead of calling max_zone_dma_phys() * for-next/relax-icc_pmr_el1-sync: : Relax ICC_PMR_EL1 (GICv3) accesses when ICC_CTLR_EL1.PMHE is clear arm64: Document ICC_CTLR_EL3.PMHE setting requirements arm64: Relax ICC_PMR_EL1 accesses when ICC_CTLR_EL1.PMHE is clear * for-next/double-page-fault: : Avoid a double page fault in __copy_from_user_inatomic() if hw does not support auto Access Flag mm: fix double page fault on arm64 if PTE_AF is cleared x86/mm: implement arch_faults_on_old_pte() stub on x86 arm64: mm: implement arch_faults_on_old_pte() on arm64 arm64: cpufeature: introduce helper cpu_has_hw_af() * for-next/misc: : Various fixes and clean-ups arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelist arm64: mm: Remove MAX_USER_VA_BITS definition arm64: mm: simplify the page end calculation in __create_pgd_mapping() arm64: print additional fault message when executing non-exec memory arm64: psci: Reduce the waiting time for cpu_psci_cpu_kill() arm64: pgtable: Correct typo in comment arm64: docs: cpu-feature-registers: Document ID_AA64PFR1_EL1 arm64: cpufeature: Fix typos in comment arm64/mm: Poison initmem while freeing with free_reserved_area() arm64: use generic free_initrd_mem() arm64: simplify syscall wrapper ifdeffery * for-next/kselftest-arm64-signal: : arm64-specific kselftest support with signal-related test-cases kselftest: arm64: fake_sigreturn_misaligned_sp kselftest: arm64: fake_sigreturn_bad_size kselftest: arm64: fake_sigreturn_duplicated_fpsimd kselftest: arm64: fake_sigreturn_missing_fpsimd kselftest: arm64: fake_sigreturn_bad_size_for_magic0 kselftest: arm64: fake_sigreturn_bad_magic kselftest: arm64: add helper get_current_context kselftest: arm64: extend test_init functionalities kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht] kselftest: arm64: mangle_pstate_invalid_daif_bits kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utils kselftest: arm64: extend toplevel skeleton Makefile * for-next/kaslr-diagnostics: : Provide diagnostics on boot for KASLR arm64: kaslr: Check command line before looking for a seed arm64: kaslr: Announce KASLR status on boot
2019-11-08arm64: kaslr: Check command line before looking for a seedMark Brown1-5/+6
Now that we print diagnostics at boot the reason why we do not initialise KASLR matters. Currently we check for a seed before we check if the user has explicitly disabled KASLR on the command line which will result in misleading diagnostics so reverse the order of those checks. We still parse the seed from the DT early so that if the user has both provided a seed and disabled KASLR on the command line we still mask the seed on the command line. Signed-off-by: Mark Brown <[email protected]> Acked-by: Mark Rutland <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08arm64: kaslr: Announce KASLR status on bootMark Brown1-3/+38
Currently the KASLR code is silent at boot unless it forces on KPTI in which case a message will be printed for that. This can lead to users incorrectly believing their system has the feature enabled when it in fact does not, and if they notice the problem the lack of any diagnostics makes it harder to understand the problem. Add an initcall which prints a message showing the status of KASLR during boot to make the status clear. This is particularly useful in cases where we don't have a seed. It seems to be a relatively common error for system integrators and administrators to enable KASLR in their configuration but not provide the seed at runtime, often due to seed provisioning breaking at some later point after it is initially enabled and verified. Signed-off-by: Mark Brown <[email protected]> Acked-by: Mark Rutland <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_misaligned_spCristian Marussi1-0/+37
Add a simple fake_sigreturn testcase which places a valid sigframe on a non-16 bytes aligned SP. Expects a SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_bad_sizeCristian Marussi1-0/+77
Add a simple fake_sigreturn testcase which builds a ucontext_t with a badly sized header that causes a overrun in the __reserved area and place it onto the stack. Expects a SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_duplicated_fpsimdCristian Marussi1-0/+50
Add a simple fake_sigreturn testcase which builds a ucontext_t with an anomalous additional fpsimd_context and place it onto the stack. Expects a SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_missing_fpsimdCristian Marussi1-0/+50
Add a simple fake_sigreturn testcase which builds a ucontext_t without the required fpsimd_context and place it onto the stack. Expects a SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_bad_size_for_magic0Cristian Marussi1-0/+46
Add a simple fake_sigreturn testcase which builds a ucontext_t with a badly sized terminator record and place it onto the stack. Expects a SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: fake_sigreturn_bad_magicCristian Marussi6-1/+169
Add a simple fake_sigreturn testcase which builds a ucontext_t with a bad magic header and place it onto the stack. Expects a SIGSEGV on test PASS. Introduce a common utility assembly trampoline function to invoke a sigreturn while placing the provided sigframe at wanted alignment and also an helper to make space when needed inside the sigframe reserved area. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: add helper get_current_contextCristian Marussi3-1/+134
Introduce a new common utility function get_current_context() which can be used to grab a ucontext without the help of libc, and also to detect if such ucontext has been successfully used by placing it on the stack as a fake sigframe. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: extend test_init functionalitiesCristian Marussi4-13/+31
Extend signal testing framework to allow the definition of a custom per test initialization function to be run at the end of the common test_init after test setup phase has completed and before test-run routine. This custom per-test initialization function also enables the test writer to decide on its own when forcibly skip the test itself using standard KSFT mechanism. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: mangle_pstate_invalid_mode_el[123][ht]Cristian Marussi7-0/+118
Add 6 simple mangle testcases that mess with the ucontext_t from within the signal handler, trying to toggle PSTATE mode bits to trick the system into switching to EL1/EL2/EL3 using both SP_EL0(t) and SP_ELx(h). Expects SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: mangle_pstate_invalid_daif_bitsCristian Marussi1-0/+35
Add a simple mangle testcase which messes with the ucontext_t from within the signal handler, trying to set PSTATE DAIF bits to an invalid value (masking everything). Expects SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: mangle_pstate_invalid_compat_toggle and common utilsCristian Marussi11-1/+800
Add some arm64/signal specific boilerplate and utility code to help further testcases' development. Introduce also one simple testcase mangle_pstate_invalid_compat_toggle and some related helpers: it is a simple mangle testcase which messes with the ucontext_t from within the signal handler, trying to toggle PSTATE state bits to switch the system between 32bit/64bit execution state. Expects SIGSEGV on test PASS. Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08kselftest: arm64: extend toplevel skeleton MakefileCristian Marussi7-5/+92
Modify KSFT arm64 toplevel Makefile to maintain arm64 kselftests organized by subsystem, keeping them into distinct subdirectories under arm64 custom KSFT directory: tools/testing/selftests/arm64/ Add to such toplevel Makefile a mechanism to guess the effective location of Kernel headers as installed by KSFT framework. Fit existing arm64 tags kselftest into this new schema moving them into their own subdirectory (arm64/tags). Reviewed-by: Dave Martin <[email protected]> Signed-off-by: Cristian Marussi <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-08Merge branch 'for-next/perf' into for-next/coreCatalin Marinas15-227/+458
- Support for additional PMU topologies on HiSilicon platforms - Support for CCN-512 interconnect PMU - Support for AXI ID filtering in the IMX8 DDR PMU - Support for the CCPI2 uncore PMU in ThunderX2 - Driver cleanup to use devm_platform_ioremap_resource() * for-next/perf: drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platform perf/imx_ddr: Dump AXI ID filter info to userspace docs/perf: Add AXI ID filter capabilities information perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlus perf/imx_ddr: Add enhanced AXI ID filter support bindings: perf: imx-ddr: Add new compatible string docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk arm64: perf: Simplify the ARMv8 PMUv3 event attributes drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver. Documentation: perf: Update documentation for ThunderX2 PMU uncore driver Documentation: Add documentation for CCN-512 DTS binding perf: arm-ccn: Enable stats for CCN-512 interconnect perf/smmuv3: use devm_platform_ioremap_resource() to simplify code perf/arm-cci: use devm_platform_ioremap_resource() to simplify code perf/arm-ccn: use devm_platform_ioremap_resource() to simplify code perf: xgene: use devm_platform_ioremap_resource() to simplify code perf: hisi: use devm_platform_ioremap_resource() to simplify code
2019-11-07drivers/perf: hisi: update the sccl_id/ccl_id for certain HiSilicon platformShaokun Zhang1-8/+18
For some HiSilicon platform, the originally designed SCCL_ID and CCL_ID are not satisfied with much rich topology when the MT is set, so we extend the SCCL_ID to MPIDR[aff3] and CCL_ID to MPIDR[aff2]. Let's update this for HiSilicon uncore PMU driver. Cc: John Garry <[email protected]> Cc: Hanjun Guo <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-07Merge branch 'arm64/ftrace-with-regs' of ↵Catalin Marinas18-90/+362
git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux into for-next/core FTRACE_WITH_REGS support for arm64. * 'arm64/ftrace-with-regs' of git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux: arm64: ftrace: minimize ifdeffery arm64: implement ftrace with regs arm64: asm-offsets: add S_FP arm64: insn: add encoder for MOV (register) arm64: module/ftrace: intialize PLT at load time arm64: module: rework special section handling module/ftrace: handle patchable-function-entry ftrace: add ftrace_init_nop()
2019-11-07arm64: mm: reserve CMA and crashkernel in ZONE_DMA32Nicolas Saenz Julienne1-2/+2
With the introduction of ZONE_DMA in arm64 we moved the default CMA and crashkernel reservation into that area. This caused a regression on big machines that need big CMA and crashkernel reservations. Note that ZONE_DMA is only 1GB big. Restore the previous behavior as the wide majority of devices are OK with reserving these in ZONE_DMA32. The ones that need them in ZONE_DMA will configure it explicitly. Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32") Reported-by: Qian Cai <[email protected]> Signed-off-by: Nicolas Saenz Julienne <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-06arm64: ftrace: minimize ifdefferyMark Rutland1-10/+8
Now that we no longer refer to mod->arch.ftrace_trampolines in the body of ftrace_make_call(), we can use IS_ENABLED() rather than ifdeffery, and make the code easier to follow. Likewise in ftrace_make_nop(). Let's do so. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06arm64: implement ftrace with regsTorsten Duwe8-25/+252
This patch implements FTRACE_WITH_REGS for arm64, which allows a traced function's arguments (and some other registers) to be captured into a struct pt_regs, allowing these to be inspected and/or modified. This is a building block for live-patching, where a function's arguments may be forwarded to another function. This is also necessary to enable ftrace and in-kernel pointer authentication at the same time, as it allows the LR value to be captured and adjusted prior to signing. Using GCC's -fpatchable-function-entry=N option, we can have the compiler insert a configurable number of NOPs between the function entry point and the usual prologue. This also ensures functions are AAPCS compliant (e.g. disabling inter-procedural register allocation). For example, with -fpatchable-function-entry=2, GCC 8.1.0 compiles the following: | unsigned long bar(void); | | unsigned long foo(void) | { | return bar() + 1; | } ... to: | <foo>: | nop | nop | stp x29, x30, [sp, #-16]! | mov x29, sp | bl 0 <bar> | add x0, x0, #0x1 | ldp x29, x30, [sp], #16 | ret This patch builds the kernel with -fpatchable-function-entry=2, prefixing each function with two NOPs. To trace a function, we replace these NOPs with a sequence that saves the LR into a GPR, then calls an ftrace entry assembly function which saves this and other relevant registers: | mov x9, x30 | bl <ftrace-entry> Since patchable functions are AAPCS compliant (and the kernel does not use x18 as a platform register), x9-x18 can be safely clobbered in the patched sequence and the ftrace entry code. There are now two ftrace entry functions, ftrace_regs_entry (which saves all GPRs), and ftrace_entry (which saves the bare minimum). A PLT is allocated for each within modules. Signed-off-by: Torsten Duwe <[email protected]> [Mark: rework asm, comments, PLTs, initialization, commit message] Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: AKASHI Takahiro <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Julien Thierry <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06arm64: asm-offsets: add S_FPMark Rutland1-0/+1
So that assembly code can more easily manipulate the FP (x29) within a pt_regs, add an S_FP asm-offsets definition. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06arm64: insn: add encoder for MOV (register)Mark Rutland2-0/+16
For FTRACE_WITH_REGS, we're going to want to generate a MOV (register) instruction as part of the callsite intialization. As MOV (register) is an alias for ORR (shifted register), we can generate this with aarch64_insn_gen_logical_shifted_reg(), but it's somewhat verbose and difficult to read in-context. Add a aarch64_insn_gen_move_reg() wrapper for this case so that we can write callers in a more straightforward way. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06arm64: module/ftrace: intialize PLT at load timeMark Rutland2-52/+35
Currently we lazily-initialize a module's ftrace PLT at runtime when we install the first ftrace call. To do so we have to apply a number of sanity checks, transiently mark the module text as RW, and perform an IPI as part of handling Neoverse-N1 erratum #1542419. We only expect the ftrace trampoline to point at ftrace_caller() (AKA FTRACE_ADDR), so let's simplify all of this by intializing the PLT at module load time, before the module loader marks the module RO and performs the intial I-cache maintenance for the module. Thus we can rely on the module having been correctly intialized, and can simplify the runtime work necessary to install an ftrace call in a module. This will also allow for the removal of module_disable_ro(). Tested by forcing ftrace_make_call() to use the module PLT, and then loading up a module after setting up ftrace with: | echo ":mod:<module-name>" > set_ftrace_filter; | echo function > current_tracer; | modprobe <module-name> Since FTRACE_ADDR is only defined when CONFIG_DYNAMIC_FTRACE is selected, we wrap its use along with most of module_init_ftrace_plt() with ifdeffery rather than using IS_ENABLED(). Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morse <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06arm64: module: rework special section handlingMark Rutland1-9/+26
When we load a module, we have to perform some special work for a couple of named sections. To do this, we iterate over all of the module's sections, and perform work for each section we recognize. To make it easier to handle the unexpected absence of a section, and to make the section-specific logic easer to read, let's factor the section search into a helper. Similar is already done in the core module loader, and other architectures (and ideally we'd unify these in future). If we expect a module to have an ftrace trampoline section, but it doesn't have one, we'll now reject loading the module. When ARM64_MODULE_PLTS is selected, any correctly built module should have one (and this is assumed by arm64's ftrace PLT code) and the absence of such a section implies something has gone wrong at build time. Subsequent patches will make use of the new helper. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]>
2019-11-06module/ftrace: handle patchable-function-entryMark Rutland6-19/+20
When using patchable-function-entry, the compiler will record the callsites into a section named "__patchable_function_entries" rather than "__mcount_loc". Let's abstract this difference behind a new FTRACE_CALLSITE_SECTION, so that architectures don't have to handle this explicitly (e.g. with custom module linker scripts). As parisc currently handles this explicitly, it is fixed up accordingly, with its custom linker script removed. Since FTRACE_CALLSITE_SECTION is only defined when DYNAMIC_FTRACE is selected, the parisc module loading code is updated to only use the definition in that case. When DYNAMIC_FTRACE is not selected, modules shouldn't have this section, so this removes some redundant work in that case. To make sure that this is keep up-to-date for modules and the main kernel, a comment is added to vmlinux.lds.h, with the existing ifdeffery simplified for legibility. I built parisc generic-{32,64}bit_defconfig with DYNAMIC_FTRACE enabled, and verified that the section made it into the .ko files for modules. Signed-off-by: Mark Rutland <[email protected]> Acked-by: Helge Deller <[email protected]> Acked-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Sven Schnelle <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James E.J. Bottomley <[email protected]> Cc: Jessica Yu <[email protected]> Cc: [email protected]
2019-11-06ftrace: add ftrace_init_nop()Mark Rutland2-6/+35
Architectures may need to perform special initialization of ftrace callsites, and today they do so by special-casing ftrace_make_nop() when the expected branch address is MCOUNT_ADDR. In some cases (e.g. for patchable-function-entry), we don't have an mcount-like symbol and don't want a synthetic MCOUNT_ADDR, but we may need to perform some initialization of callsites. To make it possible to separate initialization from runtime modification, and to handle cases without an mcount-like symbol, this patch adds an optional ftrace_init_nop() function that architectures can implement, which does not pass a branch address. Where an architecture does not provide ftrace_init_nop(), we will fall back to the existing behaviour of calling ftrace_make_nop() with MCOUNT_ADDR. At the same time, ftrace_code_disable() is renamed to ftrace_nop_initialize() to make it clearer that it is intended to intialize a callsite into a disabled state, and is not for disabling a callsite that has been runtime enabled. The kerneldoc description of rec arguments is updated to cover non-mcount callsites. Signed-off-by: Mark Rutland <[email protected]> Reviewed-by: Amit Daniel Kachhap <[email protected]> Reviewed-by: Ard Biesheuvel <[email protected]> Reviewed-by: Miroslav Benes <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Reviewed-by: Torsten Duwe <[email protected]> Tested-by: Amit Daniel Kachhap <[email protected]> Tested-by: Sven Schnelle <[email protected]> Tested-by: Torsten Duwe <[email protected]> Cc: Ingo Molnar <[email protected]>
2019-11-06arm64: kpti: Add NVIDIA's Carmel core to the KPTI whitelistRich Wiley1-0/+1
NVIDIA Carmel CPUs don't implement ID_AA64PFR0_EL1.CSV3 but aren't susceptible to Meltdown, so add Carmel to kpti_safe_list[]. Signed-off-by: Rich Wiley <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-06arm64: mm: Remove MAX_USER_VA_BITS definitionBhupesh Sharma3-8/+2
commit 9b31cf493ffa ("arm64: mm: Introduce MAX_USER_VA_BITS definition") introduced the MAX_USER_VA_BITS definition, which was used to support the arm64 mm use-cases where the user-space could use 52-bit virtual addresses whereas the kernel-space would still could a maximum of 48-bit virtual addressing. But, now with commit b6d00d47e81a ("arm64: mm: Introduce 52-bit Kernel VAs"), we removed the 52-bit user/48-bit kernel kconfig option and hence there is no longer any scenario where user VA != kernel VA size (even with CONFIG_ARM64_FORCE_52BIT enabled, the same is true). Hence we can do away with the MAX_USER_VA_BITS macro as it is equal to VA_BITS (maximum VA space size) in all possible use-cases. Note that even though the 'vabits_actual' value would be 48 for arm64 hardware which don't support LVA-8.2 extension (even when CONFIG_ARM64_VA_BITS_52 is enabled), VA_BITS would still be set to a value 52. Hence this change would be safe in all possible VA address space combinations. Cc: James Morse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Steve Capper <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: [email protected] Cc: [email protected] Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Bhupesh Sharma <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-06arm64: mm: simplify the page end calculation in __create_pgd_mapping()Masahiro Yamada1-3/+2
Calculate the page-aligned end address more simply. The local variable, "length" is unneeded. Reviewed-by: Mark Rutland <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-04perf/imx_ddr: Dump AXI ID filter info to userspaceJoakim Zhang1-0/+56
caps/filter indicates whether HW supports AXI ID filter or not. caps/enhanced_filter indicates whether HW supports enhanced AXI ID filter or not. Users can check filter features from userspace with these attributions. Suggested-by: Will Deacon <[email protected]> Signed-off-by: Joakim Zhang <[email protected]> [will: reworked cap switch to be less error-prone] Signed-off-by: Will Deacon <[email protected]>
2019-11-04docs/perf: Add AXI ID filter capabilities informationJoakim Zhang1-4/+8
Add capabilities information for AXI ID filter. Signed-off-by: Joakim Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-04perf/imx_ddr: Add driver for DDR PMU in i.MX8MPlusJoakim Zhang1-0/+5
Add driver for DDR PMU in i.MX8MPlus. Signed-off-by: Joakim Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-04perf/imx_ddr: Add enhanced AXI ID filter supportJoakim Zhang1-21/+42
With DDR_CAP_AXI_ID_FILTER quirk, indicating HW supports AXI ID filter which only can get bursts from DDR transaction, i.e. DDR read/write requests. This patch add DDR_CAP_AXI_ID_ENHANCED_FILTER quirk, indicating HW supports AXI ID filter which can get bursts and bytes from DDR transaction at the same time. We hope PMU always return bytes in the driver due to it is more meaningful for users. Signed-off-by: Joakim Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-04bindings: perf: imx-ddr: Add new compatible stringJoakim Zhang1-0/+1
Add new compatible string for i.MX8MPlus DDR PMU core. Signed-off-by: Joakim Zhang <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-11-04docs/perf: Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirkJoakim Zhang1-0/+5
Add explanation for DDR_CAP_AXI_ID_FILTER_ENHANCED quirk. Signed-off-by: Joakim Zhang <[email protected]> [will: Simplified wording] Signed-off-by: Will Deacon <[email protected]>
2019-11-01docs/arm64: cpu-feature-registers: Rewrite bitfields that don't follow [e, s]Julien Grall1-2/+2
Commit "docs/arm64: cpu-feature-registers: Documents missing visible fields" added bitfields following the convention [s, e]. However, the documentation is following [s, e] and so does the Arm ARM. Rewrite the bitfields to match the format [s, e]. Fixes: a8613e7070e7 ("docs/arm64: cpu-feature-registers: Documents missing visible fields") Signed-off-by: Julien Grall <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-11-01arm64: perf: Simplify the ARMv8 PMUv3 event attributesShaokun Zhang1-125/+66
For each PMU event, there is a ARMV8_EVENT_ATTR(xx, XX) and &armv8_event_attr_xx.attr.attr. Let's redefine the ARMV8_EVENT_ATTR to simplify the armv8_pmuv3_event_attrs. Cc: Will Deacon <[email protected]> Cc: Mark Rutland <[email protected]> Signed-off-by: Shaokun Zhang <[email protected]> [will: Dropped unnecessary array syntax] Signed-off-by: Will Deacon <[email protected]>
2019-11-01dma/direct: turn ARCH_ZONE_DMA_BITS into a variableNicolas Saenz Julienne8-27/+31
Some architectures, notably ARM, are interested in tweaking this depending on their runtime DMA addressing limitations. Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Nicolas Saenz Julienne <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-10-29arm64: print additional fault message when executing non-exec memoryXiang Zheng1-0/+2
When attempting to executing non-executable memory, the fault message shows: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 This may confuse someone, so add a new fault message for instruction abort. Acked-by: Will Deacon <[email protected]> Signed-off-by: Xiang Zheng <[email protected]> Signed-off-by: Catalin Marinas <[email protected]>
2019-10-29drivers/perf: Add CCPI2 PMU support in ThunderX2 UNCORE driver.Ganapatrao Prabhakerrao Kulkarni1-33/+234
CCPI2 is a low-latency high-bandwidth serial interface for inter socket connectivity of ThunderX2 processors. CCPI2 PMU supports up to 8 counters per socket. Counters are independently programmable to different events and can be started and stopped individually. The CCPI2 counters are 64-bit and do not overflow in normal operation. Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <[email protected]> Signed-off-by: Will Deacon <[email protected]>
2019-10-29Documentation: perf: Update documentation for ThunderX2 PMU uncore driverGanapatrao Prabhakerrao Kulkarni1-9/+11
Add documentation for Cavium Coherent Processor Interconnect (CCPI2) PMU. Signed-off-by: Ganapatrao Prabhakerrao Kulkarni <[email protected]> Signed-off-by: Will Deacon <[email protected]>