blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2024-07-11	Merge branches 'for-next/cpufeature', 'for-next/misc', 'for-next/kselftest', ↵	Catalin Marinas	48	-455/+805
	'for-next/mte', 'for-next/errata', 'for-next/acpi', 'for-next/gic-v3-pmr' and 'for-next/doc', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf: add missing MODULE_DESCRIPTION() macros perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h perf: arm_v6/7_pmu: Drop non-DT probe support perf/arm: Move 32-bit PMU drivers to drivers/perf/ perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold perf: imx_perf: add support for i.MX95 platform perf: imx_perf: fix counter start and config sequence perf: imx_perf: refactor driver for imx93 perf: imx_perf: let the driver manage the counter usage rather the user perf: imx_perf: add macro definitions for parsing config attr dt-bindings: perf: fsl-imx-ddr: Add i.MX95 compatible perf: pmuv3: Add new Cortex and Neoverse PMUs dt-bindings: arm: pmu: Add new Cortex and Neoverse cores perf/arm-cmn: Enable support for tertiary match group perf/arm-cmn: Decouple wp_config registers from filter group number * for-next/cpufeature: : Various cpufeature infrastructure patches arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1 KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1 arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register * for-next/misc: : Miscellaneous patches arm64: smp: Fix missing IPI statistics arm64: Cleanup __cpu_set_tcr_t0sz() arm64/mm: Stop using ESR_ELx_FSC_TYPE during fault arm64: Kconfig: fix typo in __builtin_return_adddress ARM64: reloc_test: add missing MODULE_DESCRIPTION() macro arm64: implement raw_smp_processor_id() using thread_info arm64/arch_timer: include <linux/percpu.h> * for-next/kselftest: : arm64 kselftest updates selftests: arm64: tags: remove the result script selftests: arm64: tags_test: conform test to TAP output kselftest/arm64: Fix a couple of spelling mistakes kselftest/arm64: Fix redundancy of a testcase kselftest/arm64: Include kernel mode NEON in fp-stress * for-next/mte: : MTE updates arm64: mte: Make mte_check_tfsr_() conditional on KASAN instead of MTE for-next/errata: : Arm CPU errata workarounds arm64: errata: Expand speculative SSBS workaround arm64: errata: Unify speculative SSBS errata logic arm64: cputype: Add Cortex-X925 definitions arm64: cputype: Add Cortex-A720 definitions arm64: cputype: Add Cortex-X3 definitions * for-next/acpi: : arm64 ACPI patches ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64 ACPI / amba: Drop unnecessary check for registered amba_dummy_clk arm64: FFH: Move ACPI specific code into drivers/acpi/arm64/ arm64: cpuidle: Move ACPI specific code into drivers/acpi/arm64/ ACPI: arm64: Sort entries alphabetically * for-next/gic-v3-pmr: : arm64: irqchip/gic-v3: Use compiletime constant PMR values arm64: irqchip/gic-v3: Select priorities at boot time irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier irqchip/gic-v3: Make distributor priorities variables irqchip/gic-common: Remove sync_access callback wordpart.h: Add REPEAT_BYTE_U32() * for-next/doc: : arm64 documentation updates Documentation: arm64: Update memory.rst for TBI
2024-07-11	selftests: arm64: tags: remove the result script	Muhammad Usama Anjum	2	-13/+0
	The run_tags_test.sh script is used to run tags_test and print out if the test succeeded or failed. As tags_test has been TAP conformed, this script is unneeded and hence can be removed. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240602132502.4186771-2-usama.anjum@collabora.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-11	selftests: arm64: tags_test: conform test to TAP output	Muhammad Usama Anjum	1	-4/+6
	Conform the layout, informational and status messages to TAP. No functional change is intended other than the layout of output messages. Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240602132502.4186771-1-usama.anjum@collabora.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-10	perf: add missing MODULE_DESCRIPTION() macros	Jeff Johnson	8	-0/+8
	With ARCH=x86, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o Add the missing invocation of the MODULE_DESCRIPTION() macro to all files which have a MODULE_LICENSE(). This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although it did not produce a warning with the x86 allmodconfig configuration, may cause this warning with arm64 configurations. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-08	arm64: smp: Fix missing IPI statistics	Jinjie Ruan	1	-2/+4
	commit 83cfac95c018 ("genirq: Allow interrupts to be excluded from /proc/interrupts") is to avoid IPIs appear twice in /proc/interrupts. But the commit 331a1b3a836c ("arm64: smp: Add arch support for backtrace using pseudo-NMI") and commit 2f5cd0c7ffde("arm64: kgdb: Implement kgdb_roundup_cpus() to enable pseudo-NMI roundup") set CPU_BACKTRACE and KGDB_ROUNDUP IPIs "IRQ_HIDDEN" flag but not show them in arch_show_interrupts(), which cause the interrupt kstat_irqs accounting is missing in display. Before this patch, CPU_BACKTRACE and KGDB_ROUNDUP IPIs are missing: / # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 11: 466 600 309 332 GICv3 27 Level arch_timer 13: 24 0 0 0 GICv3 33 Level uart-pl011 15: 64 0 0 0 GICv3 78 Edge virtio0 16: 0 0 0 0 GICv3 79 Edge virtio1 17: 0 0 0 0 GICv3 34 Level rtc-pl031 18: 3 3 3 3 GICv3 23 Level arm-pmu 19: 0 0 0 0 9030000.pl061 3 Edge GPIO Key Poweroff IPI0: 7 14 9 26 Rescheduling interrupts IPI1: 354 93 233 255 Function call interrupts IPI2: 0 0 0 0 CPU stop interrupts IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts IPI4: 0 0 0 0 Timer broadcast interrupts IPI5: 1 0 0 0 IRQ work interrupts Err: 0 After this pacth, CPU_BACKTRACE and KGDB_ROUNDUP IPIs are displayed: / # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 11: 393 281 532 449 GICv3 27 Level arch_timer 13: 15 0 0 0 GICv3 33 Level uart-pl011 15: 64 0 0 0 GICv3 78 Edge virtio0 16: 0 0 0 0 GICv3 79 Edge virtio1 17: 0 0 0 0 GICv3 34 Level rtc-pl031 18: 2 2 2 2 GICv3 23 Level arm-pmu 19: 0 0 0 0 9030000.pl061 3 Edge GPIO Key Poweroff IPI0: 11 19 4 23 Rescheduling interrupts IPI1: 279 347 222 72 Function call interrupts IPI2: 0 0 0 0 CPU stop interrupts IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts IPI4: 0 0 0 0 Timer broadcast interrupts IPI5: 1 0 0 1 IRQ work interrupts IPI6: 0 0 0 0 CPU backtrace interrupts IPI7: 0 0 0 0 KGDB roundup interrupts Err: 0 Fixes: 331a1b3a836c ("arm64: smp: Add arch support for backtrace using pseudo-NMI") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Doug Anderson <dianders@chromium.org> Acked-by: Will Deacon <will@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240620063600.573559-1-ruanjinjie@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04	ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64	Liu Wei	2	-4/+24
	For varying privacy and security reasons, sometimes we would like to completely silence the _serial_ console, and only enable it when needed. But there are many existing systems that depend on this _serial_ console, so add acpi=nospcr to disable console in ACPI SPCR table as default _serial_ console. Signed-off-by: Liu Wei <liuwei09@cestc.cn> Suggested-by: Prarit Bhargava <prarit@redhat.com> Suggested-by: Will Deacon <will@kernel.org> Suggested-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Prarit Bhargava <prarit@redhat.com> Link: https://lore.kernel.org/r/20240625030504.58025-1-liuwei09@cestc.cn Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04	Documentation: arm64: Update memory.rst for TBI	Kevin Brodsky	1	-22/+20
	Most of memory.rst was written very early, at a time where TBI (Top Byte Ignore) was not enabled. Nowadays TBI0 is always enabled, and TBI1 may be enabled, depending on the kernel configuration. This means that VA bits 63:56 cannot generally be assumed to have any particular value. Regardless of TBI, TTBRx selection is done based on bit 55; update memory.rst accordingly. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240702091349.356008-1-kevin.brodsky@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04	arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1	Anshuman Khandual	3	-8/+4
	This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields from ID_AA64PFR0_EL1 sysreg definition. Finally let's drop off both these custom macros as they are now redundant. Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240613102710.3295108-3-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-04	KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1	Anshuman Khandual	3	-8/+8
	This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields from ID_AA64PFR0_EL1 sysreg definition. Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: kvmarm@lists.linux.dev Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240613102710.3295108-2-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-07-03	perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h	Rob Herring (Arm)	4	-4/+3
	The arm64 asm/arm_pmuv3.h depends on defines from linux/perf/arm_pmuv3.h. Rather than depend on include order, follow the usual pattern of "linux" headers including "asm" headers of the same name. With this change, the include of linux/kvm_host.h is problematic due to circular includes: In file included from ../arch/arm64/include/asm/arm_pmuv3.h:9, from ../include/linux/perf/arm_pmuv3.h:312, from ../include/kvm/arm_pmu.h:11, from ../arch/arm64/include/asm/kvm_host.h:38, from ../arch/arm64/mm/init.c:41: ../include/linux/kvm_host.h:383:30: error: field 'arch' has incomplete type Switching to asm/kvm_host.h solves the issue. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-5-c9784b4f4065@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03	perf: arm_v6/7_pmu: Drop non-DT probe support	Rob Herring (Arm)	2	-25/+2
	There are no non-DT based PMU users for v6 or v7, so drop the custom non-DT probe table. Unfortunately XScale still needs non-DT probing. Note that this drops support for arm1156 PMU, but there are no arm1156 based systems supported in the kernel. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-4-c9784b4f4065@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03	perf/arm: Move 32-bit PMU drivers to drivers/perf/	Rob Herring (Arm)	6	-11/+15
	It is preferred to put drivers under drivers/ rather than under arch/. The PMU drivers also depend on arm_pmu.c, so it's better to place them all together. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-3-c9784b4f4065@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03	perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check	Rob Herring (Arm)	1	-1/+1
	The IS_ENABLED(CONFIG_ARM64) check for threshold support is unnecessary. The purpose is to not enable thresholds on arm32, but if threshold is non-zero, the check against threshold_max() just above here will have errored out because threshold_max() is always 0 on arm32. Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Mark rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-2-c9784b4f4065@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-07-03	perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold	Rob Herring (Arm)	1	-2/+8
	If the user has requested a counting threshold for the CPU cycles event, then the fixed cycle counter can't be assigned as it lacks threshold support. Currently, the thresholds will work or not randomly depending on which counter the event is assigned. While using thresholds for CPU cycles doesn't make much sense, it can be useful for testing purposes. Fixes: 816c26754447 ("arm64: perf: Add support for event counting threshold") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-1-c9784b4f4065@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: imx_perf: add support for i.MX95 platform	Xu Yang	1	-3/+86
	i.MX95 has a DDR PMU which is almostly same as i.MX93, it now supports read beat and write beat filter capabilities. This will add support for i.MX95 and enhance the driver to support specific filter handling for it. Usage: For read beat: ~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt2,axi_mask=ID_MASK,axi_id=ID/ ~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt1,axi_mask=ID_MASK,axi_id=ID/ ~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt0,axi_mask=ID_MASK,axi_id=ID/ eg: For edma2: perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt0,axi_mask=0x00f,axi_id=0x00c/ For write beat: ~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_wr_beat_filt,axi_mask=ID_MASK,axi_id=ID/ eg: For edma2: perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_wr_beat_filt,axi_mask=0x00f,axi_id=0x00c/ Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-6-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: imx_perf: fix counter start and config sequence	Xu Yang	1	-3/+3
	In current driver, the counter will start firstly and then be configured. This sequence is not correct for AXI filter events since the correct AXI_MASK and AXI_ID are not set yet. Then the results may be inaccurate. Reviewed-by: Frank Li <Frank.Li@nxp.com> Fixes: 55691f99d417 ("drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver") cc: stable@vger.kernel.org Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-5-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: imx_perf: refactor driver for imx93	Xu Yang	1	-38/+68
	This driver is initinally used to support imx93 Soc and now it's time to add support for imx95 Soc. However, some macro definitions and events are different on these two Socs. For preparing imx95 supports, this will refactor driver for imx93. Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-4-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: imx_perf: let the driver manage the counter usage rather the user	Xu Yang	1	-68/+100
	In current design, the user of perf app needs to input counter ID to count events. However, this is not user-friendly since the user needs to lookup the map table to find the counter. Instead of letting the user to input the counter, let this driver to manage the counters in this patch. This will be implemented by: 1. allocate counter 0 for cycle event. 2. find unused counter from 1-10 for reference events. 3. allocate specific counter for counter-specific events. In this patch, counter attr will be kept for back-compatible but all the value passed down by counter=<n> will be ignored. To mark counter-specific events, counter ID will be encoded into perf_pmu_events_attr.id. Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-3-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: imx_perf: add macro definitions for parsing config attr	Xu Yang	1	-4/+9
	The user can set event and counter in cmdline and the driver need to parse it using 'config' attr value. This will add macro definitions to avoid hard-code in driver. Reviewed-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-2-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	dt-bindings: perf: fsl-imx-ddr: Add i.MX95 compatible	Xu Yang	1	-0/+3
	i.MX95 has a DDR pmu. This will add a compatible for it. Acked-by: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20240529080358.703784-1-xu.yang_2@nxp.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf: pmuv3: Add new Cortex and Neoverse PMUs	Andre Przywara	1	-0/+12
	Add support for the Arm Cortex-A725, Cortex-X925, Neoverse N3, Neoverse V2, Neoverse V3 and Neoverse V3AE. This just adds the names and connects them with their DT compatible strings. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Link: https://lore.kernel.org/r/20240628145612.1291329-3-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	dt-bindings: arm: pmu: Add new Cortex and Neoverse cores	Andre Przywara	1	-0/+6
	Add compatible strings for the PMUs in the Arm Cortex-A725, Cortex-X925, Neoverse N3, Neoverse V2, Neoverse V3 and Neoverse V3AE cores. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Link: https://lore.kernel.org/r/20240628145612.1291329-2-andre.przywara@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf/arm-cmn: Enable support for tertiary match group	Ilkka Koskinen	1	-6/+13
	Add support for tertiary match group. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20240618005056.3092866-3-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2024-07-01	perf/arm-cmn: Decouple wp_config registers from filter group number	Ilkka Koskinen	1	-17/+80
	Previously, wp_config0/2 registers were used for primary match group and wp_config1/3 registers for secondary match group. In order to support tertiary match group, this patch decouples the registers and the groups. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20240618005056.3092866-2-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2024-06-24	arm64: Cleanup __cpu_set_tcr_t0sz()	Seongsu Park	1	-2/+2
	The T0SZ field of TCR_EL1 occupies bits 0-5 of the register and encode the virtual address space translated by TTBR0_EL1. When updating the field, for example because we are switching to/from the idmap page-table, __cpu_set_tcr_t0sz() erroneously treats its 't0sz' argument as unshifted, resulting in harmless but confusing double shifts by 0 in the code. Co-developed-by: Leem ChaeHoon <infinite.run@gmail.com> Signed-off-by: Leem ChaeHoon <infinite.run@gmail.com> Signed-off-by: Seongsu Park <sgsu.park@samsung.com> Link: https://lore.kernel.org/r/20240523122146.144483-1-sgsu.park@samsung.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-24	ACPI / amba: Drop unnecessary check for registered amba_dummy_clk	Youwan Wang	1	-5/+1
	amba_register_dummy_clk() is called only once from acpi_amba_init() and acpi_amba_init() itself is called once during the initialisation. amba_dummy_clk can't be initialised before this in any other code path and hence the check for already registered amba_dummy_clk is not necessary. Drop the same. Signed-off-by: Youwan Wang <youwan@nfschina.com> Acked-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240624023101.369633-1-youwan@nfschina.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-24	arm64: irqchip/gic-v3: Select priorities at boot time	Mark Rutland	5	-106/+97
	The distributor and PMR/RPR can present different views of the interrupt priority space dependent upon the values of GICD_CTLR.DS and SCR_EL3.FIQ. Currently we treat the distributor's view of the priority space as canonical, and when the two differ we change the way we handle values in the PMR/RPR, using the `gic_nonsecure_priorities` static key to decide what to do. This approach works, but it's sub-optimal. When using pseudo-NMI we manipulate the distributor rarely, and we manipulate the PMR/RPR registers very frequently in code spread out throughout the kernel (e.g. local_irq_{save,restore}()). It would be nicer if we could use fixed values for the PMR/RPR, and dynamically choose the values programmed into the distributor. This patch changes the GICv3 driver and arm64 code accordingly. PMR values are chosen at compile time, and the GICv3 driver determines the appropriate values to program into the distributor at boot time. This removes the need for the `gic_nonsecure_priorities` static key and results in smaller and better generated code for saving/restoring the irqflags. Before this patch, local_irq_disable() compiles to: \| 0000000000000000 <outlined_local_irq_disable>: \| 0: d503201f nop \| 4: d50343df msr daifset, #0x3 \| 8: d65f03c0 ret \| c: d503201f nop \| 10: d2800c00 mov x0, #0x60 // #96 \| 14: d5184600 msr icc_pmr_el1, x0 \| 18: d65f03c0 ret \| 1c: d2801400 mov x0, #0xa0 // #160 \| 20: 17fffffd b 14 <outlined_local_irq_disable+0x14> After this patch, local_irq_disable() compiles to: \| 0000000000000000 <outlined_local_irq_disable>: \| 0: d503201f nop \| 4: d50343df msr daifset, #0x3 \| 8: d65f03c0 ret \| c: d2801800 mov x0, #0xc0 // #192 \| 10: d5184600 msr icc_pmr_el1, x0 \| 14: d65f03c0 ret ... with 3 fewer instructions per call. For defconfig + CONFIG_PSEUDO_NMI=y, this results in a minor saving of ~4K of text, and will make it easier to make further improvements to the way we manipulate irqflags and DAIF bits. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24	irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier	Mark Rutland	1	-54/+63
	In subsequent patches the GICv3 driver will choose the regular interrupt priority at boot time, dependent on the configuration of GICD_CTRL.DS and SCR_EL3.FIQ. This will need to be chosen before we configure the distributor with default prioirities for all the interrupts, which happens before we currently detect these in gic_cpu_sys_reg_init(). Add a new gic_prio_init() function to detect these earlier and log them to the console so that any problems can be debugged more easily. This also allows the uniformity checks in gic_cpu_sys_reg_init() to be simplified, as we can compare directly with the boot CPU values which were recorded earlier. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24	irqchip/gic-v3: Make distributor priorities variables	Mark Rutland	8	-30/+32
	In subsequent patches the GICv3 driver will choose the regular interrupt priority at boot time. In preparation for using dynamic priorities, place the priorities in variables and update the code to pass these as parameters. Users of GICD_INT_DEF_PRI_X4 are modified to replicate the priority byte using REPEAT_BYTE_U32(). There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24	irqchip/gic-common: Remove sync_access callback	Mark Rutland	5	-26/+16
	The gic_configure_irq(), gic_dist_config(), and gic_cpu_config() functions each take an optional "sync_access" callback, but in almost all cases this is not used. The only user is the GICv3 driver's gic_cpu_init() function, which uses gic_redist_wait_for_rwp() as the "sync_access" callback for gic_cpu_config(). It would be simpler and clearer to remove the callback and have the GICv3 driver call gic_redist_wait_for_rwp() explicitly after gic_cpu_config(). Remove the "sync_access" callback, and call gic_redist_wait_for_rwp() explicitly in the GICv3 driver. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24	wordpart.h: Add REPEAT_BYTE_U32()	Mark Rutland	1	-0/+8
	In some cases it's necessary to replicate a byte across a u32 value, for which REPEAT_BYTE() would be helpful. Currently this requires explicit masking of the result to avoid sparse warnings, as e.g. (u32)REPEAT_BYTE(0xa0)) ... will result in a warning: cast truncates bits from constant value (a0a0a0a0a0a0a0a0 becomes a0a0a0a0) Add a new REPEAT_BYTE_U32() which does the necessary masking internally, so that we don't need to duplicate this for every usage. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240617111841.2529370-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2024-06-24	arm64/mm: Stop using ESR_ELx_FSC_TYPE during fault	Anshuman Khandual	1	-6/+27
	Fault status codes at page table level 0, 1, 2 and 3 for access, permission and translation faults are architecturally organized in a way, that masking out ESR_ELx_FSC_TYPE, fetches Level 0 status code for the respective fault. Helpers like esr_fsc_is_[translation\|permission\|access_flag]_fault() mask out ESR_ELx_FSC_TYPE before comparing against corresponding Level 0 status code as the kernel does not yet care about the page table level, where in the fault really occurred previously. This scheme is starting to crumble after FEAT_LPA2 when level -1 got added. Fault status code for translation fault at level -1 is 0x2B which does not follow ESR_ELx_FSC_TYPE, requiring esr_fsc_is_translation_fault() changes. This changes above helpers to compare against individual fault status code values for each page table level and stop using ESR_ELx_FSC_TYPE, which is losing its value as a common mask. Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20240618034703.3622510-1-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-21	arm64: Kconfig: fix typo in __builtin_return_adddress	Mike Rapoport (IBM)	1	-1/+1
	Comment about BUILTIN_RETURN_ADDRESS_STRIPS_PAC spells __builtin_return_adddress with a triple 'd', fix it. Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org> Link: https://lore.kernel.org/r/20240620174038.3721466-1-rppt@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13	ARM64: reloc_test: add missing MODULE_DESCRIPTION() macro	Jeff Johnson	1	-0/+1
	With ARCH=arm64, make allmodconfig && make W=1 C=1 reports: WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/kernel/arm64-reloc-test.o Add the missing invocation of the MODULE_DESCRIPTION() macro. Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Link: https://lore.kernel.org/r/20240612-md-arch-arm64-kernel-v1-1-1fafe8d11df3@quicinc.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13	kselftest/arm64: Fix a couple of spelling mistakes	Colin Ian King	1	-2/+2
	There are two spelling mistakes in some error messages. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240613073429.1797451-1-colin.i.king@gmail.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13	arm64: FFH: Move ACPI specific code into drivers/acpi/arm64/	Sudeep Holla	3	-105/+108
	The ACPI FFH Opregion code can be moved out of arm64 arch code as it just uses SMCCC. Move all the ACPI FFH Opregion code into drivers/acpi/arm64/ffh.c Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-4-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13	arm64: cpuidle: Move ACPI specific code into drivers/acpi/arm64/	Sudeep Holla	3	-5/+1
	The ACPI cpuidle LPI FFH code can be moved out of arm64 arch code as it just uses SMCCC. Move all the ACPI cpuidle LPI FFH code into drivers/acpi/arm64/cpuidle.c Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-3-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-13	ACPI: arm64: Sort entries alphabetically	Sudeep Holla	1	-2/+2
	Sort the entries in the Makefile alphabetically. Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Acked-by: Hanjun Guo <guohanjun@huawei.com> Link: https://lore.kernel.org/r/20240605131458.3341095-2-sudeep.holla@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: errata: Expand speculative SSBS workaround	Mark Rutland	3	-2/+30
	A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS special-purpose register does not affect subsequent speculative instructions, permitting speculative store bypassing for a window of time. We worked around this for Cortex-X4 and Neoverse-V3, in commit: 7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417") ... as per their Software Developer Errata Notice (SDEN) documents: * Cortex-X4 SDEN v8.0, erratum 3194386: https://developer.arm.com/documentation/SDEN-2432808/0800/ * Neoverse-V3 SDEN v6.0, erratum 3312417: https://developer.arm.com/documentation/SDEN-2891958/0600/ Since then, similar errata have been published for a number of other Arm Ltd CPUs, for which the mitigation is the same. This is described in their respective SDEN documents: * Cortex-A710 SDEN v19.0, errataum 3324338 https://developer.arm.com/documentation/SDEN-1775101/1900/?lang=en * Cortex-A720 SDEN v11.0, erratum 3456091 https://developer.arm.com/documentation/SDEN-2439421/1100/?lang=en * Cortex-X2 SDEN v19.0, erratum 3324338 https://developer.arm.com/documentation/SDEN-1775100/1900/?lang=en * Cortex-X3 SDEN v14.0, erratum 3324335 https://developer.arm.com/documentation/SDEN-2055130/1400/?lang=en * Cortex-X925 SDEN v8.0, erratum 3324334 https://developer.arm.com/documentation/109108/800/?lang=en * Neoverse-N2 SDEN v17.0, erratum 3324339 https://developer.arm.com/documentation/SDEN-1982442/1700/?lang=en * Neoverse-V2 SDEN v9.0, erratum 3324336 https://developer.arm.com/documentation/SDEN-2332927/900/?lang=en Note that due to shared design lineage, some CPUs share the same erratum number. Add these to the existing mitigation under CONFIG_ARM64_ERRATUM_3194386. As listing all of the erratum IDs in the runtime description would be unwieldy, this is reduced to: "SSBS not fully self-synchronizing" ... matching the description of the errata in all of the SDENs. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: errata: Unify speculative SSBS errata logic	Mark Rutland	5	-34/+9
	Cortex-X4 erratum 3194386 and Neoverse-V3 erratum 3312417 are identical, with duplicate Kconfig text and some unsightly ifdeffery. While we try to share code behind CONFIG_ARM64_WORKAROUND_SPECULATIVE_SSBS, having separate options results in a fair amount of boilerplate code, and this will only get worse as we expand the set of affected CPUs. To reduce this boilerplate, unify the two behind a common Kconfig option. This removes the duplicate text and Kconfig logic, and removes the need for the intermediate ARM64_WORKAROUND_SPECULATIVE_SSBS option. The set of affected CPUs is described as a list so that this can easily be extended. I've used ARM64_ERRATUM_3194386 (matching the Neoverse-V3 erratum ID) as the common option, matching the way we use ARM64_ERRATUM_1319367 to cover Cortex-A57 erratum 1319537 and Cortex-A72 erratum 1319367. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <wilL@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: cputype: Add Cortex-X925 definitions	Mark Rutland	1	-0/+2
	Add cputype definitions for Cortex-X925. These will be used for errata detection in subsequent patches. These values can be found in Table A-285 ("MIDR_EL1 bit descriptions") in issue 0001-05 of the Cortex-X925 TRM, which can be found at: https://developer.arm.com/documentation/102807/0001/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: cputype: Add Cortex-A720 definitions	Mark Rutland	1	-0/+2
	Add cputype definitions for Cortex-A720. These will be used for errata detection in subsequent patches. These values can be found in Table A-186 ("MIDR_EL1 bit descriptions") in issue 0002-05 of the Cortex-A720 TRM, which can be found at: https://developer.arm.com/documentation/102530/0002/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: cputype: Add Cortex-X3 definitions	Mark Rutland	1	-0/+2
	Add cputype definitions for Cortex-X3. These will be used for errata detection in subsequent patches. These values can be found in Table A-263 ("MIDR_EL1 bit descriptions") in issue 07 of the Cortex-X3 TRM, which can be found at: https://developer.arm.com/documentation/101593/0102/?lang=en Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20240603111812.1514101-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: mte: Make mte_check_tfsr_*() conditional on KASAN instead of MTE	Peter Collingbourne	1	-2/+2
	The check in mte_check_tfsr_el1() is only necessary if HW tag based KASAN is enabled. However, we were also executing the check if MTE is enabled and KASAN is enabled at build time but disabled at runtime. This turned out to cause a measurable increase in power consumption on a specific microarchitecture after enabling MTE. Moreover, on the same system, an increase in invalid syscall latency (as measured by [1]) of around 20-30% (depending on the cluster) was observed after enabling MTE; this almost entirely goes away after removing this check. Therefore, make the check conditional on whether KASAN is enabled rather than on whether MTE is enabled. [1] https://lore.kernel.org/all/CAMn1gO4MwRV8bmFJ_SeY5tsYNPn2ZP56LjAhafygjFaKuu5ouw@mail.gmail.com/ Signed-off-by: Peter Collingbourne <pcc@google.com> Link: https://linux-review.googlesource.com/id/I22d98d1483dd400a95595946552b769a5a1ad7bd Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20240528225131.3577704-1-pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	kselftest/arm64: Fix redundancy of a testcase	Dev Jain	1	-1/+1
	Currently, we are writing the same value as we read into the TLS register, hence we cannot confirm update of the register, making the testcase "verify_tpidr_one" redundant. Fix this. Signed-off-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240605115448.640717-1-dev.jain@arm.com [catalin.marinas@arm.com: remove the increment style change] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	kselftest/arm64: Include kernel mode NEON in fp-stress	Mark Brown	4	-9/+343
	Currently fp-stress only covers userspace use of floating point, it does not cover any kernel mode uses. Since currently kernel mode floating point usage can't be preempted and there are explicit preemption points in the existing implementations this isn't so important for fp-stress but when we readd preemption it will be good to try to exercise it. When the arm64 accelerated crypto operations are implemented we can relatively straightforwardly trigger kernel mode floating point usage by using the crypto userspace API to hash data, using the splice() support in an effort to minimise copying. We use /proc/crypto to check which accelerated implementations are available, picking the first symmetric hash we find. We run the kernel mode test unconditionally, replacing the second copy of the FPSIMD testcase for systems with FPSIMD only. If we don't think there are any suitable kernel mode implementations we fall back to running another copy of fpsimd-stress. There are a number issues with this approach, we don't actually verify that we are using an accelerated (or even CPU) implementation of the algorithm being tested and even with attempting to use splice() to minimise copying there are sizing limits on how much data gets spliced at once. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240521-arm64-fp-stress-kernel-v1-1-e38f107baad4@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64: implement raw_smp_processor_id() using thread_info	Puranjay Mohan	2	-19/+3
	Historically, arm64 implemented raw_smp_processor_id() as a read of current_thread_info()->cpu. This changed when arm64 moved thread_info into task struct, as at the time CONFIG_THREAD_INFO_IN_TASK made core code use thread_struct::cpu for the cpu number, and due to header dependencies prevented using this in raw_smp_processor_id(). As a workaround, we moved to using a percpu variable in commit: 57c82954e77fa12c ("arm64: make cpu number a percpu variable") Since then, thread_info::cpu was reintroduced, and core code was made to use this in commits: 001430c1910df65a ("arm64: add CPU field to struct thread_info") bcf9033e5449bdca ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y") Consequently it is possible to use current_thread_info()->cpu again. This decreases the number of emitted instructions like in the following example: Dump of assembler code for function bpf_get_smp_processor_id: 0xffff8000802cd608 <+0>: nop 0xffff8000802cd60c <+4>: nop 0xffff8000802cd610 <+8>: adrp x0, 0xffff800082138000 0xffff8000802cd614 <+12>: mrs x1, tpidr_el1 0xffff8000802cd618 <+16>: add x0, x0, #0x8 0xffff8000802cd61c <+20>: ldrsw x0, [x0, x1] 0xffff8000802cd620 <+24>: ret After this patch: Dump of assembler code for function bpf_get_smp_processor_id: 0xffff8000802c9130 <+0>: nop 0xffff8000802c9134 <+4>: nop 0xffff8000802c9138 <+8>: mrs x0, sp_el0 0xffff8000802c913c <+12>: ldr w0, [x0, #24] 0xffff8000802c9140 <+16>: ret A microbenchmark[1] was built to measure the performance improvement provided by this change. It calls the following function given number of times and finds the runtime overhead: static noinline int get_cpu_id(void) { return smp_processor_id(); } Run the benchmark like: modprobe smp_processor_id nr_function_calls=1000000000 +--------------------------+------------------------+ \| \| Number of Calls \| Time taken \| +--------+-----------------+------------------------+ \| Before \| 1000000000 \| 1602888401ns \| +--------+-----------------+------------------------+ \| After \| 1000000000 \| 1206212658ns \| +--------+-----------------+------------------------+ \| Difference (decrease) \| 396675743ns (24.74%) \| +---------------------------------------------------+ Remove the percpu variable cpu_number as it is used only in set_smp_ipi_range() as a dummy variable to be passed to ipi_handler(). Use irq_stat in place of cpu_number here like arm32. [1] https://github.com/puranjaymohan/linux/commit/77d3fdd Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20240503171847.68267-2-puranjay@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64/arch_timer: include <linux/percpu.h>	Puranjay Mohan	1	-1/+1
	arch_timer.h includes linux/smp.h since the commit: 6acc71ccac7187fc ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs") It was included to use DEFINE_PER_CPU(), etc. But It should have included <linux/percpu.h> rather than <linux/smp.h>. It worked because smp.h includes percpu.h. The next commit will remove percpu.h from smp.h and it will break this usage. Explicitly include percpu.h and remove smp.h Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240503171847.68267-1-puranjay@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-12	arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register	Nianyao Tang	1	-0/+1
	Enable ECBHB bits in ID_AA64MMFR1 register as per ARM DDI 0487K.a specification. When guest OS read ID_AA64MMFR1_EL1, kvm emulate this reg using ftr_id_aa64mmfr1 and always return ID_AA64MMFR1_EL1.ECBHB=0 to guest. It results in guest syscall jump to tramp ventry, which is not needed in implementation with ID_AA64MMFR1_EL1.ECBHB=1. Let's make the guest syscall process the same as the host. Signed-off-by: Nianyao Tang <tangnianyao@huawei.com> Link: https://lore.kernel.org/r/20240611122049.2758600-1-tangnianyao@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-06-09	Linux 6.10-rc3	Linus Torvalds	1	-1/+1