Age | Commit message (Collapse) | Author | Files | Lines |
|
'for-next/mte', 'for-next/errata', 'for-next/acpi', 'for-next/gic-v3-pmr' and 'for-next/doc', remote-tracking branch 'arm64/for-next/perf' into for-next/core
* arm64/for-next/perf:
perf: add missing MODULE_DESCRIPTION() macros
perf: arm_pmuv3: Include asm/arm_pmuv3.h from linux/perf/arm_pmuv3.h
perf: arm_v6/7_pmu: Drop non-DT probe support
perf/arm: Move 32-bit PMU drivers to drivers/perf/
perf: arm_pmuv3: Drop unnecessary IS_ENABLED(CONFIG_ARM64) check
perf: arm_pmuv3: Avoid assigning fixed cycle counter with threshold
perf: imx_perf: add support for i.MX95 platform
perf: imx_perf: fix counter start and config sequence
perf: imx_perf: refactor driver for imx93
perf: imx_perf: let the driver manage the counter usage rather the user
perf: imx_perf: add macro definitions for parsing config attr
dt-bindings: perf: fsl-imx-ddr: Add i.MX95 compatible
perf: pmuv3: Add new Cortex and Neoverse PMUs
dt-bindings: arm: pmu: Add new Cortex and Neoverse cores
perf/arm-cmn: Enable support for tertiary match group
perf/arm-cmn: Decouple wp_config registers from filter group number
* for-next/cpufeature:
: Various cpufeature infrastructure patches
arm64/cpufeature: Replace custom macros with fields from ID_AA64PFR0_EL1
KVM: arm64: Replace custom macros with fields from ID_AA64PFR0_EL1
arm64/cpufeatures/kvm: Add ARMv8.9 FEAT_ECBHB bits in ID_AA64MMFR1 register
* for-next/misc:
: Miscellaneous patches
arm64: smp: Fix missing IPI statistics
arm64: Cleanup __cpu_set_tcr_t0sz()
arm64/mm: Stop using ESR_ELx_FSC_TYPE during fault
arm64: Kconfig: fix typo in __builtin_return_adddress
ARM64: reloc_test: add missing MODULE_DESCRIPTION() macro
arm64: implement raw_smp_processor_id() using thread_info
arm64/arch_timer: include <linux/percpu.h>
* for-next/kselftest:
: arm64 kselftest updates
selftests: arm64: tags: remove the result script
selftests: arm64: tags_test: conform test to TAP output
kselftest/arm64: Fix a couple of spelling mistakes
kselftest/arm64: Fix redundancy of a testcase
kselftest/arm64: Include kernel mode NEON in fp-stress
* for-next/mte:
: MTE updates
arm64: mte: Make mte_check_tfsr_*() conditional on KASAN instead of MTE
* for-next/errata:
: Arm CPU errata workarounds
arm64: errata: Expand speculative SSBS workaround
arm64: errata: Unify speculative SSBS errata logic
arm64: cputype: Add Cortex-X925 definitions
arm64: cputype: Add Cortex-A720 definitions
arm64: cputype: Add Cortex-X3 definitions
* for-next/acpi:
: arm64 ACPI patches
ACPI: Add acpi=nospcr to disable ACPI SPCR as default console on ARM64
ACPI / amba: Drop unnecessary check for registered amba_dummy_clk
arm64: FFH: Move ACPI specific code into drivers/acpi/arm64/
arm64: cpuidle: Move ACPI specific code into drivers/acpi/arm64/
ACPI: arm64: Sort entries alphabetically
* for-next/gic-v3-pmr:
: arm64: irqchip/gic-v3: Use compiletime constant PMR values
arm64: irqchip/gic-v3: Select priorities at boot time
irqchip/gic-v3: Detect GICD_CTRL.DS and SCR_EL3.FIQ earlier
irqchip/gic-v3: Make distributor priorities variables
irqchip/gic-common: Remove sync_access callback
wordpart.h: Add REPEAT_BYTE_U32()
* for-next/doc:
: arm64 documentation updates
Documentation: arm64: Update memory.rst for TBI
|
|
The run_tags_test.sh script is used to run tags_test and print out if
the test succeeded or failed. As tags_test has been TAP conformed, this
script is unneeded and hence can be removed.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240602132502.4186771-2-usama.anjum@collabora.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Conform the layout, informational and status messages to TAP. No
functional change is intended other than the layout of output messages.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Link: https://lore.kernel.org/r/20240602132502.4186771-1-usama.anjum@collabora.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
With ARCH=x86, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm-ccn.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/fsl_imx8_ddr_perf.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/marvell_cn10k_ddr_pmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/arm_cspmu_module.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/nvidia_cspmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/arm_cspmu/ampere_cspmu.o
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/perf/cxl_pmu.o
Add the missing invocation of the MODULE_DESCRIPTION() macro to all
files which have a MODULE_LICENSE().
This includes drivers/perf/hisilicon/hisi_uncore_pmu.c which, although
it did not produce a warning with the x86 allmodconfig configuration,
may cause this warning with arm64 configurations.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240709-md-drivers-perf-v3-1-513275b75ed0@quicinc.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
commit 83cfac95c018 ("genirq: Allow interrupts to be excluded from
/proc/interrupts") is to avoid IPIs appear twice in /proc/interrupts.
But the commit 331a1b3a836c ("arm64: smp: Add arch support for backtrace
using pseudo-NMI") and commit 2f5cd0c7ffde("arm64: kgdb: Implement
kgdb_roundup_cpus() to enable pseudo-NMI roundup") set CPU_BACKTRACE and
KGDB_ROUNDUP IPIs "IRQ_HIDDEN" flag but not show them in
arch_show_interrupts(), which cause the interrupt kstat_irqs accounting
is missing in display.
Before this patch, CPU_BACKTRACE and KGDB_ROUNDUP IPIs are missing:
/ # cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
11: 466 600 309 332 GICv3 27 Level arch_timer
13: 24 0 0 0 GICv3 33 Level uart-pl011
15: 64 0 0 0 GICv3 78 Edge virtio0
16: 0 0 0 0 GICv3 79 Edge virtio1
17: 0 0 0 0 GICv3 34 Level rtc-pl031
18: 3 3 3 3 GICv3 23 Level arm-pmu
19: 0 0 0 0 9030000.pl061 3 Edge GPIO Key Poweroff
IPI0: 7 14 9 26 Rescheduling interrupts
IPI1: 354 93 233 255 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 1 0 0 0 IRQ work interrupts
Err: 0
After this pacth, CPU_BACKTRACE and KGDB_ROUNDUP IPIs are displayed:
/ # cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3
11: 393 281 532 449 GICv3 27 Level arch_timer
13: 15 0 0 0 GICv3 33 Level uart-pl011
15: 64 0 0 0 GICv3 78 Edge virtio0
16: 0 0 0 0 GICv3 79 Edge virtio1
17: 0 0 0 0 GICv3 34 Level rtc-pl031
18: 2 2 2 2 GICv3 23 Level arm-pmu
19: 0 0 0 0 9030000.pl061 3 Edge GPIO Key Poweroff
IPI0: 11 19 4 23 Rescheduling interrupts
IPI1: 279 347 222 72 Function call interrupts
IPI2: 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 CPU stop (for crash dump) interrupts
IPI4: 0 0 0 0 Timer broadcast interrupts
IPI5: 1 0 0 1 IRQ work interrupts
IPI6: 0 0 0 0 CPU backtrace interrupts
IPI7: 0 0 0 0 KGDB roundup interrupts
Err: 0
Fixes: 331a1b3a836c ("arm64: smp: Add arch support for backtrace using pseudo-NMI")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Doug Anderson <dianders@chromium.org>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Link: https://lore.kernel.org/r/20240620063600.573559-1-ruanjinjie@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
For varying privacy and security reasons, sometimes we would like to
completely silence the _serial_ console, and only enable it when needed.
But there are many existing systems that depend on this _serial_ console,
so add acpi=nospcr to disable console in ACPI SPCR table as default
_serial_ console.
Signed-off-by: Liu Wei <liuwei09@cestc.cn>
Suggested-by: Prarit Bhargava <prarit@redhat.com>
Suggested-by: Will Deacon <will@kernel.org>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Reviewed-by: Prarit Bhargava <prarit@redhat.com>
Link: https://lore.kernel.org/r/20240625030504.58025-1-liuwei09@cestc.cn
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Most of memory.rst was written very early, at a time where TBI (Top
Byte Ignore) was not enabled. Nowadays TBI0 is always enabled, and
TBI1 may be enabled, depending on the kernel configuration. This
means that VA bits 63:56 cannot generally be assumed to have any
particular value.
Regardless of TBI, TTBRx selection is done based on bit 55; update
memory.rst accordingly.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240702091349.356008-1-kevin.brodsky@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and
ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields
from ID_AA64PFR0_EL1 sysreg definition. Finally let's drop off both these
custom macros as they are now redundant.
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240613102710.3295108-3-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
This replaces custom macros usage (i.e ID_AA64PFR0_EL1_ELx_64BIT_ONLY and
ID_AA64PFR0_EL1_ELx_32BIT_64BIT) and instead directly uses register fields
from ID_AA64PFR0_EL1 sysreg definition.
Cc: Marc Zyngier <maz@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.linux.dev
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240613102710.3295108-2-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The arm64 asm/arm_pmuv3.h depends on defines from
linux/perf/arm_pmuv3.h. Rather than depend on include order, follow the
usual pattern of "linux" headers including "asm" headers of the same
name.
With this change, the include of linux/kvm_host.h is problematic due to
circular includes:
In file included from ../arch/arm64/include/asm/arm_pmuv3.h:9,
from ../include/linux/perf/arm_pmuv3.h:312,
from ../include/kvm/arm_pmu.h:11,
from ../arch/arm64/include/asm/kvm_host.h:38,
from ../arch/arm64/mm/init.c:41:
../include/linux/kvm_host.h:383:30: error: field 'arch' has incomplete type
Switching to asm/kvm_host.h solves the issue.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-5-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
There are no non-DT based PMU users for v6 or v7, so drop the custom
non-DT probe table. Unfortunately XScale still needs non-DT probing.
Note that this drops support for arm1156 PMU, but there are no arm1156
based systems supported in the kernel.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-4-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
It is preferred to put drivers under drivers/ rather than under arch/.
The PMU drivers also depend on arm_pmu.c, so it's better to place them
all together.
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-3-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The IS_ENABLED(CONFIG_ARM64) check for threshold support is unnecessary.
The purpose is to not enable thresholds on arm32, but if threshold is
non-zero, the check against threshold_max() just above here will have
errored out because threshold_max() is always 0 on arm32.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Mark rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-2-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
If the user has requested a counting threshold for the CPU cycles event,
then the fixed cycle counter can't be assigned as it lacks threshold
support. Currently, the thresholds will work or not randomly depending
on which counter the event is assigned.
While using thresholds for CPU cycles doesn't make much sense, it can be
useful for testing purposes.
Fixes: 816c26754447 ("arm64: perf: Add support for event counting threshold")
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240626-arm-pmu-3-9-icntr-v2-1-c9784b4f4065@kernel.org
Signed-off-by: Will Deacon <will@kernel.org>
|
|
i.MX95 has a DDR PMU which is almostly same as i.MX93, it now supports
read beat and write beat filter capabilities. This will add support for
i.MX95 and enhance the driver to support specific filter handling for it.
Usage:
For read beat:
~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt2,axi_mask=ID_MASK,axi_id=ID/
~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt1,axi_mask=ID_MASK,axi_id=ID/
~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt0,axi_mask=ID_MASK,axi_id=ID/
eg: For edma2: perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_rd_beat_filt0,axi_mask=0x00f,axi_id=0x00c/
For write beat:
~# perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_wr_beat_filt,axi_mask=ID_MASK,axi_id=ID/
eg: For edma2: perf stat -a -I 1000 -e imx9_ddr0/eddrtq_pm_wr_beat_filt,axi_mask=0x00f,axi_id=0x00c/
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-6-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In current driver, the counter will start firstly and then be configured.
This sequence is not correct for AXI filter events since the correct
AXI_MASK and AXI_ID are not set yet. Then the results may be inaccurate.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Fixes: 55691f99d417 ("drivers/perf: imx_ddr: Add support for NXP i.MX9 SoC DDRC PMU driver")
cc: stable@vger.kernel.org
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-5-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
This driver is initinally used to support imx93 Soc and now it's time to
add support for imx95 Soc. However, some macro definitions and events are
different on these two Socs. For preparing imx95 supports, this will
refactor driver for imx93.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-4-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
In current design, the user of perf app needs to input counter ID to count
events. However, this is not user-friendly since the user needs to lookup
the map table to find the counter. Instead of letting the user to input
the counter, let this driver to manage the counters in this patch.
This will be implemented by:
1. allocate counter 0 for cycle event.
2. find unused counter from 1-10 for reference events.
3. allocate specific counter for counter-specific events.
In this patch, counter attr will be kept for back-compatible but all the
value passed down by counter=<n> will be ignored. To mark counter-specific
events, counter ID will be encoded into perf_pmu_events_attr.id.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-3-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The user can set event and counter in cmdline and the driver need to parse
it using 'config' attr value. This will add macro definitions to avoid
hard-code in driver.
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-2-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
i.MX95 has a DDR pmu. This will add a compatible for it.
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20240529080358.703784-1-xu.yang_2@nxp.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add support for the Arm Cortex-A725, Cortex-X925, Neoverse N3,
Neoverse V2, Neoverse V3 and Neoverse V3AE.
This just adds the names and connects them with their DT compatible
strings.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20240628145612.1291329-3-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add compatible strings for the PMUs in the Arm Cortex-A725, Cortex-X925,
Neoverse N3, Neoverse V2, Neoverse V3 and Neoverse V3AE cores.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Link: https://lore.kernel.org/r/20240628145612.1291329-2-andre.przywara@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Add support for tertiary match group.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20240618005056.3092866-3-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Previously, wp_config0/2 registers were used for primary match group and
wp_config1/3 registers for secondary match group. In order to support
tertiary match group, this patch decouples the registers and the groups.
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Link: https://lore.kernel.org/r/20240618005056.3092866-2-ilkka@os.amperecomputing.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
The T0SZ field of TCR_EL1 occupies bits 0-5 of the register and encode
the virtual address space translated by TTBR0_EL1. When updating the
field, for example because we are switching to/from the idmap page-table,
__cpu_set_tcr_t0sz() erroneously treats its 't0sz' argument as unshifted,
resulting in harmless but confusing double shifts by 0 in the code.
Co-developed-by: Leem ChaeHoon <infinite.run@gmail.com>
Signed-off-by: Leem ChaeHoon <infinite.run@gmail.com>
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Link: https://lore.kernel.org/r/20240523122146.144483-1-sgsu.park@samsung.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
amba_register_dummy_clk() is called only once from acpi_amba_init()
and acpi_amba_init() itself is called once during the initialisation.
amba_dummy_clk can't be initialised before this in any other code
path and hence the check for already registered amba_dummy_clk is
not necessary. Drop the same.
Signed-off-by: Youwan Wang <youwan@nfschina.com>
Acked-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20240624023101.369633-1-youwan@nfschina.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The distributor and PMR/RPR can present different views of the interrupt
priority space dependent upon the values of GICD_CTLR.DS and
SCR_EL3.FIQ. Currently we treat the distributor's view of the priority
space as canonical, and when the two differ we change the way we handle
values in the PMR/RPR, using the `gic_nonsecure_priorities` static key
to decide what to do.
This approach works, but it's sub-optimal. When using pseudo-NMI we
manipulate the distributor rarely, and we manipulate the PMR/RPR
registers very frequently in code spread out throughout the kernel (e.g.
local_irq_{save,restore}()). It would be nicer if we could use fixed
values for the PMR/RPR, and dynamically choose the values programmed
into the distributor.
This patch changes the GICv3 driver and arm64 code accordingly. PMR
values are chosen at compile time, and the GICv3 driver determines the
appropriate values to program into the distributor at boot time. This
removes the need for the `gic_nonsecure_priorities` static key and
results in smaller and better generated code for saving/restoring the
irqflags.
Before this patch, local_irq_disable() compiles to:
| 0000000000000000 <outlined_local_irq_disable>:
| 0: d503201f nop
| 4: d50343df msr daifset, #0x3
| 8: d65f03c0 ret
| c: d503201f nop
| 10: d2800c00 mov x0, #0x60 // #96
| 14: d5184600 msr icc_pmr_el1, x0
| 18: d65f03c0 ret
| 1c: d2801400 mov x0, #0xa0 // #160
| 20: 17fffffd b 14 <outlined_local_irq_disable+0x14>
After this patch, local_irq_disable() compiles to:
| 0000000000000000 <outlined_local_irq_disable>:
| 0: d503201f nop
| 4: d50343df msr daifset, #0x3
| 8: d65f03c0 ret
| c: d2801800 mov x0, #0xc0 // #192
| 10: d5184600 msr icc_pmr_el1, x0
| 14: d65f03c0 ret
... with 3 fewer instructions per call.
For defconfig + CONFIG_PSEUDO_NMI=y, this results in a minor saving of
~4K of text, and will make it easier to make further improvements to the
way we manipulate irqflags and DAIF bits.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In subsequent patches the GICv3 driver will choose the regular interrupt
priority at boot time, dependent on the configuration of GICD_CTRL.DS
and SCR_EL3.FIQ. This will need to be chosen before we configure the
distributor with default prioirities for all the interrupts, which
happens before we currently detect these in gic_cpu_sys_reg_init().
Add a new gic_prio_init() function to detect these earlier and log them
to the console so that any problems can be debugged more easily. This
also allows the uniformity checks in gic_cpu_sys_reg_init() to be
simplified, as we can compare directly with the boot CPU values which
were recorded earlier.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In subsequent patches the GICv3 driver will choose the regular interrupt
priority at boot time.
In preparation for using dynamic priorities, place the priorities in
variables and update the code to pass these as parameters. Users of
GICD_INT_DEF_PRI_X4 are modified to replicate the priority byte using
REPEAT_BYTE_U32().
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
The gic_configure_irq(), gic_dist_config(), and gic_cpu_config()
functions each take an optional "sync_access" callback, but in almost
all cases this is not used. The only user is the GICv3 driver's
gic_cpu_init() function, which uses gic_redist_wait_for_rwp() as the
"sync_access" callback for gic_cpu_config().
It would be simpler and clearer to remove the callback and have the
GICv3 driver call gic_redist_wait_for_rwp() explicitly after
gic_cpu_config().
Remove the "sync_access" callback, and call gic_redist_wait_for_rwp()
explicitly in the GICv3 driver.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
In some cases it's necessary to replicate a byte across a u32 value, for
which REPEAT_BYTE() would be helpful. Currently this requires explicit
masking of the result to avoid sparse warnings, as e.g.
(u32)REPEAT_BYTE(0xa0))
... will result in a warning:
cast truncates bits from constant value (a0a0a0a0a0a0a0a0 becomes a0a0a0a0)
Add a new REPEAT_BYTE_U32() which does the necessary masking internally,
so that we don't need to duplicate this for every usage.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240617111841.2529370-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Fault status codes at page table level 0, 1, 2 and 3 for access, permission
and translation faults are architecturally organized in a way, that masking
out ESR_ELx_FSC_TYPE, fetches Level 0 status code for the respective fault.
Helpers like esr_fsc_is_[translation|permission|access_flag]_fault() mask
out ESR_ELx_FSC_TYPE before comparing against corresponding Level 0 status
code as the kernel does not yet care about the page table level, where in
the fault really occurred previously.
This scheme is starting to crumble after FEAT_LPA2 when level -1 got added.
Fault status code for translation fault at level -1 is 0x2B which does not
follow ESR_ELx_FSC_TYPE, requiring esr_fsc_is_translation_fault() changes.
This changes above helpers to compare against individual fault status code
values for each page table level and stop using ESR_ELx_FSC_TYPE, which is
losing its value as a common mask.
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20240618034703.3622510-1-anshuman.khandual@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Comment about BUILTIN_RETURN_ADDRESS_STRIPS_PAC spells
__builtin_return_adddress with a triple 'd', fix it.
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240620174038.3721466-1-rppt@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in arch/arm64/kernel/arm64-reloc-test.o
Add the missing invocation of the MODULE_DESCRIPTION() macro.
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240612-md-arch-arm64-kernel-v1-1-1fafe8d11df3@quicinc.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
There are two spelling mistakes in some error messages. Fix them.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240613073429.1797451-1-colin.i.king@gmail.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The ACPI FFH Opregion code can be moved out of arm64 arch code as
it just uses SMCCC. Move all the ACPI FFH Opregion code into
drivers/acpi/arm64/ffh.c
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20240605131458.3341095-4-sudeep.holla@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The ACPI cpuidle LPI FFH code can be moved out of arm64 arch code as
it just uses SMCCC. Move all the ACPI cpuidle LPI FFH code into
drivers/acpi/arm64/cpuidle.c
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20240605131458.3341095-3-sudeep.holla@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Sort the entries in the Makefile alphabetically.
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/20240605131458.3341095-2-sudeep.holla@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
A number of Arm Ltd CPUs suffer from errata whereby an MSR to the SSBS
special-purpose register does not affect subsequent speculative
instructions, permitting speculative store bypassing for a window of
time.
We worked around this for Cortex-X4 and Neoverse-V3, in commit:
7187bb7d0b5c7dfa ("arm64: errata: Add workaround for Arm errata 3194386 and 3312417")
... as per their Software Developer Errata Notice (SDEN) documents:
* Cortex-X4 SDEN v8.0, erratum 3194386:
https://developer.arm.com/documentation/SDEN-2432808/0800/
* Neoverse-V3 SDEN v6.0, erratum 3312417:
https://developer.arm.com/documentation/SDEN-2891958/0600/
Since then, similar errata have been published for a number of other Arm Ltd
CPUs, for which the mitigation is the same. This is described in their
respective SDEN documents:
* Cortex-A710 SDEN v19.0, errataum 3324338
https://developer.arm.com/documentation/SDEN-1775101/1900/?lang=en
* Cortex-A720 SDEN v11.0, erratum 3456091
https://developer.arm.com/documentation/SDEN-2439421/1100/?lang=en
* Cortex-X2 SDEN v19.0, erratum 3324338
https://developer.arm.com/documentation/SDEN-1775100/1900/?lang=en
* Cortex-X3 SDEN v14.0, erratum 3324335
https://developer.arm.com/documentation/SDEN-2055130/1400/?lang=en
* Cortex-X925 SDEN v8.0, erratum 3324334
https://developer.arm.com/documentation/109108/800/?lang=en
* Neoverse-N2 SDEN v17.0, erratum 3324339
https://developer.arm.com/documentation/SDEN-1982442/1700/?lang=en
* Neoverse-V2 SDEN v9.0, erratum 3324336
https://developer.arm.com/documentation/SDEN-2332927/900/?lang=en
Note that due to shared design lineage, some CPUs share the same erratum
number.
Add these to the existing mitigation under CONFIG_ARM64_ERRATUM_3194386.
As listing all of the erratum IDs in the runtime description would be
unwieldy, this is reduced to:
"SSBS not fully self-synchronizing"
... matching the description of the errata in all of the SDENs.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240603111812.1514101-6-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Cortex-X4 erratum 3194386 and Neoverse-V3 erratum 3312417 are identical,
with duplicate Kconfig text and some unsightly ifdeffery. While we try
to share code behind CONFIG_ARM64_WORKAROUND_SPECULATIVE_SSBS, having
separate options results in a fair amount of boilerplate code, and this
will only get worse as we expand the set of affected CPUs.
To reduce this boilerplate, unify the two behind a common Kconfig
option. This removes the duplicate text and Kconfig logic, and removes
the need for the intermediate ARM64_WORKAROUND_SPECULATIVE_SSBS option.
The set of affected CPUs is described as a list so that this can easily
be extended.
I've used ARM64_ERRATUM_3194386 (matching the Neoverse-V3 erratum ID) as
the common option, matching the way we use ARM64_ERRATUM_1319367 to
cover Cortex-A57 erratum 1319537 and Cortex-A72 erratum 1319367.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <wilL@kernel.org>
Link: https://lore.kernel.org/r/20240603111812.1514101-5-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add cputype definitions for Cortex-X925. These will be used for errata
detection in subsequent patches.
These values can be found in Table A-285 ("MIDR_EL1 bit descriptions")
in issue 0001-05 of the Cortex-X925 TRM, which can be found at:
https://developer.arm.com/documentation/102807/0001/?lang=en
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240603111812.1514101-4-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add cputype definitions for Cortex-A720. These will be used for errata
detection in subsequent patches.
These values can be found in Table A-186 ("MIDR_EL1 bit descriptions")
in issue 0002-05 of the Cortex-A720 TRM, which can be found at:
https://developer.arm.com/documentation/102530/0002/?lang=en
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240603111812.1514101-3-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Add cputype definitions for Cortex-X3. These will be used for errata
detection in subsequent patches.
These values can be found in Table A-263 ("MIDR_EL1 bit descriptions")
in issue 07 of the Cortex-X3 TRM, which can be found at:
https://developer.arm.com/documentation/101593/0102/?lang=en
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240603111812.1514101-2-mark.rutland@arm.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The check in mte_check_tfsr_el1() is only necessary if HW tag
based KASAN is enabled. However, we were also executing the check
if MTE is enabled and KASAN is enabled at build time but disabled
at runtime. This turned out to cause a measurable increase in
power consumption on a specific microarchitecture after enabling
MTE. Moreover, on the same system, an increase in invalid syscall
latency (as measured by [1]) of around 20-30% (depending on the
cluster) was observed after enabling MTE; this almost entirely goes
away after removing this check. Therefore, make the check conditional
on whether KASAN is enabled rather than on whether MTE is enabled.
[1] https://lore.kernel.org/all/CAMn1gO4MwRV8bmFJ_SeY5tsYNPn2ZP56LjAhafygjFaKuu5ouw@mail.gmail.com/
Signed-off-by: Peter Collingbourne <pcc@google.com>
Link: https://linux-review.googlesource.com/id/I22d98d1483dd400a95595946552b769a5a1ad7bd
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20240528225131.3577704-1-pcc@google.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Currently, we are writing the same value as we read into the TLS
register, hence we cannot confirm update of the register, making the
testcase "verify_tpidr_one" redundant. Fix this.
Signed-off-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240605115448.640717-1-dev.jain@arm.com
[catalin.marinas@arm.com: remove the increment style change]
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Currently fp-stress only covers userspace use of floating point, it does
not cover any kernel mode uses. Since currently kernel mode floating
point usage can't be preempted and there are explicit preemption points in
the existing implementations this isn't so important for fp-stress but
when we readd preemption it will be good to try to exercise it.
When the arm64 accelerated crypto operations are implemented we can
relatively straightforwardly trigger kernel mode floating point usage by
using the crypto userspace API to hash data, using the splice() support
in an effort to minimise copying. We use /proc/crypto to check which
accelerated implementations are available, picking the first symmetric
hash we find. We run the kernel mode test unconditionally, replacing the
second copy of the FPSIMD testcase for systems with FPSIMD only. If we
don't think there are any suitable kernel mode implementations we fall back
to running another copy of fpsimd-stress.
There are a number issues with this approach, we don't actually verify
that we are using an accelerated (or even CPU) implementation of the
algorithm being tested and even with attempting to use splice() to
minimise copying there are sizing limits on how much data gets spliced
at once.
Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20240521-arm64-fp-stress-kernel-v1-1-e38f107baad4@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Historically, arm64 implemented raw_smp_processor_id() as a read of
current_thread_info()->cpu. This changed when arm64 moved thread_info into
task struct, as at the time CONFIG_THREAD_INFO_IN_TASK made core code use
thread_struct::cpu for the cpu number, and due to header dependencies
prevented using this in raw_smp_processor_id(). As a workaround, we moved to
using a percpu variable in commit:
57c82954e77fa12c ("arm64: make cpu number a percpu variable")
Since then, thread_info::cpu was reintroduced, and core code was made to use
this in commits:
001430c1910df65a ("arm64: add CPU field to struct thread_info")
bcf9033e5449bdca ("sched: move CPU field back into thread_info if THREAD_INFO_IN_TASK=y")
Consequently it is possible to use current_thread_info()->cpu again.
This decreases the number of emitted instructions like in the following
example:
Dump of assembler code for function bpf_get_smp_processor_id:
0xffff8000802cd608 <+0>: nop
0xffff8000802cd60c <+4>: nop
0xffff8000802cd610 <+8>: adrp x0, 0xffff800082138000
0xffff8000802cd614 <+12>: mrs x1, tpidr_el1
0xffff8000802cd618 <+16>: add x0, x0, #0x8
0xffff8000802cd61c <+20>: ldrsw x0, [x0, x1]
0xffff8000802cd620 <+24>: ret
After this patch:
Dump of assembler code for function bpf_get_smp_processor_id:
0xffff8000802c9130 <+0>: nop
0xffff8000802c9134 <+4>: nop
0xffff8000802c9138 <+8>: mrs x0, sp_el0
0xffff8000802c913c <+12>: ldr w0, [x0, #24]
0xffff8000802c9140 <+16>: ret
A microbenchmark[1] was built to measure the performance improvement
provided by this change. It calls the following function given number of
times and finds the runtime overhead:
static noinline int get_cpu_id(void)
{
return smp_processor_id();
}
Run the benchmark like:
modprobe smp_processor_id nr_function_calls=1000000000
+--------------------------+------------------------+
| | Number of Calls | Time taken |
+--------+-----------------+------------------------+
| Before | 1000000000 | 1602888401ns |
+--------+-----------------+------------------------+
| After | 1000000000 | 1206212658ns |
+--------+-----------------+------------------------+
| Difference (decrease) | 396675743ns (24.74%) |
+---------------------------------------------------+
Remove the percpu variable cpu_number as it is used only in
set_smp_ipi_range() as a dummy variable to be passed to ipi_handler().
Use irq_stat in place of cpu_number here like arm32.
[1] https://github.com/puranjaymohan/linux/commit/77d3fdd
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Link: https://lore.kernel.org/r/20240503171847.68267-2-puranjay@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
arch_timer.h includes linux/smp.h since the commit:
6acc71ccac7187fc ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
It was included to use DEFINE_PER_CPU(), etc. But It should have
included <linux/percpu.h> rather than <linux/smp.h>. It worked because
smp.h includes percpu.h.
The next commit will remove percpu.h from smp.h and it will break this
usage.
Explicitly include percpu.h and remove smp.h
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Link: https://lore.kernel.org/r/20240503171847.68267-1-puranjay@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Enable ECBHB bits in ID_AA64MMFR1 register as per ARM DDI 0487K.a
specification.
When guest OS read ID_AA64MMFR1_EL1, kvm emulate this reg using
ftr_id_aa64mmfr1 and always return ID_AA64MMFR1_EL1.ECBHB=0 to guest.
It results in guest syscall jump to tramp ventry, which is not needed
in implementation with ID_AA64MMFR1_EL1.ECBHB=1.
Let's make the guest syscall process the same as the host.
Signed-off-by: Nianyao Tang <tangnianyao@huawei.com>
Link: https://lore.kernel.org/r/20240611122049.2758600-1-tangnianyao@huawei.com
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
|