aboutsummaryrefslogtreecommitdiff
path: root/scripts/gcc-plugins/cyc_complexity_plugin.c
diff options
context:
space:
mode:
authorPingfan Liu <[email protected]>2020-07-10 22:04:12 +0800
committerCatalin Marinas <[email protected]>2020-07-30 12:58:40 +0100
commitc4885bbb3afee80f41d39a33e49881a18e500f47 (patch)
tree76f4d4083619972c14462a9f186d88069b64ac5b /scripts/gcc-plugins/cyc_complexity_plugin.c
parentea0eada45632f4807b2f49de951072283e2d781c (diff)
arm64/mm: save memory access in check_and_switch_context() fast switch path
On arm64, smp_processor_id() reads a per-cpu `cpu_number` variable, using the per-cpu offset stored in the tpidr_el1 system register. In some cases we generate a per-cpu address with a sequence like: cpu_ptr = &per_cpu(ptr, smp_processor_id()); Which potentially incurs a cache miss for both `cpu_number` and the in-memory `__per_cpu_offset` array. This can be written more optimally as: cpu_ptr = this_cpu_ptr(ptr); Which only needs the offset from tpidr_el1, and does not need to load from memory. The following two test cases show a small performance improvement measured on a 46-cpus qualcomm machine with 5.8.0-rc4 kernel. Test 1: (about 0.3% improvement) #cat b.sh make clean && make all -j138 #perf stat --repeat 10 --null --sync sh b.sh - before this patch Performance counter stats for 'sh b.sh' (10 runs): 298.62 +- 1.86 seconds time elapsed ( +- 0.62% ) - after this patch Performance counter stats for 'sh b.sh' (10 runs): 297.734 +- 0.954 seconds time elapsed ( +- 0.32% ) Test 2: (about 1.69% improvement) 'perf stat -r 10 perf bench sched messaging' Then sum the total time of 'sched/messaging' by manual. - before this patch total 0.707 sec for 10 times - after this patch totol 0.695 sec for 10 times Signed-off-by: Pingfan Liu <[email protected]> Acked-by: Mark Rutland <[email protected]> Cc: Will Deacon <[email protected]> Cc: Steve Capper <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Vladimir Murzin <[email protected]> Cc: Jean-Philippe Brucker <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Catalin Marinas <[email protected]>
Diffstat (limited to 'scripts/gcc-plugins/cyc_complexity_plugin.c')
0 files changed, 0 insertions, 0 deletions