author Uros Bizjak <[email protected]> 2023-10-15 22:24:40 +0200
committer Ingo Molnar <[email protected]> 2023-10-16 12:52:02 +0200
commit 1d10f3aec2bb734b4b594afe8c1bd0aa656a7e4d
tree 92617f3981085cc3dd3a93c8b9f66904cc2ae80f /arch/x86/include/asm/fpu/sched.h
parent a048d3abae7c33f0a3f4575fab15ac5504d443f7
x86/percpu: Use C for arch_raw_cpu_ptr(), to improve code generation

Implement arch_raw_cpu_ptr() in C to allow the compiler to perform better optimizations, such as setting an appropriate base to compute the address. The compiler is free to choose either MOV or ADD from the this_cpu_off address to construct the optimal final address.

There are some other issues when memory access to the percpu area is implemented with an asm. Compilers cannot eliminate asm common subexpressions over basic block boundaries, but are extremely good at optimizing memory access. By implementing arch_raw_cpu_ptr() in C, the compiler can eliminate additional redundant loads from this_cpu_off, further reducing the number of percpu offset reads from 1646 to 1631 on a test build, a 0.9% reduction.

Co-developed-by: Nadav Amit <[email protected]>
Signed-off-by: Nadav Amit <[email protected]>
Signed-off-by: Uros Bizjak <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Brian Gerst <[email protected]>
Cc: Denys Vlasenko <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Uros Bizjak <[email protected]>
Cc: Sean Christopherson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
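
The contrast described in the commit message above can be illustrated with a small userspace sketch. This is not the kernel implementation: the real arch_raw_cpu_ptr() reads this_cpu_off through a %gs-relative access, which is not modeled here. The names this_cpu_off_sim, NR_CPUS_SIM, counters, set_cpu_sim(), raw_cpu_ptr_asm(), raw_cpu_ptr_c() and record_hit_and_miss() are all hypothetical and exist only for illustration. The asm variant assumes GCC or Clang on x86-64 and falls back to plain C elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS_SIM 4 /* hypothetical number of simulated CPUs */

/* Hypothetical stand-in for the kernel's this_cpu_off: the byte offset
 * from the first copy of a per-CPU variable to the current CPU's copy. */
static unsigned long this_cpu_off_sim;

/* One copy of the "per-CPU" variable for each simulated CPU. */
struct counter {
    long hits;
    long misses;
};
static struct counter counters[NR_CPUS_SIM];

/* Pretend to run on the given CPU by updating the simulated offset. */
static inline void set_cpu_sim(unsigned int cpu)
{
    assert(cpu < NR_CPUS_SIM);
    this_cpu_off_sim = cpu * sizeof(struct counter);
}

/* asm-style variant: the offset load is hidden inside an asm statement,
 * so the compiler must treat the whole ADD as an opaque operation. */
static inline void *raw_cpu_ptr_asm(void *base)
{
#if defined(__GNUC__) && defined(__x86_64__)
    unsigned long p = (unsigned long)base;
    __asm__("addq %1, %0" : "+r"(p) : "m"(this_cpu_off_sim));
    return (void *)p;
#else
    /* Fallback for other targets: same arithmetic in plain C. */
    return (void *)((uintptr_t)base + this_cpu_off_sim);
#endif
}

/* C variant: the offset is an ordinary memory read, so the compiler can
 * see it, pick its own instruction sequence, and reuse a loaded value. */
static inline void *raw_cpu_ptr_c(void *base)
{
    return (void *)((uintptr_t)base + this_cpu_off_sim);
}

/* Two lookups of the same per-CPU object. With the C variant the
 * compiler may merge the two offset loads into one. */
static inline void record_hit_and_miss(void)
{
    struct counter *a = raw_cpu_ptr_c(&counters[0]);
    struct counter *b = raw_cpu_ptr_c(&counters[0]);

    a->hits++;
    b->misses++;
}
```

Both variants return the same pointer. The difference is only in how much the compiler can see: compiling record_hit_and_miss() with optimization enabled and inspecting the assembly (for example with `-O2 -S`) is one way to check whether the offset is loaded once or twice on a given toolchain.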
Diffstat (limited to 'arch/x86/include/asm/fpu/sched.h')
0 files changed, 0 insertions, 0 deletions