path: root/tools/perf/scripts/python/parallel-perf.py
author Puranjay Mohan <[email protected]> 2024-06-19 13:13:34 +0000
committer Andrii Nakryiko <[email protected]> 2024-06-21 14:28:33 -0700
commit 2bb138cb20a6a347cfed84381430cd25e05f118e (patch)
tree a3e9841f3e93378a7db305fdfe5a3777b3f9142b /tools/perf/scripts/python/parallel-perf.py
parent 2807db78ab302eab2c86c5924e4079adb63fd7c8 (diff)
bpf, arm64: Inline bpf_get_current_task/_btf() helpers
On ARM64, the pointer to task_struct is always available in the sp_el0
register and therefore the calls to bpf_get_current_task() and
bpf_get_current_task_btf() can be inlined into a single MRS instruction.

Here is the difference before and after this change:

Before:

; struct task_struct *task = bpf_get_current_task_btf();
  54:   mov     x10, #0xffffffffffff7978        // #-34440
  58:   movk    x10, #0x802b, lsl #16
  5c:   movk    x10, #0x8000, lsl #32
  60:   blr     x10          -------------->    0xffff8000802b7978 <+0>:     mrs     x0, sp_el0
  64:   add     x7, x0, #0x0 <--------------    0xffff8000802b797c <+4>:     ret

After:

; struct task_struct *task = bpf_get_current_task_btf();
  54:   mrs     x7, sp_el0

This shows around 1% performance improvement in an artificial microbenchmark.

Signed-off-by: Puranjay Mohan <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Xu Kuohai <[email protected]>
Acked-by: Andrii Nakryiko <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
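
As a reading aid for the "Before" listing, the following sketch reproduces in Python how the mov/movk sequence at offsets 54 to 5c assembles the helper's address before the blr. It is illustrative only and not part of the patch; the function names movk and helper_address are hypothetical, and the model assumes the usual MOVK behaviour of replacing one 16-bit field while leaving the other bits untouched.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def movk(value, imm16, shift):
    """Model MOVK: replace the 16-bit field at `shift`, keep all other bits."""
    cleared = value & ~(0xFFFF << shift) & MASK64
    return cleared | ((imm16 & 0xFFFF) << shift)


def helper_address():
    """Replay the three instructions that load x10 in the 'Before' listing."""
    x10 = 0xFFFFFFFFFFFF7978      # 54: mov  x10, #0xffffffffffff7978
    x10 = movk(x10, 0x802B, 16)   # 58: movk x10, #0x802b, lsl #16
    x10 = movk(x10, 0x8000, 32)   # 5c: movk x10, #0x8000, lsl #32
    return x10


if __name__ == "__main__":
    # The result should be the address shown as the blr target in the listing.
    print(hex(helper_address()))
```

The value produced is the address at which the listing shows the helper's own mrs/ret pair, so the "Before" path costs three instructions to build the address, an indirect call, two instructions in the helper, and a register move, where the "After" path is the single mrs.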
Diffstat (limited to 'tools/perf/scripts/python/parallel-perf.py')
0 files changed, 0 insertions, 0 deletions