Age | Commit message (Collapse) | Author | Files | Lines |
|
Update JSON events and metrics for alderlake to perf.
Based on ADL JSON event list v1.16:
https://github.com/intel/perfmon/tree/main/ADL/events
Generate the event list and metrics with the converter scripts:
https://github.com/intel/perfmon/pull/32
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Xing Zhengjun <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add JSON metrics for Alderlake-N to perf.
It only included E-core metrics.
E-core metrics based on E-core TMA v2.2 (E-core_TMA_Metrics.csv)
It is downloaded from:
https://github.com/intel/perfmon/
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Xing Zhengjun <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add JSON uncore events for Alderlake-N
Based on JSON list v1.16:
https://github.com/intel/perfmon/tree/main/ADL/events/
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Xing Zhengjun <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Alderlake-N only has E-core, it has been moved to non-hybrid code path on
the kernel side, so add the cpuid for Alderlake-N separately.
Add core event list for Alderlake-N, it is based on the
ADL gracemont v1.16 JSON file.
https://github.com/intel/perfmon/tree/main/ADL/events/
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Xing Zhengjun <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It printed empty strings for each metric. I guess it's needed for CSV
output to match the column number. We could just ignore the empty
metrics in JSON but it ended up with a broken JSON object with a
trailing comma.
So I added a dummy '"metric-value" : "none"' part. To do that, it
needs to pass struct outstate to print_metric_end() to check if any
metric value is printed or not.
Before:
# perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true
{"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : "", "" : ""}
After:
# perf stat -aj --metric-only --per-socket --for-each-cgroup system.slice true
{"socket" : "S0", "cpu-count" : 8, "cgroup" : "system.slice", "metric-value" : "none"}
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As the JSON output has been broken for a little while, I guess there are
not many users. Let's rename the field to more intuitive one. :)
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It generated a broken JSON output when aggregation mode or cgroup is
used with --metric-only option. Also get rid of the header line and
make the output single line for each entry.
It needs to know whether the current metric is the first one or not.
So add 'first' field in the outstate and mark it false after printing.
Before:
# perf stat -a -j --metric-only true
{"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"}
{{"metric-value" : "0.797"}{"metric-value" : "1.65"}{"metric-value" : "0.89"}
^
# perf stat -a -j --metric-only --per-socket true
{"unit" : "GHz"}{"unit" : "insn per cycle"}{"unit" : "branch-misses of all branches"}
{"socket" : "S0", "aggregate-number" : 8, {"metric-value" : "0.295"}{"metric-value" : "1.88"}{"metric-value" : "0.64"}
^
After:
# perf stat -a -j --metric-only true
{"GHz" : "0.990", "insn per cycle" : "2.06", "branch-misses of all branches" : "0.59"}
# perf stat -a -j --metric-only --per-socket true
{"socket" : "S0", "aggregate-number" : 8, "GHz" : "0.439", "insn per cycle" : "2.14", "branch-misses of all branches" : "0.51"}
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now most of the print functions take a pointer to the struct outstate.
We have one in the evlist__print_counters() and pass it through the
child functions.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It always passes a pointer to rt_stat as it's the only one. Let's not
pass it and directly refer it in the printout().
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The printout() takes a lot of arguments and sets an outstate with the
value. Instead, we can fill the outstate first and then pass it to
reduce the number of arguments.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It passes prefix and cgroup pointers but the outstate already has them.
Let's pass the outstate pointer instead.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is a preparation for the later cleanup. No functional changes
intended.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is a minor cleanup and preparation for the later change.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It already passes the stat_config argument, then it can find the value in the
config. No need to pass it separately.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It always passes a whitespace to the function, thus we can just add it to the
function body. Furthermore, it's only used in the normal output mode.
Well, actually CSV used it but it doesn't need to since we don't care about the
indentation or alignment in the CSV output.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It should not use sprintf() anymore. Let's pass the buffer size and use the
safer scnprintf() instead.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We don't care about the alignment in the CSV output as it's intended for machine
processing. Let's get rid of it to make the output more compact.
Before:
# perf stat -a --summary -I 1 -x, true
0.001149309,219.20,msec,cpu-clock,219322251,100.00,219.200,CPUs utilized
0.001149309,144,,context-switches,219241902,100.00,656.935,/sec
0.001149309,38,,cpu-migrations,219173705,100.00,173.358,/sec
0.001149309,61,,page-faults,219093635,100.00,278.285,/sec
0.001149309,10679310,,cycles,218746228,100.00,0.049,GHz
0.001149309,6288296,,instructions,218589869,100.00,0.59,insn per cycle
0.001149309,1386904,,branches,218428851,100.00,6.327,M/sec
0.001149309,56863,,branch-misses,218219951,100.00,4.10,of all branches
summary,219.20,msec,cpu-clock,219322251,100.00,20.025,CPUs utilized
summary,144,,context-switches,219241902,100.00,656.935,/sec
summary,38,,cpu-migrations,219173705,100.00,173.358,/sec
summary,61,,page-faults,219093635,100.00,278.285,/sec
summary,10679310,,cycles,218746228,100.00,0.049,GHz
summary,6288296,,instructions,218589869,100.00,0.59,insn per cycle
summary,1386904,,branches,218428851,100.00,6.327,M/sec
summary,56863,,branch-misses,218219951,100.00,4.10,of all branches
After:
0.001148449,224.75,msec,cpu-clock,224870589,100.00,224.747,CPUs utilized
0.001148449,176,,context-switches,224775564,100.00,783.103,/sec
0.001148449,38,,cpu-migrations,224707428,100.00,169.079,/sec
0.001148449,61,,page-faults,224629326,100.00,271.416,/sec
0.001148449,12172071,,cycles,224266368,100.00,0.054,GHz
0.001148449,6901907,,instructions,224108764,100.00,0.57,insn per cycle
0.001148449,1515655,,branches,223946693,100.00,6.744,M/sec
0.001148449,70027,,branch-misses,223735385,100.00,4.62,of all branches
summary,224.75,msec,cpu-clock,224870589,100.00,21.066,CPUs utilized
summary,176,,context-switches,224775564,100.00,783.103,/sec
summary,38,,cpu-migrations,224707428,100.00,169.079,/sec
summary,61,,page-faults,224629326,100.00,271.416,/sec
summary,12172071,,cycles,224266368,100.00,0.054,GHz
summary,6901907,,instructions,224108764,100.00,0.57,insn per cycle
summary,1515655,,branches,223946693,100.00,6.744,M/sec
summary,70027,,branch-misses,223735385,100.00,4.62,of all branches
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It matches to the prefix (interval timestamp), so better to have them together.
No functional change intended.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It missed the 'else' keyword after checking json output mode.
Fixes: 41cb875242e71bf1 ("perf stat: Split print_cgroup() function")
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove the repeated word "not" in comments.
Signed-off-by: Shaomin Deng <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
There is a spelling mistake in a perror message. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
Verify when a bond is configured with {up,down}delay and the link state
of slave members flaps if there are no remaining members up the bond
should immediately select a member to bring up. (from bonding.txt
section 13.1 paragraph 4)
Suggested-by: Liang Li <[email protected]>
Signed-off-by: Jonathan Toppins <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add some selftest testcases that validate the expected behavior of the
bpf_task_from_pid() kfunc that was added in the prior patch.
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
* append missing optnames to the end
* simplify bpf_getsockopt()'s doc
Signed-off-by: Ji Rongfeng <[email protected]>
Link: https://lore.kernel.org/r/DU0P192MB15479B86200B1216EC90E162D6099@DU0P192MB1547.EURP192.PROD.OUTLOOK.COM
Signed-off-by: Martin KaFai Lau <[email protected]>
|
|
It should trigger a WARN_ON_ONCE in btf_type_id_size:
RIP: 0010:btf_type_id_size+0x8bd/0x940 kernel/bpf/btf.c:1952
btf_func_proto_check kernel/bpf/btf.c:4506 [inline]
btf_check_all_types kernel/bpf/btf.c:4734 [inline]
btf_parse_type_sec+0x1175/0x1980 kernel/bpf/btf.c:4763
btf_parse kernel/bpf/btf.c:5042 [inline]
btf_new_fd+0x65a/0xb00 kernel/bpf/btf.c:6709
bpf_btf_load+0x6f/0x90 kernel/bpf/syscall.c:4342
__sys_bpf+0x50a/0x6c0 kernel/bpf/syscall.c:5034
__do_sys_bpf kernel/bpf/syscall.c:5093 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5091 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5091
do_syscall_64+0x54/0x70 arch/x86/entry/common.c:48
Signed-off-by: Stanislav Fomichev <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Jiri reports broken test_progs after recent commit 68f8e3d4b916
("selftests/bpf: Make sure zero-len skbs aren't redirectable").
Apparently we don't remount debugfs when we switch back networking namespace.
Let's explicitly mount /sys/kernel/debug.
0: https://lore.kernel.org/bpf/[email protected]/T/#ma66ca9c92e99eee0a25e40f422489b26ee0171c1
Fixes: a30338840fa5 ("selftests/bpf: Move open_netns() and close_netns() into network_helpers.c")
Reported-by: Jiri Olsa <[email protected]>
Signed-off-by: Stanislav Fomichev <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
It caused some troubles when a lock inside kmalloc is contended
because task local storage would allocate memory using kmalloc.
It'd create a recusion and even crash in my system.
There could be a couple of workarounds but I think the simplest
one is to use a pre-allocated hash map. We could fix the task
local storage to use the safe BPF allocator, but it takes time
so let's change this until it happens actually.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Blake Jones <[email protected]>
Cc: Chris Li <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using precise flag with br_inst_retired.near_call causes the test fail
on KVM guests, even when the guests have PMU forwarding enabled and the
event itself is supported.
Remove the precise flag in order to make the test work on KVM guests.
Signed-off-by: Michael Petlan <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
With perf inject -b, it synthesizes build-id event for DSOs. But it
missed to set the size and resulted in having trailing zeros.
As perf record sets the size in write_build_id(), let's set the size
here as well.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On IBM Power9, perf watchpoint tests fail since no hardware breakpoints
are available. Detect this by checking the error returned by
perf_event_open() and skip the tests in that case.
Reported-by: Disha Goel <[email protected]>
Signed-off-by: Naveen N. Rao <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Reviewed-by: Kajol Jain<[email protected]>
Tested-by: Kajol Jain<[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
|
|
augmented_raw_syscalls.c defines the bpf map 'syscalls' which is
initialized by perf tool in user space to indicate which system calls
are enabled for tracing, on the other flip eBPF program relies on the
map to filter out the trace events which are not enabled.
The map also includes a field 'string_args_len[6]' which presents the
string length if the corresponding argument is a string type.
Now the map 'syscalls' is not used, bpf program doesn't use it as filter
anymore, this is replaced by using the function bpf_tail_call() and
PROG_ARRAY syscalls map. And we don't need to explicitly set the string
length anymore, bpf_probe_read_str() is smart to copy the string and
return string length.
Therefore, it's safe to remove the bpf map 'syscalls'.
To consolidate the code, this patch removes the definition of map
'syscalls' from augmented_raw_syscalls.c and drops code for using
the map in the perf trace.
Note, since function trace__set_ev_qualifier_bpf_filter() is removed,
calling trace__init_syscall_bpf_progs() from it is also removed. We
don't need to worry it because trace__init_syscall_bpf_progs() is
still invoked from trace__init_syscalls_bpf_prog_array_maps() for
initialization the system call's bpf program callback.
After:
# perf trace -e examples/bpf/augmented_raw_syscalls.c,open* --max-events 10 perf stat --quiet sleep 0.001
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libelf.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libdw.so.1", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind.so.8", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind-aarch64.so.8", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libcrypto.so.3", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libslang.so.2", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libperl.so.5.34", O_RDONLY|O_CLOEXEC) = 3
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
# perf trace -e examples/bpf/augmented_raw_syscalls.c --max-events 10 perf stat --quiet sleep 0.001
... [continued]: execve()) = 0
brk(NULL) = 0xaaaab1d28000
faccessat(-100, "/etc/ld.so.preload", 4) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
close(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>) = 0
openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>, 0xfffff33f70d0, 832) = 832
munmap(0xffffb5519000, 28672) = 0
munmap(0xffffb55b7000, 32880) = 0
mprotect(0xffffb55a6000, 61440, PROT_NONE) = 0
Signed-off-by: Leo Yan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The local variable 'syscall' is not used anymore, remove it.
Signed-off-by: Leo Yan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On Arm64 a case is perf tools fails to find the corresponding trace
point folder for system calls listed in the table 'syscalltbl_arm64',
e.g. the generated system call table contains "lookup_dcookie" but we
cannot find out the matched trace point folder for it.
We need to figure out if there have any issue for the generated system
call table, on the other hand, we need to handle the case when trace
point folder is missed under sysfs, this patch sets the flag
syscall::nonexistent as true and returns the error from
trace__read_syscall_info().
Another problem is for trace__syscall_info(), it returns two different
values if a system call doesn't exist: at the first time calling
trace__syscall_info() it returns NULL when the system call doesn't exist,
later if call trace__syscall_info() again for the same missed system
call, it returns pointer of syscall. trace__syscall_info() checks the
condition 'syscalls.table[id].name == NULL', but the name will be
assigned in the first invoking even the system call is not found.
So checking system call's name in trace__syscall_info() is not the right
thing to do, this patch simply checks flag syscall::nonexistent to make
decision if a system call exists or not, finally trace__syscall_info()
returns the consistent result (NULL) if a system call doesn't existed.
Fixes: b8b1033fcaa091d8 ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
Signed-off-by: Leo Yan <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When a system call is not detected, the reason is either because the
system call ID is out of scope or failure to find the corresponding path
in the sysfs, trace__read_syscall_info() returns zero. Finally, without
returning an error value it introduces confusion for the caller.
This patch lets the function trace__read_syscall_info() to return
-EEXIST when a system call doesn't exist.
Fixes: b8b1033fcaa091d8 ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
Signed-off-by: Leo Yan <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch defines a macro RAW_SYSCALL_ARGS_NUM to replace the open
coded number '6'.
Signed-off-by: Leo Yan <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Output events and metrics in a JSON format by overriding the print
callbacks. Currently other command line options aren't supported and
metrics are repeated once per metric group.
Committer testing:
$ perf list cache
List of pre-defined events (to be used in -e or -M):
L1-dcache-load-misses [Hardware cache event]
L1-dcache-loads [Hardware cache event]
L1-dcache-prefetches [Hardware cache event]
L1-icache-load-misses [Hardware cache event]
L1-icache-loads [Hardware cache event]
branch-load-misses [Hardware cache event]
branch-loads [Hardware cache event]
dTLB-load-misses [Hardware cache event]
dTLB-loads [Hardware cache event]
iTLB-load-misses [Hardware cache event]
iTLB-loads [Hardware cache event]
$ perf list --json cache
[
{
"Unit": "cache",
"EventName": "L1-dcache-load-misses",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "L1-dcache-loads",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "L1-dcache-prefetches",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "L1-icache-load-misses",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "L1-icache-loads",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "branch-load-misses",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "branch-loads",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "dTLB-load-misses",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "dTLB-loads",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "iTLB-load-misses",
"EventType": "Hardware cache event"
},
{
"Unit": "cache",
"EventName": "iTLB-loads",
"EventType": "Hardware cache event"
}
]
$
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xin Gao <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rather than controlling the list output with passed flags, add
callbacks that are called when an event or metric are
encountered. State is passed to the callback so that command line
options can be respected, alternatively the callbacks can be changed.
Fix a few bugs:
- wordwrap to columns metric descriptions and expressions;
- remove unnecessary whitespace after PMU event names;
- the metric filter is a glob but matched using strstr which will
always fail, switch to using a proper globmatch,
- the detail flag gives details for extra kernel PMU events like
branch-instructions.
In metricgroup.c switch from struct mep being a rbtree of metricgroups
containing a list of metrics, to the tree directly containing all the
metrics. In general the alias for a name is passed to the print
routine rather than being contained in the name with OR.
Committer notes:
Check the asprint() return to address this on fedora 36:
util/print-events.c: In function ‘print_sdt_events’:
util/print-events.c:183:33: error: ignoring return value of ‘asprintf’ declared with attribute ‘warn_unused_result’ [-Werror=unused-result]
183 | asprintf(&evt_name, "%s@%s(%.12s)", sdt_name->s, path, bid);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
$ gcc --version | head -1
gcc (GCC) 12.2.1 20220819 (Red Hat 12.2.1-2)
$
Fix ps.pmu_glob setting when dealing with *:* events, it was being left
with a freed pointer that then at the end of cmd_list() would be double
freed.
Check if pmu_name is NULL in default_print_event() before calling
strglobmatch(pmu_name, ...) to avoid a segfault.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Weilin Wang <[email protected]>
Cc: Xin Gao <[email protected]>
Cc: Xing Zhengjun <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The tools/lib includes fixes break LIBTRACEVENT_DYNAMIC as the makefile
erroneously had dependencies on building libtraceevent even when not
linking with it. This change fixes the issues with LIBTRACEEVENT_DYNAMIC
by making the built files optional.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masahiro Yamada <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Nicolas Schier <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of requirement of a compiler.
$ sudo ./perf test -v 109
109: Test data symbol :
--- start ---
test child forked, pid 844526
Recording workload...
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.354 MB /tmp/__perf_test.perf.data.GFeZO (4847 samples) ]
Cleaning up files...
test child finished with 0
---- end ----
Test data symbol: Ok
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The datasym workload is to check if perf mem command gets the data
addresses precisely. This is needed for data symbol test.
$ perf test -w datasym
I had to keep the buf1 in the data section, otherwise it could end
up in the BSS and was mmaped as a separate //anon region, then it
was not symbolized at all. It needs to be fixed separately.
Committer notes:
Add a -U _FORTIFY_SOURCE to the datasym CFLAGS, as the main perf flags
set it and it requires building with optimization, and this new test has
a -O0.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of requirement of a compiler. Also rename the
symbols to match with the perf test workload.
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: James Clark <[email protected]>
Acked-by: German Gomez <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The brstack is to run different kinds of branches repeatedly. This is
necessary for brstack test case to verify if it has correct branch info.
$ perf test -w brstack
I renamed the internal functions to have brstack_ prefix as it's too
generic name.
Add a -U_FORTIFY_SOURCE to the brstack CFLAGS, as the main perf flags
set it and it requires building with optimization, and this new test has
a -O0.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of requirement of a compiler. I've also removed
killall as it'll kill perf process now and run the test workload for 10
sec instead.
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: James Clark <[email protected]>
Tested-by: Leo Yan <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The sqrtloop creates a child process to run an infinite loop calling
sqrt() with rand(). This is needed for ARM SPE fork test.
$ perf test -w sqrtloop
It can take an optional argument to specify how long it will run in
seconds (default: 1).
Committer notes:
Explicitely ignored the sqrt() return to fix the build on systems where
the compiler complains it isn't being used.
And added a sqrtloop specific CFLAGS to disable optimizations to make
this a bit more robust wrt dead code elimination.
Doing that a -U_FORTIFY_SOURCE needs to be added, as -O0 is incompatible
with it.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of requirement of a compiler.
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The leafloop workload is to run an infinite loop in the test_leaf
function. This is needed for the ARM fp callgraph test to verify if it
gets the correct callchains.
$ perf test -w leafloop
Committer notes:
Add a:
-U_FORTIFY_SOURCE
to the leafloop CFLAGS as the main perf flags set it and it requires
building with optimization, and this new test has a -O0.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Provide an implementation for arch_pc_relative_reloc(). It is needed to
pass the build once 61c6065ef7ec ("objtool: Allow !PC relative
relocations") is merged.
Signed-off-by: Michael Ellerman <[email protected]>
|
|
bpf_cgroup_ancestor() allows BPF programs to access the ancestor of a
struct cgroup *. This patch adds selftests that validate its expected
behavior.
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
This patch adds a selftest suite to validate the cgroup kfuncs that were
added in the prior patch.
Signed-off-by: David Vernet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
Currently LLVM fails to recognize .data.* as data section and defaults to .text
section. Later BPF backend tries to emit 4-byte NOP instruction which doesn't
exist in BPF ISA and aborts.
The fix for LLVM is pending:
https://reviews.llvm.org/D138477
While waiting for the fix lets workaround the linked_list test case
by using .bss.* prefix which is properly recognized by LLVM as BSS section.
Fix libbpf to support .bss. prefix and adjust tests.
Signed-off-by: Alexei Starovoitov <[email protected]>
|