aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2023-12-07perf annotate: Get rid of local annotation optionsNamhyung Kim2-3/+0
It doesn't need the option in the struct annotation which is allocated for each symbol. It can directly use the global options and save 8 bytes per symbol. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf annotate: Remove remaining usages of local annotation optionsNamhyung Kim3-12/+10
So that it can get rid of the unused data. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf annotate: Ensure init/exit for global optionsNamhyung Kim5-24/+27
Now it only cares about the global options so it can just handle it without the argument. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf ui/browser/annotate: Use global annotation_optionsNamhyung Kim10-95/+59
Now it can use the global options and no need save local browser options separately. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf annotate: Use global annotation_optionsNamhyung Kim8-89/+71
Now it can directly use the global options and no need to pass it as an argument. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Fixup build with GTK2=1 ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf top: Convert to the global annotation_optionsNamhyung Kim2-23/+22
Use the global option and drop the local copy. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf report: Convert to the global annotation_optionsNamhyung Kim1-17/+16
Use the global option and drop the local copy. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-07perf annotate: Introduce global annotation_optionsNamhyung Kim3-22/+26
The annotation options are to control the behavior of objdump and the output. It's basically used by 'perf annotate' but 'perf report' and 'perf top' can call it on TUI dynamically. But it doesn't need to have a copy of annotation options in many places. As most of the work is done in the util/annotate.c file, add a global variable and set/use it instead of having their own copies. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf stat: Exit perf stat if parse groups failsIan Rogers1-7/+11
Metrics were added by a callback but commit a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") postponed this to allow optimizations based on the CPU configuration. In doing so it stopped errors in metric parsing from causing 'perf stat' termination. This change adds the termination for bad metric names back in. Fixes: a4b8cfcabb1d90ec ("perf stat: Delay metric parsing") Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf thread: Add missing RC_CHK_EQUALIan Rogers1-1/+1
Comparing pointers without RC_CHK_ACCESS means the indirect object will be compared rather than the underlying maps when REFCNT_CHECKING is enabled. Fix by adding missing RC_CHK_EQUAL. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Changbin Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: German Gomez <[email protected]> Cc: Guilherme Amadio <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Ming Wang <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Vincent Whitchurch <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf maps: Move symbol maps functions to maps.cIan Rogers4-249/+250
Move the find and certain other symbol maps__* functions to maps.c for better abstraction. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Changbin Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: German Gomez <[email protected]> Cc: Guilherme Amadio <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Ming Wang <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Vincent Whitchurch <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf map: Simplify map_ip/unmap_ip and make 'struct map' smallerIan Rogers5-68/+50
When mapping an IP it is either an identity mapping or a DSO relative mapping, so a single bit is required in the struct to identify this. The current code uses function pointers, adding 2 pointers per map and also pushing the size of a map beyond 1 cache line. Switch to using a byte to identify the mapping type (as well as priv and erange_warned), to avoid any masking. Change struct maps's layout to avoid holes. Before: ``` struct map { u64 start; /* 0 8 */ u64 end; /* 8 8 */ _Bool erange_warned:1; /* 16: 0 1 */ _Bool priv:1; /* 16: 1 1 */ /* XXX 6 bits hole, try to pack */ /* XXX 3 bytes hole, try to pack */ u32 prot; /* 20 4 */ u64 pgoff; /* 24 8 */ u64 reloc; /* 32 8 */ u64 (*map_ip)(const struct map *, u64); /* 40 8 */ u64 (*unmap_ip)(const struct map *, u64); /* 48 8 */ struct dso * dso; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ refcount_t refcnt; /* 64 4 */ u32 flags; /* 68 4 */ /* size: 72, cachelines: 2, members: 12 */ /* sum members: 68, holes: 1, sum holes: 3 */ /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */ /* last cacheline: 8 bytes */ }; ``` After: ``` struct map { u64 start; /* 0 8 */ u64 end; /* 8 8 */ u64 pgoff; /* 16 8 */ u64 reloc; /* 24 8 */ struct dso * dso; /* 32 8 */ refcount_t refcnt; /* 40 4 */ u32 prot; /* 44 4 */ u32 flags; /* 48 4 */ enum mapping_type mapping_type:8; /* 52: 0 4 */ /* Bitfield combined with next fields */ _Bool erange_warned; /* 53 1 */ _Bool priv; /* 54 1 */ /* size: 56, cachelines: 1, members: 11 */ /* padding: 1 */ /* last cacheline: 56 bytes */ }; ``` Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Changbin Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: German Gomez <[email protected]> Cc: Guilherme Amadio <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Ming Wang <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Vincent Whitchurch <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf test shell diff: Skip test if test_loop symbol is missing in the perf ↵Ian Rogers1-0/+7
binary The diff test depends on finding the symbol test_loop in perf and will fail if perf has been stripped and no debug object is available. In that case, skip the test instead. Suggested-by: Adrian Hunter <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf symbols: Parse NOTE segments until the build id is foundChengen Du1-4/+6
In the ELF file, multiple NOTE segments may exist. To locate the build id, the process shall persist in parsing NOTE segments until the build id is found. Signed-off-by: Chengen Du <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf record: Be lazier in allocating lost samples bufferIan Rogers1-10/+19
Wait until a lost sample occurs to allocate the lost samples buffer, often the buffer isn't necessary. This saves a 64kb allocation and 5.3kb of peak memory consumption. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Changbin Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: German Gomez <[email protected]> Cc: Guilherme Amadio <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Ming Wang <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Vincent Whitchurch <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-06perf evsel: Fallback to "task-clock" when not system wideIan Rogers5-12/+15
When the "cycles" event isn't available evsel will fallback to the "cpu-clock" software event. "task-clock" is similar to "cpu-clock" but only runs when the process is running. Falling back to "cpu-clock" when not system wide leads to confusion, by falling back to "task-clock" it is hoped the confusion is less. Pass the target to determine if "task-clock" is more appropriate. Update a nearby comment and debug string for the change. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Athira Rajeev <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ajay Kaher <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Makhalov <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf list: Fix JSON segfault by setting the used skip_duplicate_pmus callbackIan Rogers1-0/+6
Json output didn't set the skip_duplicate_pmus callback yielding a segfault. Fixes: cd4e1efbbc40 ("perf pmus: Skip duplicate PMUs and don't print list suffix by default") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] [namhyung: updated subject line according to Arnaldo] Signed-off-by: Namhyung Kim <[email protected]>
2023-12-05perf bench sched-seccomp-notify: Fix spelling mistake "synchronious" -> ↵Colin Ian King1-1/+1
"synchronous" There is a spelling mistake in an option description. Fix it. Signed-off-by: Colin Ian King <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf test: Add basic 'perf diff' testIan Rogers1-0/+101
There are some old bug reports on perf diff crashing: https://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html Happening across them I was prompted to add two very basic tests that will give some 'perf diff' coverage. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf mem: Fix error on hybrid related to availability of mem event in a PMUKan Liang1-11/+14
The below error can be triggered on a hybrid machine. $ perf mem record -t load sleep 1 event syntax error: 'breakpoint/mem-loads,ldlat=30/P' \___ Bad event or PMU Unable to find PMU or event on a PMU of 'breakpoint' In the perf_mem_events__record_args(), the current perf never checks the availability of a mem event on a given PMU. All the PMUs will be added to the perf mem event list. Perf errors out for the unsupported PMU. Extend perf_mem_event__supported() and take a PMU into account. Check the mem event for each PMU before adding it to the perf mem event list. Optimize the perf_mem_events__init() a little bit. The function is to check whether the mem events are supported in the system. It doesn't need to scan all PMUs. Just return with the first supported PMU is good enough. Fixes: 5752c20f3787c9bc ("perf mem: Scan all PMUs instead of just core ones") Reported-by: Ammy Yi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Tested-by: Ammy Yi <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf vendor events powerpc: Update datasource event name to fix duplicate eventsAthira Rajeev1-4/+14
Running "perf list" on powerpc fails with segfault as below: $ ./perf list Segmentation fault (core dumped) $ This happens because of duplicate events in the JSON list. The powerpc JSON event list contains some event with same event name, but different event code. They are: - PM_INST_FROM_L3MISS (Present in datasource and frontend) - PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked) - PM_MRK_INST_FROM_L3MISS (Present in datasource and marked) - PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked) pmu_events_table__num_events() uses the value from table_pmu->num_entries which includes duplicate events as well. This causes issue during "perf list" and results in a segmentation fault. Since both event codes are valid, append _DSRC to the Data Source events (datasource.json), so that they would have a unique name. Also add PM_DATA_FROM_L2MISS_DSRC and PM_DATA_FROM_L3MISS_DSRC events. With the fix, 'perf list' works as expected. Fixes: fc143580753348c6 ("perf vendor events power10: Update JSON/events") Signed-off-by: Athira Jajeev <[email protected]> Tested-by: Disha Goel <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: [email protected] Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf test: Add basic 'perf list --json" testIan Rogers1-0/+19
Test that JSON output produces valid JSON. Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf test: Use common python setup libraryIan Rogers4-33/+26
Avoid replicated logic by having a common library to set the PYTHON environment variable. Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf build: Shellcheck support for OUTPUT directoryIan Rogers4-42/+27
Migrate Makefile.tests to Build so that variables like rule_mkdir are defined via Makefile.build (needed so the output directory can be created). This requires SHELLCHECK being exported and the clean rule tweaking to remove the files in find. Change find "-perm -o=x" as it was failing on my Debian based Linux kernel tree, switch to using "-executable". Adding a filename prefix of "." to the shellcheck log files is a pain and error prone in make, remove this prefix and just add the shellcheck log files to .gitignore. Fix the command echo so that running the test is displayed. Fixes: 1638b11ef8156c85 ("perf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf") Reviewed-by: Athira Jajeev <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf vendor events arm64 AmpereOneX: Add core PMU events and metricsIlkka Koskinen13-0/+1713
Add JSON files for AmpereOneX core PMU events and metrics. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf vendor events arm64 AmpereOne: Rename BPU_FLUSH_MEM_FAULT to ↵Ilkka Koskinen1-1/+1
GPC_FLUSH_MEM_FAULT The documentation wrongly called the event as BPU_FLUSH_MEM_FAULT and now has been fixed. Correct the name in the perf tool as well. Fixes: a9650b7f6fc09d16 ("perf vendor events arm64: Add AmpereOne core PMU events") Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-05perf vendor events arm64: AmpereOne: Add missing DefaultMetricgroupName fieldsIlkka Koskinen1-0/+2
AmpereOne metrics were missing DefaultMetricgroupName from metrics with "Default" in group name resulting perf to segfault. Add the missing field to address the issue. Fixes: 59faeaf80d02 ("perf vendor events arm64: Fix for AmpereOne metrics") Signed-off-by: Ilkka Koskinen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-12-05perf metrics: Avoid segv if default metricgroup isn't setIan Rogers1-1/+1
A metric is default by having "Default" within its groups. The default metricgroup name needn't be set and this can result in segv in default_metricgroup_cmp and perf_stat__print_shadow_stats_metricgroup that assume it has a value when there is a Default metric group. To avoid the segv initialize the value to "". Fixes: 1c0e47956a8e ("perf metrics: Sort the Default metricgroup") Signed-off-by: Ian Rogers <[email protected]> Reviewed-and-tested-by: Ilkka Koskinen <[email protected]> Cc: James Clark <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-12-04perf test record user-regs: Fix mask for vg registerVeronika Molnarova2-2/+2
The 'vg' register for arm64 shows up in --user_regs as available when masking the variable AT_HWCAP with 1 << 22 returns '1' as done in perf_regs.c. However, in subtests for support of SVE, the check for the 'vg' register is done by masking the variable AT_HWCAP with the value 0x200000 which is equals to 1 << 21 instead of 1 << 22. This results in inconsistencies on certain systems where the test expects that the 'vg' register is not operational when it is, and vice-versa. During the testing on a machine that the test expected not to have the 'vg' register available, 'perf record' with the option --user-regs showed records for the 'vg' register together with all of the others, which means that the mask for the subtest of perf_event_attr is off by one. Change the value of the mask from 0x200000 to 0x400000 to correct it. Fixes: 9440ebdc333dd12e ("perf test arm64: Add attr tests for new VG register") Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Veronika Molnarova <[email protected]> Cc: James Clark <[email protected]> Cc: Michael Petlan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-04perf env: Cache the arch specific strerrno function in perf_env__arch_strerrno()Arnaldo Carvalho de Melo4-7/+12
So that we don't have to go thru the series of strcmp(arch) calls for each id -> string translation. Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Laight <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-04perf env: Introduce perf_env__arch_strerrno()Arnaldo Carvalho de Melo3-4/+15
That will cache the arch specific function translating error numbers to strings. Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Laight <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-03perf beauty: Don't use 'find ... -printf' as it isn't available in busyboxArnaldo Carvalho de Melo1-1/+3
Namhyung reported: I'm seeing a build error on my Alpine linux image which uses busybox + musl libc: In file included from trace/beauty/arch_errno_names.c:1, from builtin-trace.c:899: /build/trace/beauty/generated/arch_errno_name_array.c: In function 'arch_syscalls__strerrno': /build/trace/beauty/generated/arch_errno_name_array.c:142:49: error: unused parameter 'arch' [-Werror=unused-parameter] 142 | const char *arch_syscalls__strerrno(const char *arch, int err) It looks like busybox find command doesn't have -printf option find: unrecognized: -printf , Yesterday 9:16 PM , BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary. Usage: find [-HL] [PATH]... [OPTIONS] [ACTIONS] Search for files and perform actions on them. First failed action stops processing of current file. Defaults: PATH is current directory, action is '-print' So just remove it and pipe find's entry to a basename loop to produce the same result. Then use an alternative loop that relies on the shell to avoid needless forks and execs. The discussion about it generated the impetus to stop doing strcmps to find the right table at each errno to string translation but instead do this just once and then use a function pointer to the right arch specific table. Suggested-by: David Laight <[email protected]> Reported-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Thomas Richter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-03perf docs: Fix man page formatting for 'perf lock'Nick Forrington1-1/+1
This makes "CONTENTION" a top level section (rather than a subsection of "INFO"). Fixes: 79079f21f50a501f ("perf lock: Add -k and -F options to 'contention' subcommand") Reviewed-by: James Clark <[email protected]> Signed-off-by: Nick Forrington <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-29perf test record+probe_libc_inet_pton: Fix call chain match on powerpcLikhitha Korrapati1-1/+4
The perf test "probe libc's inet_pton & backtrace it with ping" fails on powerpc as below: # perf test -v "probe libc's inet_pton & backtrace it with ping" 85: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 96028 ping 96056 [002] 127271.101961: probe_libc:inet_pton: (7fffa1779a60) 7fffa1779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6) FAIL: expected backtrace entry "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$" got "7fffa172a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)" test child finished with -1 ---- end ---- probe libc's inet_pton & backtrace it with ping: FAILED! This test installs a probe on libc's inet_pton function, which will use uprobes and then uses perf trace on a ping to localhost. It gets 3 levels deep backtrace and checks whether it is what we expected or not. The test started failing from RHEL 9.4 where as it works in previous distro version (RHEL 9.2). Test expects gaih_inet function to be part of backtrace. But in the glibc version (2.34-86) which is part of distro where it fails, this function is missing and hence the test is failing. From nm and ping command output we can confirm that gaih_inet function is not present in the expected backtrace for glibc version glibc-2.34-86 [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet 00000000001273e0 t gaih_inet_serv 00000000001cd8d8 r gaih_inet_typeproto [root@xxx perf]# perf script -i /tmp/perf.data.6E8 ping 104048 [000] 128582.508976: probe_libc:inet_pton: (7fff83779a60) 7fff83779a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fff8372a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 11dc73534 [unknown] (/usr/bin/ping) 7fff8362a8c4 __libc_start_call_main+0x84 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) FAIL: expected backtrace entry "gaih_inet.*\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib64/glibc-hwcaps/power10/libc.so.6\)$" got "7fff9d52a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6)" With version glibc-2.34-60 gaih_inet function is present as part of the expected backtrace. So we cannot just remove the gaih_inet function from the backtrace. [root@xxx perf]# nm /usr/lib64/glibc-hwcaps/power10/libc.so.6 | grep gaih_inet 0000000000130490 t gaih_inet.constprop.0 000000000012e830 t gaih_inet_serv 00000000001d45e4 r gaih_inet_typeproto [root@xxx perf]# ./perf script -i /tmp/perf.data.b6S ping 67906 [000] 22699.591699: probe_libc:inet_pton_3: (7fffbdd80820) 7fffbdd80820 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31160 gaih_inet.constprop.0+0xcd0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fffbdd31c7c getaddrinfo+0x14c (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 1140d3558 [unknown] (/usr/bin/ping) This patch solves this issue by doing a conditional skip. If there is a gaih_inet function present in the libc then it will be added to the expected backtrace else the function will be skipped from being added to the expected backtrace. Output with the patch [root@xxx perf]# ./perf test -v "probe libc's inet_pton & backtrace it with ping" 83: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 102662 ping 102692 [000] 127935.549973: probe_libc:inet_pton: (7fff93379a60) 7fff93379a60 __GI___inet_pton+0x0 (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 7fff9332a73c getaddrinfo+0x121c (/usr/lib64/glibc-hwcaps/power10/libc.so.6) 11ef03534 [unknown] (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok Reported-by: Disha Goel <[email protected]> Reviewed-by: Athira Jajeev <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Likhitha Korrapati <[email protected]> Tested-by: Disha Goel <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Disha Goel <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-29perf tests sigtrap: Skip if running on a kernel with sleepable spinlocksArnaldo Carvalho de Melo1-2/+44
There are issues as reported that need some more investigation on the RT kernel front, till that is addressed, skip this test. This test is already skipped for multiple hardware architectures where the tested kernel feature is not supported. Acked-by: Marco Elver <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Clark Williams <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Juri Lelli <[email protected]> Cc: Kate Carcia <[email protected]> Cc: Marco Elver <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-29perf test sigtrap: Generalize the BTF routine to reuse it in this testArnaldo Carvalho de Melo1-20/+40
Move the part that loads the BTF info to a "btf__available()" that will lazy load the BTF info so that if we need it for some other test, which we will in the following cset, we can reuse it. At some point this will move from this specific 'perf test' entry to be used in other parts of perf, do it when needed. Cc: Adrian Hunter <[email protected]> Cc: Clark Williams <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kate Carcia <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-28perf mmap: Lazily initialize zstd streams to save memory when not using itIan Rogers5-43/+59
Zstd streams create dictionaries that can require significant RAM, especially when there is one per-CPU. Tools like 'perf record' won't use the streams without the -z option, and so the creation of the streams is pure overhead. Switch to creating the streams on first use. Committer notes: ssize_t comes from sys/types.h, size_t from stddef.h. This worked on glibc as stdlib.h includes both, but not on musl libc. So do what 'man size_t' says and include sys/types.h and stddef.h instead of stdlib.h Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Changbin Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: German Gomez <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu (Google) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Vincent Whitchurch <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-28perf dwarf-aux: Add die_find_variable_by_addr()Namhyung Kim2-0/+93
The die_find_variable_by_addr() is to find a variables in the given DIE using given (PC-relative) address. Global variables will have a location expression with DW_OP_addr which has an address so can simply compare it with the address. <1><143a7>: Abbrev Number: 2 (DW_TAG_variable) <143a8> DW_AT_name : loops_per_jiffy <143ac> DW_AT_type : <0x1cca> <143b0> DW_AT_external : 1 <143b0> DW_AT_decl_file : 193 <143b1> DW_AT_decl_line : 213 <143b2> DW_AT_location : 9 byte block: 3 b0 46 41 82 ff ff ff ff (DW_OP_addr: ffffffff824146b0) Note that the type-offset should be calculated from the base address of the global variable. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-28perf tools: Add --debug-file option to redirect debug outputYang Jihong2-0/+33
Currently, debug messages is output to stderr, add --debug-file option to support redirection to a specified file. Some test scenarios: # perf --list-opts --help --version --exec-path --html-path --paginate --no-pager --debugfs-dir --buildid-dir --list-cmds --list-opts --debug --debug-file # perf --debug-file No path given for --debug-file. Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS] # perf --debug-file /sys/perf.log record -v true Open debug file '/sys/perf.log' failed: Permission denied Usage: perf [--version] [--help] [OPTIONS] COMMAND [ARGS] # perf --debug-file /tmp/perf.log record -v true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.013 MB perf.data (26 samples) ] # cat /tmp/perf.log DEBUGINFOD_URLS= Using CPUID GenuineIntel-6-3E-4 nr_cblocks: 0 affinity: SYS mmap flush: 1 comp level: 0 mmap size 528384B Control descriptor is not initialized mmap size 528384B Looking at the vmlinux_path (8 entries long) Using /proc/kcore for kernel data Using /proc/kallsyms for symbols symbol:unmap_start file:(null) line:0 offset:0 return:0 lazy:(null) symbol:unmap_complete file:(null) line:0 offset:0 return:0 lazy:(null) symbol:map_start file:(null) line:0 offset:0 return:0 lazy:(null) symbol:map_complete file:(null) line:0 offset:0 return:0 lazy:(null) symbol:reloc_start file:(null) line:0 offset:0 return:0 lazy:(null) symbol:reloc_complete file:(null) line:0 offset:0 return:0 lazy:(null) symbol:init_start file:(null) line:0 offset:0 return:0 lazy:(null) symbol:init_complete file:(null) line:0 offset:0 return:0 lazy:(null) symbol:lll_lock_wait_private file:(null) line:0 offset:0 return:0 lazy:(null) symbol:lll_lock_wait file:(null) line:0 offset:0 return:0 lazy:(null) symbol:setjmp file:(null) line:0 offset:0 return:0 lazy:(null) symbol:longjmp file:(null) line:0 offset:0 return:0 lazy:(null) symbol:longjmp_target file:(null) line:0 offset:0 return:0 lazy:(null) failed to write feature HYBRID_TOPOLOGY Signed-off-by: Yang Jihong <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf annotate: Check if operand has multiple regsNamhyung Kim2-0/+38
It needs to check all possible information in an instruction. Let's add a field indicating if the operand has multiple registers. I'll be used to search type information like in an array access on x86 like: mov 0x10(%rax,%rbx,8), %rcx ------------- here Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf test: Use existing config value for objdump pathJames Clark2-7/+3
There is already an existing config value for changing the objdump path, so instead of having two values that do the same thing, make 'perf test' use annotate.objdump as well. Signed-off-by: James Clark <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yonghong Song <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf vendor events riscv: add T-HEAD C9xx JSON fileInochi Amaoto5-0/+288
Add JSON file of T-HEAD C9xx series events. The event idx (raw value) is summary as following: event id range | support cpu 0x01 - 0x2a | c906,c910,c920 The event ids are based on the public document of T-HEAD and cover the c900 series. These events are the max that c900 series support. Since T-HEAD let manufacturers decide whether events are usable, the final support of the perf events is determined by the pmu node of the soc dtb. Signed-off-by: Inochi Amaoto <[email protected]> Tested-by: Guo Ren <[email protected]> Acked-by: Palmer Dabbelt <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Wang <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jisheng Zhang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wei Fu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/IA1PR20MB495325FCF603BAA841E29281BBBAA@IA1PR20MB4953.namprd20.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf vendor events: Add skx, clx, icx and spr upi bandwidth metricIan Rogers4-0/+24
Add upi_data_receive_bw metric for skylakex, cascadelakex, icelakex and sapphirerapids. The metric was added to perfmon metrics in: https://github.com/intel/perfmon/pull/119 Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Skip data symbol test if buf1 symbol is missingAdrian Hunter1-0/+5
perf data symbol test depends on finding symbol buf1 in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'data symbol' 113: Test data symbol : --- start --- test child forked, pid 125646 Recording workload... [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.577 MB /tmp/__perf_test.perf.data.Jhbdp (7794 samples) ] Cleaning up files... test child finished with -1 ---- end ---- Test data symbol: FAILED! After: $ tools/perf/perf test -v 'data symbol' 113: Test data symbol : --- start --- test child forked, pid 125747 perf does not have symbol 'buf1' perf is missing symbols - skipping test test child finished with -2 ---- end ---- Test data symbol: Skip Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Make data symbol test wait for perf to startAdrian Hunter1-2/+9
The perf data symbol test waits 1 second for perf to run and collect data, which may be too little if perf takes a long time to start up, which has been noticed on systems with many CPUs. Use existing wait_for_perf_to_start helper to wait for perf to start. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Skip branch stack sampling test if brstack_bench symbol is missingAdrian Hunter1-0/+6
The test "Check branch stack sampling" depends on finding symbol brstack_bench (and several others) in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'branch stack sampling' 112: Check branch stack sampling : --- start --- test child forked, pid 123741 Testing user branch stack sampling + grep -E -m1 ^brstack_bench\+[^ ]*/brstack_foo\+[^ ]*/IND_CALL/.*$ /tmp/__perf_test.program.5Dz1U/perf.script + cleanup + rm -rf /tmp/__perf_test.program.5Dz1U test child finished with -1 ---- end ---- Check branch stack sampling: FAILED! After: $ tools/perf/perf test -v 'branch stack sampling' 112: Check branch stack sampling : --- start --- test child forked, pid 125157 perf does not have symbol 'brstack_bench' perf is missing symbols - skipping test test child finished with -2 ---- end ---- Check branch stack sampling: Skip Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Skip Arm64 callgraphs test if leafloop symbol is missingAdrian Hunter1-0/+6
The test "Check Arm64 callgraphs are complete in fp mode" depends on finding symbol leafloop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Skip record test if test_loop symbol is missingAdrian Hunter1-1/+7
perf record test depends on finding symbol test_loop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Note, building with perl support adds option -Wl,-E which causes the linker to add all (global) symbols to the dynamic symbol table. So the test_loop symbol, being global, does not get stripped unless NO_LIBPERL=1 Before: $ make NO_LIBPERL=1 -C tools/perf >/dev/null 2>&1 $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'record tests' 91: perf record tests : --- start --- test child forked, pid 118750 Basic --per-thread mode test Per-thread record [Failed missing output] Register capture test Register capture test [Success] Basic --system-wide mode test System-wide record [Skipped not supported] Basic target workload test Workload record [Failed missing output] test child finished with -1 ---- end ---- perf record tests: FAILED! After: $ tools/perf/perf test -v 'record tests' 91: perf record tests : --- start --- test child forked, pid 120025 perf does not have symbol 'test_loop' perf is missing symbols - skipping test test child finished with -2 ---- end ---- perf record tests: Skip Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests: Skip pipe test if noploop symbol is missingAdrian Hunter1-1/+8
perf pipe recording and injection test depends on finding symbol noploop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 47734 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 47741 47741 -1 |perf [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] cannot find noploop function in pipe #1 test child finished with -1 ---- end ---- perf pipe recording and injection test: FAILED! After: $ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 48996 perf does not have symbol 'noploop' perf is missing symbols - skipping test test child finished with -2 ---- end ---- perf pipe recording and injection test: Skip Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-11-27perf tests lib: Add perf_has_symbol.shAdrian Hunter1-0/+21
Some shell tests depend on finding symbols for perf itself, and fail if perf has been stripped and no debug object is available. Add helper functions to check if perf has a needed symbol. This is preparation for amending the tests themselves to be skipped if a needed symbol is not found. The functions make use of the "Symbols" test which reads and checks symbols from a dso, perf itself by default. Note the "Symbols" test will find symbols using the same method as other perf tests, including, for example, looking in the buildid cache. An alternative would be to prevent the needed symbols from being stripped, which seems to work with gcc's externally_visible attribute, but that attribute is not supported by clang. Another alternative would be to use option -Wl,-E (which is already used when perf is built with perl support) which causes the linker to add all (global) symbols to the dynamic symbol table. Then the required symbols need only be made global in scope to avoid being strippable. However that goes beyond what is needed. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>