aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-04-12perf dsos: Tidy reference counting and lockingIan Rogers5-99/+97
Move more functionality into dsos.c generally from machine.c, renaming functions to match their new usage. The find function is made to always "get" before returning a dso. Reduce the scope of locks in vdso to match the locking paradigm. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240410064214.2755936-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf dsos: Attempt to better abstract DSOs internalsIan Rogers11-86/+97
Move functions from machine and build-id to dsos. Pass 'struct dsos' rather than internal state. Rename some functions to better represent which data structure they operate on. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Anne Macedo <retpolanne@posteo.net> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240410064214.2755936-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf record: Fix debug message placement for test consumptionAdrian Hunter1-2/+2
evlist__config() might mess up the debug output consumed by test "Test per-thread recording" in "Miscellaneous Intel PT testing". Move it out from between the debug prints: "perf record opening and mmapping events" and "perf record done opening and mmapping events" Fixes: da4062021e0e6da5 ("perf tools: Add debug messages and comments for testing") Closes: https://lore.kernel.org/linux-perf-users/ZhVfc5jYLarnGzKa@x1/ Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240411075447.17306-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate: Skip DSOs not foundNamhyung Kim1-0/+2
In some data file, I see the following messages repeated. It seems it doesn't have DSOs in the system and the dso->binary_type is set to DSO_BINARY_TYPE__NOT_FOUND. Let's skip them to avoid the followings. No output from objdump --start-address=0x0000000000000000 --stop-address=0x00000000000000d4 -d --no-show-raw-insn -C "$1" Error running objdump --start-address=0x0000000000000000 --stop-address=0x0000000000000631 -d --no-show-raw-insn -C "$1" ... Closes: https://lore.kernel.org/linux-perf-users/15e1a2847b8cebab4de57fc68e033086aa6980ce.camel@yandex.ru/ Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240410185117.1987239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf report: Do not collect sample histogram unnecessarilyNamhyung Kim2-1/+8
The data type profiling alone doesn't need the sample histogram for functions. It only needs the histogram for the types. Let's remove the condition in the report_callback to check if data type profiling is selected and make sure the annotation has the 'struct annotated_source' instantiated before calling symbol__disassemble(). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf report: Add a menu item to annotate data type in TUINamhyung Kim2-0/+36
When the hist entry has the type info, it should be able to display the annotation browser for the type like in `perf annotate --data-type`. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Support event group display in TUINamhyung Kim1-10/+40
Like in stdio, it should print all events in a group together. Committer notes: Collect it: root@number:~# perf record -a -e '{cpu_core/mem-loads,ldlat=30/P,cpu_core/mem-stores/P}' ^C[ perf record: Woken up 8 times to write data ] [ perf record: Captured and wrote 4.980 MB perf.data (55825 samples) ] root@number:~# Then do it in stdio: root@number:~# perf annotate --stdio --data-type Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples): event[0] = cpu_core/mem-loads,ldlat=30/P event[1] = cpu_core/mem-stores/P ============================================================================ Percent offset size field 100.00 100.00 0 40 union { 100.00 100.00 0 40 struct __pthread_mutex_s __data { 48.61 23.46 0 4 int __lock; 0.00 0.48 4 4 unsigned int __count; 6.38 41.32 8 4 int __owner; 8.74 34.02 12 4 unsigned int __nusers; 35.66 0.26 16 4 int __kind; 0.61 0.45 20 2 short int __spins; 0.00 0.00 22 2 short int __elision; 0.00 0.00 24 16 __pthread_list_t __list { 0.00 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 0.00 32 8 struct __pthread_internal_list* __next; }; }; 0.00 0.00 0 0 char* __size; 48.61 23.94 0 8 long int __align; }; Now with TUI before this patch: root@number:~# perf annotate --tui --data-type Annotate type: 'union ' (790 samples) Percent Offset Size Field 100.00 0 40 union { 100.00 0 40 struct __pthread_mutex_s __data { 48.61 0 4 int __lock; 0.00 4 4 unsigned int __count; 6.38 8 4 int __owner; 8.74 12 4 unsigned int __nusers; 35.66 16 4 int __kind; 0.61 20 2 short int __spins; 0.00 22 2 short int __elision; 0.00 24 16 __pthread_list_t __list { 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 32 8 struct __pthread_internal_list* __next; 0.00 0 0 char* __size; 48.61 0 8 long int __align; }; And now after this patch: Annotate type: 'union ' (790 samples) Percent Offset Size Field 100.00 100.00 0 40 union { 100.00 100.00 0 40 struct __pthread_mutex_s __data { 48.61 23.46 0 4 int __lock; 0.00 0.48 4 4 unsigned int __count; 6.38 41.32 8 4 int __owner; 8.74 34.02 12 4 unsigned int __nusers; 35.66 0.26 16 4 int __kind; 0.61 0.45 20 2 short int __spins; 0.00 0.00 22 2 short int __elision; 0.00 0.00 24 16 __pthread_list_t __list { 0.00 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 0.00 32 8 struct __pthread_internal_list* __next; }; }; 0.00 0.00 0 0 char* __size; 48.61 23.94 0 8 long int __align; }; On a followup patch the --tui output should have this that is present in --stdio: And the --stdio has all the missing info in TUI: Annotate type: 'union ' in /usr/lib64/libc.so.6 (1131 samples): event[0] = cpu_core/mem-loads,ldlat=30/P event[1] = cpu_core/mem-stores/P Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Add hist_entry__annotate_data_tui()Namhyung Kim5-5/+324
Support data type profiling output on TUI. Testing from Arnaldo: First make sure that the debug information for your workload binaries in embedded in them by building it with '-g' or install the debuginfo packages, since our workload is 'find': root@number:~# type find find is hashed (/usr/bin/find) root@number:~# rpm -qf /usr/bin/find findutils-4.9.0-5.fc39.x86_64 root@number:~# dnf debuginfo-install findutils <SNIP> root@number:~# Then collect some data: root@number:~# echo 1 > /proc/sys/vm/drop_caches root@number:~# perf mem record find / > /dev/null [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ] root@number:~# Finally do data-type annotation with the following command, that will default, as 'perf report' to the --tui mode, with lines colored to highlight the hotspots, etc. root@number:~# perf annotate --data-type Annotate type: 'struct predicate' (58 samples) Percent Offset Size Field 100.00 0 312 struct predicate { 0.00 0 8 PRED_FUNC pred_func; 0.00 8 8 char* p_name; 0.00 16 4 enum predicate_type p_type; 0.00 20 4 enum predicate_precedence p_prec; 0.00 24 1 _Bool side_effects; 0.00 25 1 _Bool no_default_print; 0.00 26 1 _Bool need_stat; 0.00 27 1 _Bool need_type; 0.00 28 1 _Bool need_inum; 0.00 32 4 enum EvaluationCost p_cost; 0.00 36 4 float est_success_rate; 0.00 40 1 _Bool literal_control_chars; 0.00 41 1 _Bool artificial; 0.00 48 8 char* arg_text; <SNIP> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Add hist_entry__annotate_data_tty()Namhyung Kim3-105/+122
And move the related code into util/annotate-data.c file. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate: Show progress of sample processingNamhyung Kim1-2/+13
Like 'perf report', it can take a while to process samples. Show a progress window to inform users how that it is not stuck. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Skip sample histogram for stack canaryNamhyung Kim1-2/+3
It's a pseudo data type and has no field. Fixes: b3c95109c131fcc9 ("perf annotate-data: Add stack canary type") Closes: https://lore.kernel.org/lkml/Zhb6jJneP36Z-or0@x1 Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240411033256.2099646-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf tests: Remove dependency on lscpuJames Clark1-1/+3
This check can be done with uname which is more portable. At the same time re-arrange it into a standard if statement so that it's more readable. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Spoorthy S <spoorts2@in.ibm.com> Link: https://lore.kernel.org/r/20240410103458.813656-5-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf map: Remove kernel map before updating start and end addressesJames Clark1-1/+1
In a debug build there is validation that mmap lists are sorted when taking a lock. In machine__update_kernel_mmap() the start and end addresses are updated resulting in an unsorted list before the map is removed from the list. When the map is removed, the lock is taken which triggers the validation and the failure: $ perf test "object code reading" --- start --- perf: util/maps.c:88: check_invariants: Assertion `map__start(prev) <= map__start(map)' failed. Aborted Fix it by updating the addresses after removal, but before insertion. The bug depends on the ordering and type of debug info on the system and doesn't reproduce everywhere. Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Spoorthy S <spoorts2@in.ibm.com> Link: https://lore.kernel.org/r/20240410103458.813656-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf tests: Apply attributes to all events in object code reading testJames Clark1-5/+5
PERF_PMU_CAP_EXTENDED_HW_TYPE results in multiple events being opened on heterogeneous systems. Currently this test only sets its required attributes on the first event. Not disabling enable_on_exec on the other events causes the test to fail because the forked objdump processes are sampled. No tracking event is opened so Perf only knows about its own mappings causing the objdump samples to give the following error: $ perf test -vvv "object code reading" Reading object code for memory address: 0xffff9aaa55ec thread__find_map failed ---- end(-1) ---- 24: Object code reading : FAILED! Fixes: 251aa040244a3b17 ("perf parse-events: Wildcard most "numeric" events") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Spoorthy S <spoorts2@in.ibm.com> Link: https://lore.kernel.org/r/20240410103458.813656-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf tests: Make "test data symbol" more robust on Neoverse N1James Clark1-0/+16
To prevent anyone from seeing a test failure appear as a regression and thinking that it was caused by their code change, insert some noise into the loop which makes it immune to sampling bias issues (errata 1694299). The "test data symbol" test can fail with any unrelated change that shifts the loop into an unfortunate position in the Perf binary which is almost impossible to debug as the root cause of the test failure. Ultimately it's caused by the referenced errata. Fixes: 60abedb8aa902b06 ("perf test: Introduce script for data symbol testing") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Spoorthy S <spoorts2@in.ibm.com> Link: https://lore.kernel.org/r/20240410103458.813656-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf metrics: Remove the "No_group" metric groupIan Rogers1-2/+2
Rather than place metrics without a metric group in "No_group" place them in a a metric group that is their name. Still allow such metrics to be selected if "No_group" is passed, this change just impacts perf list. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240403164636.3429091-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate: Get rid of symbol__ensure_annotate()Namhyung Kim1-12/+2
Now symbol__annotate() is reentrant and it doesn't need to remove non-instruction lines. Let's get rid of symbol__ensure_annotate() and call symbol__annotate() directly. Also we can use it to get the arch pointer instead of calling evsel__get_arch() directly. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405211800.1412920-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Do not delete non-asm linesNamhyung Kim2-25/+74
For data type profiling, it removed non-instruction lines from the list of annotation lines. It was to simplify the implementation dealing with instructions like to calculate the PC-relative address and to search the shortest path to the target instruction or basic block. But it means that it removes all the comments and debug information in the annotate output like source file name and line numbers. To support both code annotation and data type annotation, it'd be better to keep the non-instruction lines as well. So this change is to skip those lines during the data type profiling and to display them in the normal perf annotate output. No function changes intended (other than having more lines). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405211800.1412920-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf annotate-data: Fix global variable lookupNamhyung Kim1-1/+3
The recent change in the global variable handling added a bug to miss setting the return value even if it found a data type. Also add the type name in the debug message. Fixes: 1ebb5e17ef21b492 ("perf annotate-data: Add get_global_var_type()") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405211800.1412920-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08tools subcmd: Add check_if_command_finished()Ian Rogers2-24/+49
Add non-blocking function to check if a 'struct child_process' has completed. If the process has completed the exit code is stored in the 'struct child_process' so that finish_command() returns it. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240405070931.1231245-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Move 'start' field struct to 'struct annotated_source'Namhyung Kim2-6/+6
It's only used in 'perf annotate' output which means functions with actual samples. No need to consume memory for every symbol ('struct annotation'). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240404175716.1225482-10-namhyung@kernel.org Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: LKML <linux-kernel@vger.kernel.org> Cc: <linux-perf-users@vger.kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Move nr_events struct to 'struct annotated_source'Namhyung Kim2-5/+6
It's only used in 'perf annotate' output which means functions with actual samples. No need to consume memory for every symbol ('struct annotation'). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-9-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Move 'max_jump_sources' struct to 'struct annotated_source'Namhyung Kim3-5/+7
It's only used in 'perf annotate' output which means functions with actual samples. No need to consume memory for every symbol ('struct annotation'). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Move 'widths' struct to 'struct annotated_source'Namhyung Kim3-32/+35
It's only used in 'perf annotate' output which means functions with actual samples. No need to consume memory for every symbol ('struct annotation'). Also move the 'max_line_len' field into it as it's related. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Get rid of offsets arrayNamhyung Kim3-29/+7
The struct annotated_source.offsets[] is to save pointers to annotation_line at each offset. We can use annotated_source__get_line() helper instead so let's get rid of the array. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Check annotation lines more efficientlyNamhyung Kim1-21/+35
In some places, it checks annotated (disasm) lines for each byte. But as it already has a list of disasm lines, it'd be better to traverse the list entries instead of checking every offset with linear search (by annotated_source__get_line() helper). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Introduce annotated_source__get_line()Namhyung Kim3-6/+25
It's a helper function to get annotation_line at the given offset without using the offsets array. The goal is to get rid of the offsets array altogether. It just does the linear search but I think it's better to save memory as it won't be called in a hot path. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Staticize some local functionsNamhyung Kim2-6/+5
I found annotation__mark_jump_targets(), annotation__set_offsets() and annotation__init_column_widths() are only used in the same file. Let's make them static. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf annotate: Fix annotation_calc_lines() to pass correct address to ↵Namhyung Kim1-4/+6
get_srcline() It should pass a proper address (i.e. suitable for objdump or addr2line) to get_srcline() in order to work correctly. It used to pass an address with map__rip_2objdump() as the second argument but later it's changed to use notes->start. It's ok in normal cases but it can be changed when annotate_opts.full_addr is set. So let's convert the address directly instead of using the notes->start. Also the last argument is an IP to print symbol offset if requested. So it should pass symbol-relative address. Fixes: 7d18a824b5e57ddd ("perf annotate: Toggle full address <-> offset display") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240404175716.1225482-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf script: Consolidate capstone print functionsAdrian Hunter3-86/+68
Consolidate capstone print functions, to reduce duplication. Amend call sites to use a file pointer for output, which is consistent with most perf tools print functions. Add print_opts with an option to print also the hex value of a resolved symbol+offset. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-4-ak@linux.intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Added missing inttypes.h include to use PRIx64 in util/print_insn.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-05perf script: Add capstone support for '-F +brstackdisasm'Andi Kleen5-9/+86
Support capstone output for the '-F +brstackinsn' branch dump. The new output is enabled with the new field 'brstackdisasm'. This was possible before with --xed, but now also allow it for users that don't have xed using the builtin capstone support. Before: perf record -b emacs -Q --batch '()' perf script -F +brstackinsn ... emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [3] 00007f0ab2d17105 insn: 73 51 00007f0ab2d17107 insn: 48 89 c1 00007f0ab2d1710a insn: 48 39 ca 00007f0ab2d1710d insn: 73 96 00007f0ab2d1710f insn: 48 8d 04 11 00007f0ab2d17113 insn: 48 d1 e8 00007f0ab2d17116 insn: 49 8d 34 c1 00007f0ab2d1711a insn: 44 3a 06 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 insn: 73 51 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 insn: 48 8d 50 01 00007f0ab2d1715c insn: eb 92 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 insn: 48 39 ca 00007f0ab2d170f3 insn: 73 b0 # PRED 1 cycles [9] 2.00 IPC After (perf must be compiled with capstone): perf script -F +brstackdisasm ... emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [3] 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 00007f0ab2d17107 movq %rax, %rcx 00007f0ab2d1710a cmpq %rcx, %rdx 00007f0ab2d1710d jae intel_check_word.constprop.0+0x75 00007f0ab2d1710f leaq (%rcx, %rdx), %rax 00007f0ab2d17113 shrq $1, %rax 00007f0ab2d17116 leaq (%r9, %rax, 8), %rsi 00007f0ab2d1711a cmpb (%rsi), %r8b 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 leaq 1(%rax), %rdx 00007f0ab2d1715c jmp intel_check_word.constprop.0+0xc0 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 cmpq %rcx, %rdx 00007f0ab2d170f3 jae intel_check_word.constprop.0+0x75 # PRED 1 cycles [9] 2.00 IPC Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-3-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-05perf script: Support 32bit code under 64bit OS with capstoneAndi Kleen3-10/+28
Use the DSO to resolve whether an IP is 32bit or 64bit and use that to configure capstone to the correct mode. This allows to correctly disassemble 32bit code under a 64bit OS. % cat > loop.c volatile int var; int main(void) { int i; for (i = 0; i < 100000; i++) var++; } % gcc -m32 -o loop loop.c % perf record -e cycles:u ./loop % perf script -F +disasm loop 82665 1833176.618023: 1 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618029: 1 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618031: 7 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618034: 91 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618036: 1242 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-2-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-04perf stat: Do not fail on metrics on s390 z/VM systemsThomas Richter1-14/+14
On s390 z/VM virtual machines command 'perf list' also displays metrics: # perf list | grep -A 20 'Metric Groups:' Metric Groups: No_group: cpi [Cycles per Instruction] est_cpi [Estimated Instruction Complexity CPI infinite Level 1] finite_cpi [Cycles per Instructions from Finite cache/memory] l1mp [Level One Miss per 100 Instructions] l2p [Percentage sourced from Level 2 cache] l3p [Percentage sourced from Level 3 on same chip cache] l4lp [Percentage sourced from Level 4 Local cache on same book] l4rp [Percentage sourced from Level 4 Remote cache on different book] memp [Percentage sourced from memory] .... # The command # perf stat -M cpi -- true event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/.....' \___ Bad event or PMU Unable to find PMU or event on a PMU of 'CPU_CYCLES' event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/...' \___ Cannot find PMU `CPU_CYCLES'. Missing kernel support? # fails. 'perf stat' should not fail on metrics when the referenced CPU Counter Measurement PMU is not available. Output after: # perf stat -M est_cpi -- sleep 1 Performance counter stats for 'sleep 1': 1,000,887,494 ns duration_time # 0.00 est_cpi 1.000887494 seconds time elapsed 0.000143000 seconds user 0.000662000 seconds sys # Fixes: 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390") Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20240404064806.1362876-2-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-04perf report: Fix PAI counter names for s390 virtual machinesThomas Richter1-1/+1
s390 introduced the Processor Activity Instrumentation (PAI) counter facility on LPAR and virtual machines z/VM for models 3931 and 3932. These counters are stored as raw data in the perf.data file and are displayed with: # perf report -i /tmp//perfout-635468 -D | grep Counter Counter:007 <unknown> Value:0x00000000000186a0 Counter:032 <unknown> Value:0x0000000000000001 Counter:032 <unknown> Value:0x0000000000000001 Counter:032 <unknown> Value:0x0000000000000001 # However on z/VM virtual machines, the counter names are not retrieved from the PMU and are shown as '<unknown>'. This is caused by the CPU string saved in the mapfile.csv for this machine: ^IBM.393[12].*3\.7.[[:xdigit:]]+$,3,cf_z16,core This string contains the CPU Measurement facility first and second version number and authorization level (3\.7.[[:xdigit:]]+). These numbers do not apply to the PAI counter facility. In fact they can be omitted. Shorten the CPU identification string for this machine to manufacturer and model. This is sufficient for all PMU devices. Output after: # perf report -i /tmp//perfout-635468 -D | grep Counter Counter:007 km_aes_128 Value:0x00000000000186a0 Counter:032 kma_gcm_aes_256 Value:0x0000000000000001 Counter:032 kma_gcm_aes_256 Value:0x0000000000000001 Counter:032 kma_gcm_aes_256 Value:0x0000000000000001 # Fixes: b539deafbadb2fc6 ("perf report: Add s390 raw data interpretation for PAI counters") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20240404064806.1362876-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Initialize 'arch' variable not to trip some ↵Arnaldo Carvalho de Melo1-1/+3
-Werror=maybe-uninitialized In some older distros the build is failing due to -Werror=maybe-uninitialized, in this case we know that this isn't the case because 'arch' gets initialized by evsel__get_arch(), so make sure it is initialized to NULL before returning from evsel__get_arch(), as suggested by Ian Rogers. E.g.: 32 17.12 opensuse:15.5 : FAIL gcc version 7.5.0 (SUSE Linux) util/annotate.c: In function 'hist_entry__get_data_type': util/annotate.c:2269:15: error: 'arch' may be used uninitialized in this function [-Werror=maybe-uninitialized] struct arch *arch; ^~~~ cc1: all warnings being treated as errors 43 7.30 ubuntu:18.04-x-powerpc64el : FAIL gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) util/annotate.c: In function 'hist_entry__get_data_type': util/annotate.c:2351:36: error: 'arch' may be used uninitialized in this function [-Werror=maybe-uninitialized] if (map__dso(ms->map)->kernel && arch__is(arch, "x86") && ^~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/CAP-5=fUqtjxAsmdGrnkjhUTLHs-JvV10TtxyocpYDJK_+LYTiQ@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf build: Add LIBTRACEEVENT_DIR build optionYang Jihong1-4/+16
Currently, when libtraceevent is not linked, perf does not support tracepoint: # ./perf record -e sched:sched_switch -a sleep 10 event syntax error: 'sched:sched_switch' \___ unsupported tracepoint libtraceevent is necessary for tracepoint support Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events For cross-compilation scenario, library may not be installed in the default system path. Based on the above requirements, add LIBTRACEEVENT_DIR build option to support specifying path of libtraceevent. Example: 1. Cross compile libtraceevent # cd /opt/libtraceevent # CROSS_COMPILE=aarch64-linux-gnu- make 2. Cross compile perf # cd tool/perf # make VF=1 ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- NO_LIBELF=1 LDFLAGS=--static LIBTRACEEVENT_DIR=/opt/libtraceevent <SNIP> Auto-detecting system features: <SNIP> ... LIBTRACEEVENT_DIR: /opt/libtraceevent Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240314063000.2139877-1-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf beauty: Fix AT_EACCESS undeclared build error for system with kernel ↵Yang Jihong1-0/+8
versions lower than v5.8 In the environment of ubuntu 20.04 (the version of kernel headers is 5.4), there is an error in building perf: CC trace/beauty/fs_at_flags.o trace/beauty/fs_at_flags.c: In function ‘faccessat2__scnprintf_flags’: trace/beauty/fs_at_flags.c:35:14: error: ‘AT_EACCESS’ undeclared (first use in this function); did you mean ‘DN_ACCESS’? 35 | if (flags & AT_EACCESS) { | ^~~~~~~~~~ | DN_ACCESS trace/beauty/fs_at_flags.c:35:14: note: each undeclared identifier is reported only once for each function it appears in commit 8a1ad4413519b10f ("tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h") removes fcntl.h from tools headers directory, and fs_at_flags.c uses the 'AT_EACCESS' macro. This macro was introduced in the kernel version v5.8. For system with a kernel version older than this version, it will cause compilation to fail. Fixes: 8a1ad4413519b10f ("tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h") Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240403122558.1438841-1-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Add symbol name when using capstoneNamhyung Kim1-3/+71
This is to keep the existing behavior with objdump. It needs to show symbol information of global variables like below: Percent | Source code & Disassembly of elf for cycles:P (1 samples, percent: local period) ------------------------------------------------------------------------------------------------ : 0 0xffffffff81338f70 <vm_normal_page>: 0.00 : ffffffff81338f70: endbr64 0.00 : ffffffff81338f74: callq 0xffffffff81083a40 0.00 : ffffffff81338f79: movq %rdi, %r8 0.00 : ffffffff81338f7c: movq %rdx, %rdi 0.00 : ffffffff81338f7f: callq *0x17021c3(%rip) # ffffffff82a3b148 <pv_ops+0x1e8> 0.00 : ffffffff81338f85: movq 0xffbf3c(%rip), %rdx # ffffffff82334ec8 <physical_mask> 0.00 : ffffffff81338f8c: testq %rax, %rax ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 0.00 : ffffffff81338f8f: je 0xffffffff81338fd0 here 0.00 : ffffffff81338f91: movq %rax, %rcx 0.00 : ffffffff81338f94: andl $1, %ecx Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Use libcapstone to disassembleNamhyung Kim1-0/+160
Now it can use the capstone library to disassemble the instructions. Let's use that (if available) for perf annotate to speed up. Currently it only supports x86 architecture. With this change I can see ~3x speed up in data type profiling. But note that capstone cannot give the source file and line number info. For now, users should use the external objdump for that by specifying the --objdump option explicitly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Split out util/disasm.cNamhyung Kim5-1657/+1709
The util/annotate.c code has both disassembly and sample annotation related codes. Factor out the disasm part so that it can be handled more easily. No functional changes intended. Committer notes: Add missing include env.h, util.h, bpf-event.h and bpf-util.h to disasm.c, to fix things like: util/disasm.c: In function ‘symbol__disassemble_bpf’: util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration] 1203 | perf_exe(tpath, sizeof(tpath)); | ^~~~~~~~ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Add and use ins__is_nop()Namhyung Kim2-1/+7
Likewise, add ins__is_nop() to check if the current instruction is NOP. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Use ins__is_xxx() if possibleNamhyung Kim1-4/+4
This is to prepare separation of disasm related code. Use the public ins API instead of checking the internal data structure. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf evsel: Use evsel__name_is() helperYang Jihong6-33/+22
Code cleanup, replace strcmp(evsel__name(evsel, {NAME})) with evsel__name_is() helper. No functional change. Committer notes: Fix this build error: trace.syscalls.events.bpf_output = evlist__last(trace.evlist); - assert(evsel__name_is(trace.syscalls.events.bpf_output), "__augmented_syscalls__"); + assert(evsel__name_is(trace.syscalls.events.bpf_output, "__augmented_syscalls__")); Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240401062724.1006010-3-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf sched timehist: Fix -g/--call-graph option failureYang Jihong1-2/+5
When 'perf sched' enables the call-graph recording, sample_type of dummy event does not have PERF_SAMPLE_CALLCHAIN, timehist_check_attr() checks that the evsel does not have a callchain, and set show_callchain to 0. Currently 'perf sched timehist' only saves callchain when processing the 'sched:sched_switch event', timehist_check_attr() only needs to determine whether the event has PERF_SAMPLE_CALLCHAIN. Before: # perf sched record -g true [ perf record: Woken up 0 times to write data ] [ perf record: Captured and wrote 4.153 MB perf.data (7536 samples) ] # perf sched timehist Samples do not have callchains. time cpu task name wait time sch delay run time [tid/pid] (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- 147851.826019 [0000] perf[285035] 0.000 0.000 0.000 147851.826029 [0000] migration/0[15] 0.000 0.003 0.009 147851.826063 [0001] perf[285035] 0.000 0.000 0.000 147851.826069 [0001] migration/1[21] 0.000 0.003 0.006 <SNIP> After: # perf sched record -g true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.572 MB perf.data (822 samples) ] # perf sched timehist time cpu task name waittime sch delay runtime [tid/pid] (msec) (msec) (msec) ----------- --- --------------- -------- -------- ----- 4193.035164 [0] perf[277062] 0.000 0.000 0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion 4193.035174 [0] migration/0[15] 0.000 0.003 0.009 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork 4193.035207 [1] perf[277062] 0.000 0.000 0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion 4193.035214 [1] migration/1[21] 0.000 0.003 0.007 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork <SNIP> Fixes: 9c95e4ef06572349 ("perf evlist: Add evlist__findnew_tracking_event() helper") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20240401062724.1006010-2-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Honor output options with --data-typeNamhyung Kim1-6/+38
For data type profiling output, it should be in sync with normal output so make it display percentage for each field. Also use coloring scheme for users to identify fields with big overhead easily. Users can use --show-total-period or --show-nr-samples to change the output style like in the normal perf annotate output. Before: $ perf annotate --data-type Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples): ============================================================================ samples offset size field 34 0 9792 struct task_struct { 2 0 24 struct thread_info thread_info { 0 0 8 long unsigned int flags; 1 8 8 long unsigned int syscall_work; 0 16 4 u32 status; 1 20 4 u32 cpu; }; After: $ perf annotate --data-type Annotate type: 'struct task_struct' in [kernel.kallsyms] (34 samples): ============================================================================ Percent offset size field 100.00 0 9792 struct task_struct { 3.55 0 24 struct thread_info thread_info { 0.00 0 8 long unsigned int flags; 1.63 8 8 long unsigned int syscall_work; 0.00 16 4 u32 status; 1.91 20 4 u32 cpu; }; Committer testing: First collect a suitable perf.data file for use with 'perf annotate --data-type': root@number:~# perf mem record -a sleep 1s [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 11.047 MB perf.data (3466 samples) ] root@number:~# Then, before: root@number:~# perf annotate --data-type Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples): ============================================================================ samples offset size field 6 0 40 union { 6 0 40 struct __pthread_mutex_s __data { 2 0 4 int __lock; 0 4 4 unsigned int __count; 0 8 4 int __owner; 1 12 4 unsigned int __nusers; 2 16 4 int __kind; 1 20 2 short int __spins; 0 22 2 short int __elision; 0 24 16 __pthread_list_t __list { 0 24 8 struct __pthread_internal_list* __prev; 0 32 8 struct __pthread_internal_list* __next; }; }; 0 0 0 char* __size; 2 0 8 long int __align; }; <SNIP> And after: Annotate type: 'union ' in /usr/lib64/libc.so.6 (6 samples): ============================================================================ Percent offset size field 100.00 0 40 union { 100.00 0 40 struct __pthread_mutex_s __data { 31.27 0 4 int __lock; 0.00 4 4 unsigned int __count; 0.00 8 4 int __owner; 7.67 12 4 unsigned int __nusers; 53.10 16 4 int __kind; 7.96 20 2 short int __spins; 0.00 22 2 short int __elision; 0.00 24 16 __pthread_list_t __list { 0.00 24 8 struct __pthread_internal_list* __prev; 0.00 32 8 struct __pthread_internal_list* __next; }; }; 0.00 0 0 char* __size; 31.27 0 8 long int __align; }; <SNIP> The lines with percentages >= 7.67 have its percentages red colored. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240322224313.423181-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Get rid of duplicate --group option itemNamhyung Kim1-2/+0
The options array in cmd_annotate() has duplicate --group options. It only needs one and let's get rid of the other. $ perf annotate -h 2>&1 | grep group --group Show event group information together --group Show event group information together Fixes: 7ebaf4890f63eb90 ("perf annotate: Support '--group' option") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240322224313.423181-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf tools: Add Kan Liang to MAINTAINERS as a reviewerArnaldo Carvalho de Melo1-0/+1
Kan has been reviewing patches regularly, add him as a perf tools reviewer so that people CC him on new patches. Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: "Liang, Kan" <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf beauty: Move uapi/linux/vhost.h copy out of the directory used to build ↵Arnaldo Carvalho de Melo4-7/+6
perf It is only used to generate string tables, not to build perf, so move it to the tools/perf/trace/beauty/include/ hierarchy, that is used just for scraping. This is a something that should've have happened, as happened with the linux/socket.h scrapper, do it now as Ian suggested while doing an audit/refactor session in the headers used by perf. No other tools/ living code uses it, just <linux/vhost.h> coming from either 'make install_headers' or from the system /usr/include/ directory. Suggested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf dso: Reorder members to save space in 'struct dso'Ian Rogers1-42/+42
Save 40 bytes and move from 8 to 7 cache lines. Make member dwfl dependent on being a powerpc build. Squeeze bits of int/enum types when appropriate. Remove holes/padding by reordering variables. Before: struct dso { struct mutex lock; /* 0 40 */ struct list_head node; /* 40 16 */ struct rb_node rb_node __attribute__((__aligned__(8))); /* 56 24 */ /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ struct rb_root * root; /* 80 8 */ struct rb_root_cached symbols; /* 88 16 */ struct symbol * * symbol_names; /* 104 8 */ size_t symbol_names_len; /* 112 8 */ struct rb_root_cached inlined_nodes; /* 120 16 */ /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */ struct rb_root_cached srclines; /* 136 16 */ struct { u64 addr; /* 152 8 */ struct symbol * symbol; /* 160 8 */ } last_find_result; /* 152 16 */ void * a2l; /* 168 8 */ char * symsrc_filename; /* 176 8 */ unsigned int a2l_fails; /* 184 4 */ enum dso_space_type kernel; /* 188 4 */ /* --- cacheline 3 boundary (192 bytes) --- */ _Bool is_kmod; /* 192 1 */ /* XXX 3 bytes hole, try to pack */ enum dso_swap_type needs_swap; /* 196 4 */ enum dso_binary_type symtab_type; /* 200 4 */ enum dso_binary_type binary_type; /* 204 4 */ enum dso_load_errno load_errno; /* 208 4 */ u8 adjust_symbols:1; /* 212: 0 1 */ u8 has_build_id:1; /* 212: 1 1 */ u8 header_build_id:1; /* 212: 2 1 */ u8 has_srcline:1; /* 212: 3 1 */ u8 hit:1; /* 212: 4 1 */ u8 annotate_warned:1; /* 212: 5 1 */ u8 auxtrace_warned:1; /* 212: 6 1 */ u8 short_name_allocated:1; /* 212: 7 1 */ u8 long_name_allocated:1; /* 213: 0 1 */ u8 is_64_bit:1; /* 213: 1 1 */ /* XXX 6 bits hole, try to pack */ _Bool sorted_by_name; /* 214 1 */ _Bool loaded; /* 215 1 */ u8 rel; /* 216 1 */ /* XXX 7 bytes hole, try to pack */ struct build_id bid; /* 224 32 */ /* --- cacheline 4 boundary (256 bytes) --- */ u64 text_offset; /* 256 8 */ u64 text_end; /* 264 8 */ const char * short_name; /* 272 8 */ const char * long_name; /* 280 8 */ u16 long_name_len; /* 288 2 */ u16 short_name_len; /* 290 2 */ /* XXX 4 bytes hole, try to pack */ void * dwfl; /* 296 8 */ struct auxtrace_cache * auxtrace_cache; /* 304 8 */ int comp; /* 312 4 */ /* XXX 4 bytes hole, try to pack */ /* --- cacheline 5 boundary (320 bytes) --- */ struct { struct rb_root cache; /* 320 8 */ int fd; /* 328 4 */ int status; /* 332 4 */ u32 status_seen; /* 336 4 */ /* XXX 4 bytes hole, try to pack */ u64 file_size; /* 344 8 */ struct list_head open_entry; /* 352 16 */ u64 elf_base_addr; /* 368 8 */ u64 debug_frame_offset; /* 376 8 */ /* --- cacheline 6 boundary (384 bytes) --- */ u64 eh_frame_hdr_addr; /* 384 8 */ u64 eh_frame_hdr_offset; /* 392 8 */ } data; /* 320 80 */ struct { u32 id; /* 400 4 */ u32 sub_id; /* 404 4 */ struct perf_env * env; /* 408 8 */ } bpf_prog; /* 400 16 */ union { void * priv; /* 416 8 */ u64 db_id; /* 416 8 */ }; /* 416 8 */ struct nsinfo * nsinfo; /* 424 8 */ struct dso_id id; /* 432 24 */ /* --- cacheline 7 boundary (448 bytes) was 8 bytes ago --- */ refcount_t refcnt; /* 456 4 */ char name[]; /* 460 0 */ /* size: 464, cachelines: 8, members: 49 */ /* sum members: 440, holes: 4, sum holes: 18 */ /* sum bitfield members: 10 bits, bit holes: 1, sum bit holes: 6 bits */ /* padding: 4 */ /* forced alignments: 1 */ /* last cacheline: 16 bytes */ } __attribute__((__aligned__(8))); After: struct dso { struct mutex lock; /* 0 40 */ struct list_head node; /* 40 16 */ struct rb_node rb_node __attribute__((__aligned__(8))); /* 56 24 */ /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ struct rb_root * root; /* 80 8 */ struct rb_root_cached symbols; /* 88 16 */ struct symbol * * symbol_names; /* 104 8 */ size_t symbol_names_len; /* 112 8 */ struct rb_root_cached inlined_nodes; /* 120 16 */ /* --- cacheline 2 boundary (128 bytes) was 8 bytes ago --- */ struct rb_root_cached srclines; /* 136 16 */ struct { u64 addr; /* 152 8 */ struct symbol * symbol; /* 160 8 */ } last_find_result; /* 152 16 */ struct build_id bid; /* 168 32 */ /* --- cacheline 3 boundary (192 bytes) was 8 bytes ago --- */ u64 text_offset; /* 200 8 */ u64 text_end; /* 208 8 */ const char * short_name; /* 216 8 */ const char * long_name; /* 224 8 */ void * a2l; /* 232 8 */ char * symsrc_filename; /* 240 8 */ struct nsinfo * nsinfo; /* 248 8 */ /* --- cacheline 4 boundary (256 bytes) --- */ struct auxtrace_cache * auxtrace_cache; /* 256 8 */ union { void * priv; /* 264 8 */ u64 db_id; /* 264 8 */ }; /* 264 8 */ struct { struct perf_env * env; /* 272 8 */ u32 id; /* 280 4 */ u32 sub_id; /* 284 4 */ } bpf_prog; /* 272 16 */ struct { struct rb_root cache; /* 288 8 */ struct list_head open_entry; /* 296 16 */ u64 file_size; /* 312 8 */ /* --- cacheline 5 boundary (320 bytes) --- */ u64 elf_base_addr; /* 320 8 */ u64 debug_frame_offset; /* 328 8 */ u64 eh_frame_hdr_addr; /* 336 8 */ u64 eh_frame_hdr_offset; /* 344 8 */ int fd; /* 352 4 */ int status; /* 356 4 */ u32 status_seen; /* 360 4 */ } data; /* 288 80 */ /* XXX last struct has 4 bytes of padding */ struct dso_id id; /* 368 24 */ /* --- cacheline 6 boundary (384 bytes) was 8 bytes ago --- */ unsigned int a2l_fails; /* 392 4 */ int comp; /* 396 4 */ refcount_t refcnt; /* 400 4 */ enum dso_load_errno load_errno; /* 404 4 */ u16 long_name_len; /* 408 2 */ u16 short_name_len; /* 410 2 */ enum dso_binary_type symtab_type:8; /* 412: 0 4 */ enum dso_binary_type binary_type:8; /* 412: 8 4 */ enum dso_space_type kernel:2; /* 412:16 4 */ enum dso_swap_type needs_swap:2; /* 412:18 4 */ /* Bitfield combined with next fields */ _Bool is_kmod:1; /* 414: 4 1 */ u8 adjust_symbols:1; /* 414: 5 1 */ u8 has_build_id:1; /* 414: 6 1 */ u8 header_build_id:1; /* 414: 7 1 */ u8 has_srcline:1; /* 415: 0 1 */ u8 hit:1; /* 415: 1 1 */ u8 annotate_warned:1; /* 415: 2 1 */ u8 auxtrace_warned:1; /* 415: 3 1 */ u8 short_name_allocated:1; /* 415: 4 1 */ u8 long_name_allocated:1; /* 415: 5 1 */ u8 is_64_bit:1; /* 415: 6 1 */ /* XXX 1 bit hole, try to pack */ _Bool sorted_by_name; /* 416 1 */ _Bool loaded; /* 417 1 */ u8 rel; /* 418 1 */ char name[]; /* 419 0 */ /* size: 424, cachelines: 7, members: 48 */ /* sum members: 415 */ /* sum bitfield members: 31 bits, bit holes: 1, sum bit holes: 1 bits */ /* padding: 5 */ /* paddings: 1, sum paddings: 4 */ /* forced alignments: 1 */ /* last cacheline: 40 bytes */ } __attribute__((__aligned__(8))); Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Markus Elfring <Markus.Elfring@web.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240321160300.1635121-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf lock contention: Trim backtrace by skipping traceiter functionsAnne Macedo2-1/+25
The 'perf lock contention' program currently shows the caller of the locks as __traceiter_contention_begin+0x??. This caller can be ignored, as it is from the traceiter itself. Instead, it should show the real callers for the locks. When fiddling with the --stack-skip parameter, the actual callers for the locks start to show up. However, just ignore the __traceiter_contention_begin and the __traceiter_contention_end symbols so the actual callers will show up. Before this patch is applied: sudo perf lock con -a -b -- sleep 3 contended total wait max wait avg wait type caller 8 2.33 s 2.28 s 291.18 ms rwlock:W __traceiter_contention_begin+0x44 4 2.33 s 2.28 s 582.35 ms rwlock:W __traceiter_contention_begin+0x44 7 140.30 ms 46.77 ms 20.04 ms rwlock:W __traceiter_contention_begin+0x44 2 63.35 ms 33.76 ms 31.68 ms mutex trace_contention_begin+0x84 2 46.74 ms 46.73 ms 23.37 ms rwlock:W __traceiter_contention_begin+0x44 1 13.54 us 13.54 us 13.54 us mutex trace_contention_begin+0x84 1 3.67 us 3.67 us 3.67 us rwsem:R __traceiter_contention_begin+0x44 Before this patch is applied - using --stack-skip 5 sudo perf lock con --stack-skip 5 -a -b -- sleep 3 contended total wait max wait avg wait type caller 2 2.24 s 2.24 s 1.12 s rwlock:W do_epoll_wait+0x5a0 4 1.65 s 824.21 ms 412.08 ms rwlock:W do_exit+0x338 2 824.35 ms 824.29 ms 412.17 ms spinlock get_signal+0x108 2 824.14 ms 824.14 ms 412.07 ms rwlock:W release_task+0x68 1 25.22 ms 25.22 ms 25.22 ms mutex cgroup_kn_lock_live+0x58 1 24.71 us 24.71 us 24.71 us spinlock do_exit+0x44 1 22.04 us 22.04 us 22.04 us rwsem:R lock_mm_and_find_vma+0xb0 After this patch is applied: sudo ./perf lock con -a -b -- sleep 3 contended total wait max wait avg wait type caller 4 4.13 s 2.07 s 1.03 s rwlock:W release_task+0x68 2 2.07 s 2.07 s 1.03 s rwlock:R mm_update_next_owner+0x50 2 2.07 s 2.07 s 1.03 s rwlock:W do_exit+0x338 1 41.56 ms 41.56 ms 41.56 ms mutex cgroup_kn_lock_live+0x58 2 36.12 us 18.83 us 18.06 us rwlock:W do_exit+0x338 Signed-off-by: Anne Macedo <retpolanne@posteo.net> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240319143629.3422590-1-retpolanne@posteo.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>