aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2023-09-29perf pmus: Make PMU alias name loading lazyIan Rogers3-20/+31
PMU alias names were computed when the first perf_pmu is created, scanning all PMUs in event sources for a file called alias that generally doesn't exist. Switch to trying to load the file when all PMU related files are loaded in lookup. This would cause a PMU name lookup of an alias name to fail if no PMUs were loaded, so in that case all PMUs are loaded and the find repeated. The overhead is similar but in the (very) general case not all PMUs are scanned for the alias file. As the overhead occurs once per invocation it doesn't show in perf bench internals pmu-scan. On a tigerlake machine, the number of openat system calls for an event of cpu/cycles/ with perf stat reduces from 94 to 69 (ie 25 fewer openat calls). Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Leo Yan <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-27perf metric: "Compat" supports regular expression matching identifiersJing Zhang1-1/+1
The jevent "Compat" is used for uncore PMU alias or metric definitions. The same PMU driver has different PMU identifiers due to different hardware versions and types, but they may have some common PMU metric. Since a Compat value can only match one identifier, when adding the same metric to PMUs with different identifiers, each identifier needs to be defined once, which is not streamlined enough. So let "Compat" support using regular expression to match multiple identifiers for uncore PMU metric. Signed-off-by: Jing Zhang <[email protected]> Reviewed-by: John Garry <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Shuai Xue <[email protected]> Cc: Zhuo Song <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-27perf pmu: "Compat" supports regular expression matching identifiersJing Zhang2-2/+26
The jevent "Compat" is used for uncore PMU alias or metric definitions. The same PMU driver has different PMU identifiers due to different hardware versions and types, but they may have some common PMU event. Since a Compat value can only match one identifier, when adding the same event alias to PMUs with different identifiers, each identifier needs to be defined once, which is not streamlined enough. So let "Compat" support using regular expression to match identifiers for uncore PMU alias. For example, if the "Compat" value is set to "43401|43c01", it would be able to match PMU identifiers such as "43401" or "43c01", which correspond to CMN600_r0p0 or CMN700_r0p0. Signed-off-by: Jing Zhang <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Shuai Xue <[email protected]> Cc: Zhuo Song <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-27perf record: Fix BTF type checks in the off-cpu profilingNamhyung Kim1-2/+3
The BTF func proto for a tracepoint has one more argument than the actual tracepoint function since it has a context argument at the begining. So it should compare to 5 when the tracepoint has 4 arguments. typedef void (*btf_trace_sched_switch)(void *, bool, struct task_struct *, struct task_struct *, unsigned int); Also, recent change in the perf tool would use a hand-written minimal vmlinux.h to generate BTF in the skeleton. So it won't have the info of the tracepoint. Anyway it should use the kernel's vmlinux BTF to check the type in the kernel. Fixes: b36888f71c85 ("perf record: Handle argument change in sched_switch") Reviewed-by: Ian Rogers <[email protected]> Acked-by: Song Liu <[email protected]> Cc: Hao Luo <[email protected]> CC: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-26perf evlist: Avoid frequency mode for the dummy eventIan Rogers1-2/+3
Dummy events are created with an attribute where the period and freq are zero. evsel__config will then see the uninitialized values and initialize them in evsel__default_freq_period. As fequency mode is used by default the dummy event would be set to use frequency mode. However, this has no effect on the dummy event but does cause unnecessary timers/interrupts. Avoid this overhead by setting the period to 1 for dummy events. evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and period=1. This isn't necessary after this change and so the setting is removed. From Stephane: The dummy event is not counting anything. It is used to collect mmap records and avoid a race condition during the synthesize mmap phase of perf record. As such, it should not cause any overhead during active profiling. Yet, it did. Because of a bug the dummy event was programmed as a sampling event in frequency mode. Events in that mode incur more kernel overheads because on timer tick, the kernel has to look at the number of samples for each event and potentially adjust the sampling period to achieve the desired frequency. The dummy event was therefore adding a frequency event to task and ctx contexts we may otherwise not have any, e.g., perf record -a -e cpu/event=0x3c,period=10000000/. On each timer tick the perf_adjust_freq_unthr_context() is invoked and if ctx->nr_freq is non-zero, then the kernel will loop over ALL the events of the context looking for frequency mode ones. In doing, so it locks the context, and enable/disable the PMU of each hw event. If all the events of the context are in period mode, the kernel will have to traverse the list for nothing incurring overhead. The overhead is multiplied by a very large factor when this happens in a guest kernel. There is no need for the dummy event to be in frequency mode, it does not count anything and therefore should not cause extra overhead for no reason. Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor") Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-26perf pmu: Fix perf stat output with correct scale and unitWyes Karny1-4/+4
The perf_pmu__parse_* functions for the sysfs files of pmu event’s scale, unit, per-pkg and snapshot were updated in commit 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs"). However, the paths for these sysfs files were incorrect. This resulted in perf stat reporting values with wrong scaling and missing units. This is fixed by correcting the paths for these sysfs files. Before this fix: $sudo perf stat -e power/energy-pkg/ -- sleep 2 Performance counter stats for 'system wide': 351,217,188,864 power/energy-pkg/ 2.004127961 seconds time elapsed After this fix: $sudo perf stat -e power/energy-pkg/ -- sleep 2 Performance counter stats for 'system wide': 80.58 Joules power/energy-pkg/ 2.004009749 seconds time elapsed Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Wyes Karny <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf parse-events: Fix tracepoint name memory leakIan Rogers1-0/+1
Fuzzing found that an invalid tracepoint name would create a memory leak with an address sanitizer build: ``` $ perf stat -e '*:o/' true event syntax error: '*:o/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ================================================================= ==59380==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4 byte(s) in 2 object(s) allocated from: #0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x55f2f41be73b in str util/parse-events.l:49 #2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338 #3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464 #4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822 #5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094 #6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279 #7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251 #8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351 #9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539 #10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654 #11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501 #12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322 #13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375 #14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419 #15 0x55f2f409412b in main tools/perf/perf.c:535 SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s). ``` Fix by adding the missing destructor. Fixes: 865582c3f48e ("perf tools: Adds the tracepoint name parsing support") Signed-off-by: Ian Rogers <[email protected]> Cc: He Kuang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf kwork top: Simplify bool conversionYang Li1-1/+1
./tools/perf/util/bpf_kwork_top.c:120:53-58: WARNING: conversion to bool not needed here Signed-off-by: Yang Li <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf pmu: Ensure all alias variables are initializedIan Rogers1-1/+1
Fix an error detected by memory sanitizer: ``` ==4033==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6 #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2 #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9 #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32 #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6 #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8 #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8 #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9 #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8 #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15 #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9 #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8 #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5 #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9 #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11 #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8 #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2 #17 0x55fb0f941dab in main tools/perf/perf.c:535:3 ``` Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf trace: Avoid compile error wrt redefining boolIan Rogers1-0/+2
Make part of an existing TODO conditional to avoid the following build error: ``` tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:14: error: cannot combine with previous 'char' declaration specifier 26 | typedef char bool; | ^ include/stdbool.h:20:14: note: expanded from macro 'bool' 20 | #define bool _Bool | ^ tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:1: error: typedef requires a name [-Werror,-Wmissing-declarations] 26 | typedef char bool; | ^~~~~~~~~~~~~~~~~ 2 errors generated. ``` Signed-off-by: Ian Rogers <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf bpf-prologue: Remove unused fileIan Rogers1-508/+0
Commit 3d6dfae88917 ("perf parse-events: Remove BPF event support") removed building bpf-prologue.c but failed to remove the actual file. Fixes: 3d6dfae88917 ("perf parse-events: Remove BPF event support") Signed-off-by: Ian Rogers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmu: Remove unused functionJames Clark2-6/+0
pmu_events_table__find() is no longer used so remove it and its Arm specific version. Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmus: Simplify perf_pmus__find_core_pmu()James Clark1-13/+1
Currently the while loop always either exits on the first iteration with a core PMU, or exits with NULL on heterogeneous systems or when not all CPUs are online. Both of the latter behaviors are undesirable for platforms other than Arm so simplify it to always return the first core PMU, or NULL if none exist. This behavior was depended on by the Arm version of pmu_metrics_table__find(), so the logic has been moved there instead. Signed-off-by: James Clark <[email protected]> Suggested-by: Ian Rogers <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmu: Move pmu__find_core_pmu() to pmus.cJames Clark4-20/+21
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file. list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated. Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time. Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf symbol: Avoid an undefined behavior warningIan Rogers1-2/+1
The node (nd) may be NULL and pointer arithmetic on NULL is undefined behavior. Move the computation of next below the NULL check on the node. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-12perf bpf-filter: Add YYDEBUGIan Rogers1-0/+4
YYDEBUG enables line numbers and other error helpers in the generated bpf-filter-bison.c. Conditionally enabled only for debug builds. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf pmu: Add YYDEBUGIan Rogers1-0/+4
YYDEBUG enables line numbers and other error helpers in the generated pmu-bison.c. Conditionally enabled only for debug builds. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf expr: Make YYDEBUG dependent on doing a debug buildIan Rogers1-0/+2
YYDEBUG enables line numbers and other error helpers in the generated expr-bison.c. These shouldn't be generated when debugging isn't enabled. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf parse-events: Make YYDEBUG dependent on doing a debug buildIan Rogers1-0/+2
YYDEBUG enables line numbers and other error helpers in the generated parse-events-bison.c. These shouldn't be generated when debugging isn't enabled. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf parse-events: Remove unused header filesIan Rogers1-3/+0
The fnmatch header is now used in the PMU matching logic in pmu.c. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf util: Add a function for replacing characters in a stringJames Clark2-0/+49
It finds all occurrences of a single character and replaces them with a multi character string. This will be used in a test in a following commit. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf list pfm: Retry supported test with exclude_kernelIan Rogers1-1/+14
With paranoia set at 2 evsel__open will fail with EACCES for non-root users. To avoid this stopping libpfm4 events from being printed, retry with exclude_kernel enabled - copying the regular is_event_supported test. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf list: Avoid a hardcoded cpu PMU nameIan Rogers1-11/+17
Use the first core PMU instead. On a Raspberry Pi, before: $ perf list ... cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor] [(see 'man perf-list' on how to encode it)] ... After: $ perf list ... armv8_cortex_a72/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor] [(see 'man perf-list' on how to encode it)] ... ``` Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Add -G/--cgroup-filter optionNamhyung Kim5-2/+35
The -G/--cgroup-filter is to limit lock contention collection on the tasks in the specific cgroups only. $ sudo ./perf lock con -abt -G /user.slice/.../vte-spawn-52221fb8-b33f-4a52-b5c3-e35d1e6fc0e0.scope \ ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.174 [sec] contended total wait max wait avg wait pid comm 4 114.45 us 60.06 us 28.61 us 214847 sched-messaging 2 111.40 us 60.84 us 55.70 us 214848 sched-messaging 2 106.09 us 59.42 us 53.04 us 214837 sched-messaging 1 81.70 us 81.70 us 81.70 us 214709 sched-messaging 68 78.44 us 6.83 us 1.15 us 214633 sched-messaging 69 73.71 us 2.69 us 1.07 us 214632 sched-messaging 4 72.62 us 60.83 us 18.15 us 214850 sched-messaging 2 71.75 us 67.60 us 35.88 us 214840 sched-messaging 2 69.29 us 67.53 us 34.65 us 214804 sched-messaging 2 69.00 us 68.23 us 34.50 us 214826 sched-messaging ... Export cgroup__new() function as it's needed from outside. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Add --lock-cgroup optionNamhyung Kim4-9/+42
The --lock-cgroup option shows lock contention stats break down by cgroups. Add LOCK_AGGR_CGROUP mode and use it instead of use_cgroup field. $ sudo ./perf lock con -ab --lock-cgroup sleep 1 contended total wait max wait avg wait cgroup 8 15.70 us 6.34 us 1.96 us / 2 1.48 us 747 ns 738 ns /user.slice/.../app.slice/app-gnome-google\x2dchrome-6442.scope 1 848 ns 848 ns 848 ns /user.slice/.../session.slice/[email protected] 1 220 ns 220 ns 220 ns /user.slice/.../session.slice/pipewire-pulse.service For now, the cgroup mode only works with BPF (-b). Committer notes: Remove -g as it is used in the other tools with a clear meaning of collect/show callchains. As agreed with Namhyung off list. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Prepare to handle cgroupsNamhyung Kim2-3/+32
Save cgroup info and display cgroup names if requested. This is a preparation for the next patch. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf tools: Add read_all_cgroups() and __cgroup_find()Namhyung Kim2-8/+57
The read_all_cgroups() is to build a tree of cgroups in the system and users can look up a cgroup using __cgroup_find(). Committer notes: Had to do this to cover that #else block: -static inline u64 __read_cgroup_id(const char *path) { return -1ULL; } +static inline u64 __read_cgroup_id(const char *path __maybe_unused) { return -1ULL; } Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Add BPF-based statistics on softirq event supportYang Jihong2-0/+83
Use BPF to collect statistics on softirq events based on perf BPF skeletons. Example usage: # perf kwork top -b Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 135445.704 ms, 8 cpus %Cpu(s): 28.35% id, 0.00% hi, 0.25% si %Cpu0 [|||||||||||||||||||| 69.85%] %Cpu1 [|||||||||||||||||||||| 74.10%] %Cpu2 [||||||||||||||||||||| 71.18%] %Cpu3 [|||||||||||||||||||| 69.61%] %Cpu4 [|||||||||||||||||||||| 74.05%] %Cpu5 [|||||||||||||||||||| 69.33%] %Cpu6 [|||||||||||||||||||| 69.71%] %Cpu7 [|||||||||||||||||||||| 73.77%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 30.43 5271.005 ms [swapper/5] 0 0 30.17 5226.644 ms [swapper/3] 0 0 30.08 5210.257 ms [swapper/6] 0 0 29.89 5177.177 ms [swapper/0] 0 0 28.51 4938.672 ms [swapper/2] 0 0 25.93 4223.464 ms [swapper/7] 0 0 25.69 4181.411 ms [swapper/4] 0 0 25.63 4173.804 ms [swapper/1] 16665 16265 2.16 360.600 ms sched-messaging 16537 16265 2.05 356.275 ms sched-messaging 16503 16265 2.01 343.063 ms sched-messaging 16424 16265 1.97 336.876 ms sched-messaging 16580 16265 1.94 323.658 ms sched-messaging 16515 16265 1.92 321.616 ms sched-messaging 16659 16265 1.91 325.538 ms sched-messaging 16634 16265 1.88 327.766 ms sched-messaging 16454 16265 1.87 326.843 ms sched-messaging 16382 16265 1.87 322.591 ms sched-messaging 16642 16265 1.86 320.506 ms sched-messaging 16582 16265 1.86 320.164 ms sched-messaging 16315 16265 1.86 326.872 ms sched-messaging 16637 16265 1.85 323.766 ms sched-messaging 16506 16265 1.82 311.688 ms sched-messaging 16512 16265 1.81 304.643 ms sched-messaging 16560 16265 1.80 314.751 ms sched-messaging 16320 16265 1.80 313.405 ms sched-messaging 16442 16265 1.80 314.403 ms sched-messaging 16626 16265 1.78 295.380 ms sched-messaging 16600 16265 1.77 309.444 ms sched-messaging 16550 16265 1.76 301.161 ms sched-messaging 16525 16265 1.75 296.560 ms sched-messaging 16314 16265 1.75 298.338 ms sched-messaging 16595 16265 1.74 304.390 ms sched-messaging 16555 16265 1.74 287.564 ms sched-messaging 16520 16265 1.74 295.734 ms sched-messaging 16507 16265 1.73 293.956 ms sched-messaging 16593 16265 1.72 296.443 ms sched-messaging 16531 16265 1.72 299.950 ms sched-messaging 16281 16265 1.72 301.339 ms sched-messaging <SNIP> Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Add BPF-based statistics on hardirq event supportYang Jihong2-0/+90
Use BPF to collect statistics on hardirq events based on perf BPF skeletons. Example usage: # perf kwork top -k sched,irq -b Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 136717.945 ms, 8 cpus %Cpu(s): 17.10% id, 0.01% hi, 0.00% si %Cpu0 [||||||||||||||||||||||||| 84.26%] %Cpu1 [||||||||||||||||||||||||| 84.77%] %Cpu2 [|||||||||||||||||||||||| 83.22%] %Cpu3 [|||||||||||||||||||||||| 80.37%] %Cpu4 [|||||||||||||||||||||||| 81.49%] %Cpu5 [||||||||||||||||||||||||| 84.68%] %Cpu6 [||||||||||||||||||||||||| 84.48%] %Cpu7 [|||||||||||||||||||||||| 80.21%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 19.78 3482.833 ms [swapper/7] 0 0 19.62 3454.219 ms [swapper/3] 0 0 18.50 3258.339 ms [swapper/4] 0 0 16.76 2842.749 ms [swapper/2] 0 0 15.71 2627.905 ms [swapper/0] 0 0 15.51 2598.206 ms [swapper/6] 0 0 15.31 2561.820 ms [swapper/5] 0 0 15.22 2548.708 ms [swapper/1] 13253 13018 2.95 513.108 ms sched-messaging 13092 13018 2.67 454.167 ms sched-messaging 13401 13018 2.66 454.790 ms sched-messaging 13240 13018 2.64 454.587 ms sched-messaging 13251 13018 2.61 442.273 ms sched-messaging 13075 13018 2.61 438.932 ms sched-messaging 13220 13018 2.60 443.245 ms sched-messaging 13235 13018 2.59 443.268 ms sched-messaging 13222 13018 2.50 426.344 ms sched-messaging 13410 13018 2.49 426.191 ms sched-messaging 13228 13018 2.46 425.121 ms sched-messaging 13379 13018 2.38 409.950 ms sched-messaging 13236 13018 2.37 413.159 ms sched-messaging 13095 13018 2.36 396.572 ms sched-messaging 13325 13018 2.35 408.089 ms sched-messaging 13242 13018 2.32 394.750 ms sched-messaging 13386 13018 2.31 396.997 ms sched-messaging 13046 13018 2.29 383.833 ms sched-messaging 13109 13018 2.28 388.482 ms sched-messaging 13388 13018 2.28 393.576 ms sched-messaging 13238 13018 2.26 388.487 ms sched-messaging <SNIP> Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Implements BPF-based cpu usage statisticsYang Jihong4-0/+500
Use BPF to collect statistics on the CPU usage based on perf BPF skeletons. Example usage: # perf kwork top -h Usage: perf kwork top [<options>] -b, --use-bpf Use BPF to measure task cpu usage -C, --cpu <cpu> list of cpus to profile -i, --input <file> input file name -n, --name <name> event name to profile -s, --sort <key[,key2...]> sort by key(s): rate, runtime, tid --time <str> Time span for analysis (start,stop) # # perf kwork -k sched top -b Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 160702.425 ms, 8 cpus %Cpu(s): 36.00% id, 0.00% hi, 0.00% si %Cpu0 [|||||||||||||||||| 61.66%] %Cpu1 [|||||||||||||||||| 61.27%] %Cpu2 [||||||||||||||||||| 66.40%] %Cpu3 [|||||||||||||||||| 61.28%] %Cpu4 [|||||||||||||||||| 61.82%] %Cpu5 [||||||||||||||||||||||| 77.41%] %Cpu6 [|||||||||||||||||| 61.73%] %Cpu7 [|||||||||||||||||| 63.25%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 38.72 8089.463 ms [swapper/1] 0 0 38.71 8084.547 ms [swapper/3] 0 0 38.33 8007.532 ms [swapper/0] 0 0 38.26 7992.985 ms [swapper/6] 0 0 38.17 7971.865 ms [swapper/4] 0 0 36.74 7447.765 ms [swapper/7] 0 0 33.59 6486.942 ms [swapper/2] 0 0 22.58 3771.268 ms [swapper/5] 9545 9351 2.48 447.136 ms sched-messaging 9574 9351 2.09 418.583 ms sched-messaging 9724 9351 2.05 372.407 ms sched-messaging 9531 9351 2.01 368.804 ms sched-messaging 9512 9351 2.00 362.250 ms sched-messaging 9514 9351 1.95 357.767 ms sched-messaging 9538 9351 1.86 384.476 ms sched-messaging 9712 9351 1.84 386.490 ms sched-messaging 9723 9351 1.83 380.021 ms sched-messaging 9722 9351 1.82 382.738 ms sched-messaging 9517 9351 1.81 354.794 ms sched-messaging 9559 9351 1.79 344.305 ms sched-messaging 9725 9351 1.77 365.315 ms sched-messaging <SNIP> # perf kwork -k sched top -b -n perf Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 151563.332 ms, 8 cpus %Cpu(s): 26.49% id, 0.00% hi, 0.00% si %Cpu0 [ 0.01%] %Cpu1 [ 0.00%] %Cpu2 [ 0.00%] %Cpu3 [ 0.00%] %Cpu4 [ 0.00%] %Cpu5 [ 0.00%] %Cpu6 [ 0.00%] %Cpu7 [ 0.00%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 9754 9754 0.01 2.303 ms perf # # perf kwork -k sched top -b -C 2,3,4 Starting trace, Hit <Ctrl+C> to stop and report ^C Total : 48016.721 ms, 3 cpus %Cpu(s): 27.82% id, 0.00% hi, 0.00% si %Cpu2 [|||||||||||||||||||||| 74.68%] %Cpu3 [||||||||||||||||||||| 71.06%] %Cpu4 [||||||||||||||||||||| 70.91%] PID SPID %CPU RUNTIME COMMMAND ------------------------------------------------------------- 0 0 29.08 4734.998 ms [swapper/4] 0 0 28.93 4710.029 ms [swapper/3] 0 0 25.31 3912.363 ms [swapper/2] 10248 10158 1.62 264.931 ms sched-messaging 10253 10158 1.62 265.136 ms sched-messaging 10158 10158 1.60 263.013 ms bash 10360 10158 1.49 243.639 ms sched-messaging 10413 10158 1.48 238.604 ms sched-messaging 10531 10158 1.47 234.067 ms sched-messaging 10400 10158 1.47 240.631 ms sched-messaging 10355 10158 1.47 230.586 ms sched-messaging 10377 10158 1.43 234.835 ms sched-messaging 10526 10158 1.42 232.045 ms sched-messaging 10298 10158 1.41 222.396 ms sched-messaging 10410 10158 1.38 221.853 ms sched-messaging 10364 10158 1.38 226.042 ms sched-messaging 10480 10158 1.36 213.633 ms sched-messaging 10370 10158 1.36 223.620 ms sched-messaging 10553 10158 1.34 217.169 ms sched-messaging 10291 10158 1.34 211.516 ms sched-messaging 10251 10158 1.34 218.813 ms sched-messaging 10522 10158 1.33 218.498 ms sched-messaging 10288 10158 1.33 216.787 ms sched-messaging <SNIP> Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Add statistics on softirq event supportYang Jihong1-0/+1
Calculate the runtime of the softirq events and subtract it from the corresponding task runtime to improve the precision. Example usage: # perf kwork -k sched,irq,softirq record -- perf record -e cpu-clock -o perf_record.data -a sleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.467 MB perf_record.data (7154 samples) ] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 2.152 MB perf.data (22846 samples) ] # perf kwork top Total : 136601.588 ms, 8 cpus %Cpu(s): 95.66% id, 0.04% hi, 0.05% si %Cpu0 [ 0.02%] %Cpu1 [ 0.01%] %Cpu2 [| 4.61%] %Cpu3 [ 0.04%] %Cpu4 [ 0.01%] %Cpu5 [||||| 17.31%] %Cpu6 [ 0.51%] %Cpu7 [||| 11.42%] PID %CPU RUNTIME COMMMAND ---------------------------------------------------- 0 99.98 17073.515 ms swapper/4 0 99.98 17072.173 ms swapper/1 0 99.93 17064.229 ms swapper/3 0 99.62 17011.013 ms swapper/0 0 99.47 16985.180 ms swapper/6 0 95.17 16250.874 ms swapper/2 0 88.51 15111.684 ms swapper/7 0 82.62 14108.577 ms swapper/5 4342 33.00 5644.045 ms perf 4344 0.43 74.351 ms perf 16 0.13 22.296 ms rcu_preempt 4345 0.05 10.093 ms perf 4343 0.05 8.769 ms perf 4341 0.02 4.882 ms perf 4095 0.02 4.605 ms kworker/7:1 75 0.02 4.261 ms kworker/2:1 120 0.01 1.909 ms systemd-journal 98 0.01 2.540 ms jbd2/sda-8 61 0.01 3.404 ms kcompactd0 667 0.01 2.542 ms kworker/u16:2 4340 0.00 1.052 ms kworker/7:2 97 0.00 0.489 ms kworker/7:1H 51 0.00 0.209 ms ksoftirqd/7 50 0.00 0.646 ms migration/7 76 0.00 0.753 ms kworker/6:1 45 0.00 0.572 ms migration/6 87 0.00 0.145 ms kworker/5:1H 73 0.00 0.596 ms kworker/5:1 41 0.00 0.041 ms ksoftirqd/5 40 0.00 0.718 ms migration/5 64 0.00 0.115 ms kworker/4:1 35 0.00 0.556 ms migration/4 353 0.00 2.600 ms sshd 74 0.00 0.205 ms kworker/3:1 33 0.00 1.576 ms kworker/3:0H 30 0.00 0.996 ms migration/3 26 0.00 1.665 ms ksoftirqd/2 25 0.00 0.662 ms migration/2 397 0.00 0.057 ms kworker/1:1 20 0.00 1.005 ms migration/1 2909 0.00 1.053 ms kworker/0:2 17 0.00 0.720 ms migration/0 15 0.00 0.039 ms ksoftirqd/0 Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Add statistics on hardirq event supportYang Jihong1-0/+1
Calculate the runtime of the hardirq events and subtract it from the corresponding task runtime to improve the precision. Example usage: # perf kwork -k sched,irq record -- perf record -o perf_record.data -a sleep 10 [ perf record: Woken up 2 times to write data ] [ perf record: Captured and wrote 1.054 MB perf_record.data (18019 samples) ] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.798 MB perf.data (16334 samples) ] # # perf kwork top Total : 139240.869 ms, 8 cpus %Cpu(s): 94.91% id, 0.05% hi %Cpu0 [ 0.05%] %Cpu1 [| 5.00%] %Cpu2 [ 0.43%] %Cpu3 [ 0.57%] %Cpu4 [ 1.19%] %Cpu5 [|||||| 20.46%] %Cpu6 [ 0.48%] %Cpu7 [||| 12.10%] PID %CPU RUNTIME COMMMAND ---------------------------------------------------- 0 99.54 17325.622 ms swapper/2 0 99.54 17327.527 ms swapper/0 0 99.51 17319.909 ms swapper/6 0 99.42 17304.934 ms swapper/3 0 98.80 17197.385 ms swapper/4 0 94.99 16534.991 ms swapper/1 0 87.89 15295.264 ms swapper/7 0 79.53 13843.182 ms swapper/5 4252 36.50 6361.768 ms perf 4256 1.17 205.215 ms bash 151 0.53 93.298 ms systemd-resolve 4254 0.39 69.468 ms perf 423 0.34 59.368 ms bash 412 0.29 51.204 ms sshd 249 0.20 35.288 ms sd-resolve 16 0.17 30.287 ms rcu_preempt 153 0.09 17.266 ms systemd-timesyn 1 0.09 17.078 ms systemd 4253 0.07 12.457 ms perf 4255 0.06 11.559 ms perf 4234 0.03 6.105 ms kworker/u16:1 69 0.03 6.259 ms kworker/1:1H 4251 0.02 4.615 ms perf 4095 0.02 4.890 ms kworker/7:1 61 0.02 4.005 ms kcompactd0 75 0.02 3.546 ms kworker/2:1 97 0.01 3.106 ms kworker/7:1H 98 0.01 1.995 ms jbd2/sda-8 4088 0.01 1.779 ms kworker/u16:3 2909 0.01 1.795 ms kworker/0:2 4246 0.00 1.117 ms kworker/7:2 51 0.00 0.327 ms ksoftirqd/7 50 0.00 0.369 ms migration/7 102 0.00 0.160 ms kworker/6:1H 76 0.00 0.609 ms kworker/6:1 45 0.00 0.779 ms migration/6 87 0.00 0.504 ms kworker/5:1H 73 0.00 1.130 ms kworker/5:1 41 0.00 0.152 ms ksoftirqd/5 40 0.00 0.702 ms migration/5 64 0.00 0.316 ms kworker/4:1 35 0.00 0.791 ms migration/4 353 0.00 2.211 ms sshd 74 0.00 0.272 ms kworker/3:1 30 0.00 0.819 ms migration/3 25 0.00 0.784 ms migration/2 397 0.00 0.539 ms kworker/1:1 21 0.00 1.600 ms ksoftirqd/1 20 0.00 0.773 ms migration/1 17 0.00 1.682 ms migration/0 15 0.00 0.076 ms ksoftirqd/0 Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf evsel: Add evsel__intval_common() helperYang Jihong2-0/+15
Add evsel__intval_common() helper to search for common_field in tracepoint format. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork top: Introduce new top utilityYang Jihong1-0/+22
Some common tools for collecting statistics on CPU usage, such as top, obtain statistics from timer interrupt sampling, and then periodically read statistics from /proc/stat. This method has some deviations: 1. In the tick interrupt, the time between the last tick and the current tick is counted in the current task. However, the task may be running only part of the time. 2. For each task, the top tool periodically reads the /proc/{PID}/status information. For tasks with a short life cycle, it may be missed. In conclusion, the top tool cannot accurately collect statistics on the CPU usage and running time of tasks. The statistical method based on sched_switch tracepoint can accurately calculate the CPU usage of all tasks. This method is applicable to scenarios where performance comparison data is of high precision. Example usage: # perf kwork Usage: perf kwork [<options>] {record|report|latency|timehist|top} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, sched, etc) -v, --verbose be more verbose (show symbol address, etc) # perf kwork -k sched record -- perf bench sched messaging -g 1 -l 10000 # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 1 groups == 40 processes run Total time: 14.074 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 15.886 MB perf.data (129472 samples) ] # perf kwork top Total : 115708.178 ms, 8 cpus %Cpu(s): 9.78% id %Cpu0 [||||||||||||||||||||||||||| 90.55%] %Cpu1 [||||||||||||||||||||||||||| 90.51%] %Cpu2 [|||||||||||||||||||||||||| 88.57%] %Cpu3 [||||||||||||||||||||||||||| 91.18%] %Cpu4 [||||||||||||||||||||||||||| 91.09%] %Cpu5 [||||||||||||||||||||||||||| 90.88%] %Cpu6 [|||||||||||||||||||||||||| 88.64%] %Cpu7 [||||||||||||||||||||||||||| 90.28%] PID %CPU RUNTIME COMMMAND ---------------------------------------------------- 4113 22.23 3221.547 ms sched-messaging 4105 21.61 3131.495 ms sched-messaging 4119 21.53 3120.937 ms sched-messaging 4103 21.39 3101.614 ms sched-messaging 4106 21.37 3095.209 ms sched-messaging 4104 21.25 3077.269 ms sched-messaging 4115 21.21 3073.188 ms sched-messaging 4109 21.18 3069.022 ms sched-messaging 4111 20.78 3010.033 ms sched-messaging 4114 20.74 3007.073 ms sched-messaging 4108 20.73 3002.137 ms sched-messaging 4107 20.47 2967.292 ms sched-messaging 4117 20.39 2955.335 ms sched-messaging 4112 20.34 2947.080 ms sched-messaging 4118 20.32 2942.519 ms sched-messaging 4121 20.23 2929.865 ms sched-messaging 4110 20.22 2930.078 ms sched-messaging 4122 20.15 2919.542 ms sched-messaging 4120 19.77 2866.032 ms sched-messaging 4116 19.72 2857.660 ms sched-messaging 4127 16.19 2346.334 ms sched-messaging 4142 15.86 2297.600 ms sched-messaging 4141 15.62 2262.646 ms sched-messaging 4136 15.41 2231.408 ms sched-messaging 4130 15.38 2227.008 ms sched-messaging 4129 15.31 2217.692 ms sched-messaging 4126 15.21 2201.711 ms sched-messaging 4139 15.19 2200.722 ms sched-messaging 4137 15.10 2188.633 ms sched-messaging 4134 15.06 2182.082 ms sched-messaging 4132 15.02 2177.530 ms sched-messaging 4131 14.73 2131.973 ms sched-messaging 4125 14.68 2125.439 ms sched-messaging 4128 14.66 2122.255 ms sched-messaging 4123 14.65 2122.113 ms sched-messaging 4135 14.56 2107.144 ms sched-messaging 4133 14.51 2103.549 ms sched-messaging 4124 14.27 2066.671 ms sched-messaging 4140 14.17 2052.251 ms sched-messaging 4138 13.81 2000.361 ms sched-messaging 0 11.42 1652.009 ms swapper/2 0 11.35 1641.694 ms swapper/6 0 9.71 1405.108 ms swapper/7 0 9.48 1372.338 ms swapper/1 0 9.44 1366.013 ms swapper/0 0 9.11 1318.382 ms swapper/5 0 8.90 1287.582 ms swapper/4 0 8.81 1274.356 ms swapper/3 4100 2.61 379.328 ms perf 4101 1.16 169.487 ms perf-exec 151 0.65 94.741 ms systemd-resolve 249 0.36 53.030 ms sd-resolve 153 0.14 21.405 ms systemd-timesyn 1 0.10 16.200 ms systemd 16 0.09 15.785 ms rcu_preempt 4102 0.06 9.727 ms perf 4095 0.03 5.464 ms kworker/7:1 98 0.02 3.231 ms jbd2/sda-8 353 0.02 4.115 ms sshd 75 0.02 3.889 ms kworker/2:1 73 0.01 1.552 ms kworker/5:1 64 0.01 1.591 ms kworker/4:1 74 0.01 1.952 ms kworker/3:1 61 0.01 2.608 ms kcompactd0 397 0.01 1.602 ms kworker/1:1 69 0.01 1.817 ms kworker/1:1H 10 0.01 2.553 ms kworker/u16:0 2909 0.01 2.684 ms kworker/0:2 1211 0.00 0.426 ms kworker/7:0 97 0.00 0.153 ms kworker/7:1H 51 0.00 0.100 ms ksoftirqd/7 120 0.00 0.856 ms systemd-journal 76 0.00 1.414 ms kworker/6:1 46 0.00 0.246 ms ksoftirqd/6 45 0.00 0.164 ms migration/6 41 0.00 0.098 ms ksoftirqd/5 40 0.00 0.207 ms migration/5 86 0.00 1.339 ms kworker/4:1H 36 0.00 0.252 ms ksoftirqd/4 35 0.00 0.090 ms migration/4 31 0.00 0.156 ms ksoftirqd/3 30 0.00 0.073 ms migration/3 26 0.00 0.180 ms ksoftirqd/2 25 0.00 0.085 ms migration/2 21 0.00 0.106 ms ksoftirqd/1 20 0.00 0.118 ms migration/1 302 0.00 1.440 ms systemd-logind 17 0.00 0.132 ms migration/0 15 0.00 0.255 ms ksoftirqd/0 Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork: Add sched record supportYang Jihong1-0/+5
The kwork_class type of sched is added to support recording and parsing of sched_switch events. As follows: # perf kwork -h Usage: perf kwork [<options>] {record|report|latency|timehist} -D, --dump-raw-trace dump raw trace in ASCII -f, --force don't complain, do it -k, --kwork <kwork> list of kwork to profile (irq, softirq, workqueue, sched, etc) -v, --verbose be more verbose (show symbol address, etc) # perf kwork -k sched record true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.083 MB perf.data (47 samples) ] # perf evlist sched:sched_switch dummy:HG # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf kwork: Add `kwork` and `src_type` to work_init() for 'struct kwork_class'Yang Jihong1-2/+4
To support different types of reports, two parameters `struct perf_kwork * kwork` and `enum kwork_trace_type src_type` are added to work_init() of struct kwork_class for initialization in different scenarios. No functional change intended. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf evlist: Add evlist__findnew_tracking_event() helperYang Jihong2-0/+19
Currently, intel-bts, intel-pt, and arm-spe may add tracking event to the evlist. We may need to search for the tracking event for some settings. Therefore, add evlist__findnew_tracking_event() helper. If system_wide is true, evlist__findnew_tracking_event() set the cpu map of the evsel to all online CPUs. Signed-off-by: Yang Jihong <[email protected]> Acked-by: Adrian Hunter <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf tools: Update copy of libbpf's hashmap.cArnaldo Carvalho de Melo1-10/+0
To pick the changes in: a3e7e6b17946f48b ("libbpf: Remove HASHMAP_INIT static initialization helper") That don't entail any changes in tools/perf. This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h Not a kernel ABI, its just that this uses the mechanism in place for checking kernel ABI files drift. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf parse-events: Introduce 'struct parse_events_terms'Ian Rogers5-120/+134
parse_events_terms() existed in function names but was passed a 'struct list_head'. As many parse_events functions take an evsel_config list as well as a parse_event_term list, and the naming head_terms and head_config is inconsistent, there's a potential to switch the lists and get errors. Introduce a 'struct parse_events_terms', that just wraps a list_head, to avoid this. Add the regular init/exit functions and transition the code to use them. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf parse-events: Copy fewer term listsIan Rogers3-67/+65
When trying to add events to multiple PMUs the term list is copied first as adding the event will rewrite the event's name term into the sysfs and/or json encoding terms (see perf_pmu__check_alias). Change the parse events add API so the passed in term list is const, then copy the list when modification is necessary. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf parse-events: Avoid enum castsIan Rogers2-15/+12
Add term_type to union of values returned by the lexer to avoid casts to and from an integer. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf parse-events: Tidy up str parameterIan Rogers2-7/+8
Add a const and rename str to event_name. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf parse-events: Remove unnecessary __maybe_unusedIan Rogers1-4/+2
The parameter head_terms is always used in get_config_terms. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-11perf machine: Use true and false for bool variableJiapeng Chong1-3/+1
Fix the following coccicheck warnings: ./tools/perf/util/machine.c:2000:9-10: WARNING: return of 0/1 in function 'symbol__match_regex' with return type bool. Committer notes: Found this in the pile, it was already returning bool, but this patch simplifies it further, from 3 lines to just 1. Reported-by: Abaci Robot <[email protected]> Signed-off-by: Jiapeng Chong <[email protected]> Suggested-by: David Laight <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/1614247483-102665-1-git-send-email-jiapeng.chong@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-09Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of ↵Linus Torvalds77-5647/+3126
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Arnaldo Carvalho de Melo: "perf tools maintainership: - Add git information for perf-tools and perf-tools-next trees and branches to the MAINTAINERS file. That is where development now takes place and myself and Namhyung Kim have write access, more people to come as we emulate other maintainer groups. perf record: - Record kernel data maps when 'perf record --data' is used, so that global variables can be resolved and used in tools that do data profiling. perf trace: - Remove the old, experimental support for BPF events in which a .c file was passed as an event: "perf trace -e hello.c" to then get compiled and loaded. The only known usage for that, that shipped with the kernel as an example for such events, augmented the raw_syscalls tracepoints and was converted to a libbpf skeleton, reusing all the user space components and the BPF code connected to the syscalls. In the end just the way to glue the BPF part and the user space type beautifiers changed, now being performed by libbpf skeletons. The next step is to use BTF to do pretty printing of all syscall types, as discussed with Alan Maguire and others. Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all path/filenames/strings, some of the networking data structures, perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5 seconds: # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5 0.000 ( 9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 9.039 ( 0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 ? ( ): gpm/991 ... [continued]: clock_nanosleep()) = 0 10.133 ( ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ... ? ( ): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 30.276 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 30.276 (2000.394 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 1230.814 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 1230.814 (1000.404 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 2030.886 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0 ? ( ): crond/1172 ... [continued]: clock_nanosleep()) = 0 3242.699 ( ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ... 2030.886 (2000.385 ms): gpm/991 ... [continued]: clock_nanosleep()) = 0 3728.078 ( ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ... 3242.699 (1000.158 ms): pool-gsd-smart/3051 ... [continued]: clock_nanosleep()) = 0 4031.409 ( ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ... 10.133 (5000.375 ms): sleep/327642 ... [continued]: clock_nanosleep()) = 0 Performance counter stats for 'sleep 5': 2,617,347 cycles 1,855,997 instructions # 0.71 insn per cycle 5.002282128 seconds time elapsed 0.000855000 seconds user 0.000852000 seconds sys perf annotate: - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1) for licensing reasons, and we missed a build test on tools/perf/tests makefile. Since we now default to NDEBUG=1, we ended up segfaulting when building with BUILD_NONDISTRO=1 because a needed initialization routine was being "error checked" via an assert. Fix it by explicitly checking the result and aborting instead if it fails. We better back propagate the error, but at least 'perf annotate' on samples collected for a BPF program is back working when perf is built with BUILD_NONDISTRO=1. perf report/top: - Add back TUI hierarchy mode header, that is seen when using 'perf report/top --hierarchy'. - Fix the number of entries for 'e' key in the TUI that was preventing navigation of lines when expanding an entry. perf report/script: - Support cross platform register handling, allowing a perf.data file collected on one architecture to have registers sampled correctly displayed when analysis tools such as 'perf report' and 'perf script' are used on a different architecture. - Fix handling of event attributes in pipe mode, i.e. when one uses: perf record -o - | perf report -i - When no perf.data files are used. - Handle files generated via pipe mode with a version of perf and then read also via pipe mode with a different version of perf, where the event attr record may have changed, use the record size field to properly support this version mismatch. perf probe: - Accessing global variables from uprobes isn't supported, make the error message state that instead of stating that some minimal kernel version is needed to have that feature. This seems just a tool limitation, the kernel probably has all that is needed. perf tests: - Fix a reference count related leak in the dlfilter v0 API where the result of a thread__find_symbol_fb() is not matched with an addr_location__exit() to drop the reference counts of the resolved components (machine, thread, map, symbol, etc). Add a dlfilter test to make sure that doesn't regresses. - Lots of fixes for the 'perf test' written in shell script related to problems found with the shellcheck utility. - Fixes for 'perf test' shell scripts testing features enabled when perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf counters. - Add perf record sample filtering test, things like the following example, that gets implemented as a BPF filter attached to the event: # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000' - Improve the way the task_analyzer test checks if libtraceevent is linked, using 'perf version --build-options' instead of the more expensinve 'perf record -e "sched:sched_switch"'. - Add support for riscv in the mmap-basic test. (This went as well via the RiscV tree, same contents). libperf: - Implement riscv mmap support (This went as well via the RiscV tree, same contents). perf script: - New tool that converts perf.data files to the firefox profiler format so that one can use the visualizer at https://profiler.firefox.com/. Done by Anup Sharma as part of this year's Google Summer of Code. One can generate the output and upload it to the web interface but Anup also automated everything: perf script gecko -F 99 -a sleep 60 - Support syscall name parsing on arm64. - Print "cgroup" field on the same line as "comm". perf bench: - Add new 'uprobe' benchmark to measure the overhead of uprobes with/without BPF programs attached to it. - breakpoints are not available on power9, skip that test. perf stat: - Add #num_cpus_online literal to be used in 'perf stat' metrics, and add this extra 'perf test' check that exemplifies its purpose: TEST_ASSERT_VAL("#num_cpus_online", expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0); TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0); TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online); Miscellaneous: - Improve tool startup time by lazily reading PMU, JSON, sysfs data. - Improve error reporting in the parsing of events, passing YYLTYPE to error routines, so that the output can show were the parsing error was found. - Add 'perf test' entries to check the parsing of events improvements. - Fix various leak for things detected by -fsanitize=address, mostly things that would be freed at tool exit, including: - Free evsel->filter on the destructor. - Allow tools to register a thread->priv destructor and use it in 'perf trace'. - Free evsel->priv in 'perf trace'. - Free string returned by synthesize_perf_probe_point() when the caller fails to do all it needs. - Adjust various compiler options to not consider errors some warnings when building with broken headers found in things like python, flex, bison, as we otherwise build with -Werror. Some for gcc, some for clang, some for some specific version of those, some for some specific version of flex or bison, or some specific combination of these components, bah. - Allow customization of clang options for BPF target, this helps building on gentoo where there are other oddities where BPF targets gets passed some compiler options intended for the native build, so building with WERROR=0 helps while these oddities are fixed. - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top' and 'perf lock', fixing some segfaults when handling some odd failures. - Add LTO build option. - Fix format of unordered lists in the perf docs (tools/perf/Documentation) - Overhaul the bison files, using constructs such as YYNOMEM. - Remove unused tokens from the bison .y files. - Add more comments to various structs. - A few LoongArch enablement patches. Vendor events (JSON): - Add JSON metrics for Yitian 710 DDR (aarch64). Things like: EventName, BriefDescription visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.", visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.", op_is_dqsosc_mpc , "A DQS Oscillator MPC command to DRAM.", op_is_dqsosc_mrr , "A DQS Oscillator MRR command to DRAM.", op_is_tcr_mrr , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.", - Add AmpereOne metrics (aarch64). - Update N2 and V2 metrics (aarch64) and events using Arm telemetry repo. - Update scale units and descriptions of common topdown metrics on aarch64. Things like: - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)", - "BriefDescription": "Frontend bound L1 topdown metric", + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))", + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.", - Update events for intel: meteorlake to 1.04, sapphirerapids to 1.15, Icelake+ metric constraints. - Update files for the power10 platform" * tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits) perf parse-events: Fix driver config term perf parse-events: Fixes relating to no_value terms perf parse-events: Fix propagation of term's no_value when cloning perf parse-events: Name the two term enums perf list: Don't print Unit for "default_core" perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake perf dlfilter: Avoid leak in v0 API test use of resolve_address() perf metric: Add #num_cpus_online literal perf pmu: Remove str from perf_pmu_alias perf parse-events: Make common term list to strbuf helper perf parse-events: Minor help message improvements perf pmu: Avoid uninitialized use of alias->str perf jevents: Use "default_core" for events with no Unit perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test perf test shell stat_bpf_counters: Fix test on Intel perf test shell record_bpf_filter: Skip 6.2 kernel libperf: Get rid of attr.id field perf tools: Convert to perf_record_header_attr_id() libperf: Add perf_record_header_attr_id() perf tools: Handle old data in PERF_RECORD_ATTR ...
2023-09-05perf parse-events: Fix driver config termIan Rogers1-0/+17
Inadvertently deleted in commit 30f4ade33d649aa0 ("perf tools: Revert enable indices setting syntax for BPF map"). Fixes: 30f4ade33d649aa0 ("perf tools: Revert enable indices setting syntax for BPF map") Reported-by: James Clark <[email protected]> Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-02perf parse-events: Fixes relating to no_value termsIan Rogers2-3/+3
A term may have no value in which case it is assumed to have a value of 1. It doesn't just apply to alias/event terms so change the parse_events_term__to_strbuf assert. Commit 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value") made it so that no_value terms could only be for a single bit. Prior to commit 64199ae4b8a3 ("perf parse-events: Fix propagation of term's no_value when cloning") this missed a test case where config1 had no_value. Fixes: 64199ae4b8a36038 ("perf parse-events: Fix propagation of term's no_value when cloning") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-31perf parse-events: Fix propagation of term's no_value when cloningIan Rogers3-21/+19
The no_value field in 'struct parse_events_term' indicates that the val variable isn't used, the case for an event name. Cloning wasn't propagating this, making cloned event name terms appearing to have a constant assinged to them. Working around the bug would check for a value of 1 assigned to value, but then this meant a user value of 1 couldn't be differentiated causing the value to be lost in debug printing and perf list. The change fixes the cloning and updates the "val.num ==/!= 1" tests to use no_value instead. To better check the no_value is set appropriately parameter comments are added for constant values. This found that no_value wasn't set correctly in parse_events_multi_pmu_add, which matters now that no_value is used to indicate an event name. Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper") Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value") Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-31perf parse-events: Name the two term enumsIan Rogers4-67/+187
Name the enums used by 'struct parse_events_term' to parse_events__term_val_type and parse_events__term_type. This allows greater compile time error checking. Fix -Wswitch related issues by explicitly listing all enum values prior to default. Add config_term_name to safely look up a parse_events__term_type name, bounds checking the array access first. Add documentation to 'struct parse_events_terms' and reorder to save space. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-30perf dlfilter: Avoid leak in v0 API test use of resolve_address()Adrian Hunter1-0/+2
The introduction of reference counting causes the v0 API perf_dlfilter_fns.resolve_address() to leak. v2 API introduced perf_dlfilter_fns.al_cleanup() to prevent that. For the v0 API, avoid the leak by exiting the addr_location immediately, since the documentation makes it clear that pointers obtained via perf_dlfilter_fns are not necessarily valid (dereferenceable) after 'filter_event' and 'filter_event_early' return. Reported-by: kernel test robot <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Closes: https://lore.kernel.org/oe-lkp/[email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>