aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2023-09-26perf kwork: Fix spelling mistake "Captuer" -> "Capture"Colin Ian King1-1/+1
There is a spelling mistake in a pr_debug message. Fix it. (I didn't see this one in the first spell check scan I ran). Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-26perf evlist: Avoid frequency mode for the dummy eventIan Rogers1-2/+3
Dummy events are created with an attribute where the period and freq are zero. evsel__config will then see the uninitialized values and initialize them in evsel__default_freq_period. As fequency mode is used by default the dummy event would be set to use frequency mode. However, this has no effect on the dummy event but does cause unnecessary timers/interrupts. Avoid this overhead by setting the period to 1 for dummy events. evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and period=1. This isn't necessary after this change and so the setting is removed. From Stephane: The dummy event is not counting anything. It is used to collect mmap records and avoid a race condition during the synthesize mmap phase of perf record. As such, it should not cause any overhead during active profiling. Yet, it did. Because of a bug the dummy event was programmed as a sampling event in frequency mode. Events in that mode incur more kernel overheads because on timer tick, the kernel has to look at the number of samples for each event and potentially adjust the sampling period to achieve the desired frequency. The dummy event was therefore adding a frequency event to task and ctx contexts we may otherwise not have any, e.g., perf record -a -e cpu/event=0x3c,period=10000000/. On each timer tick the perf_adjust_freq_unthr_context() is invoked and if ctx->nr_freq is non-zero, then the kernel will loop over ALL the events of the context looking for frequency mode ones. In doing, so it locks the context, and enable/disable the PMU of each hw event. If all the events of the context are in period mode, the kernel will have to traverse the list for nothing incurring overhead. The overhead is multiplied by a very large factor when this happens in a guest kernel. There is no need for the dummy event to be in frequency mode, it does not count anything and therefore should not cause extra overhead for no reason. Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor") Reported-by: Stephane Eranian <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-26perf vendors events: Remove repeated word in commentsCharles Han3-3/+3
Remove the repeated word "of" in comments. Signed-off-by: Charles Han <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-26perf vendor events arm64: Fix for AmpereOne metricsIlkka Koskinen1-198/+220
This patch addresses review comments that were given for 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics") but didn't make it to the original patch [1][2] Changes include: A fix for backend_memory formula, use of standard metrics when possible, using #slots, renaming metrics to avoid spaces in the names, and cleanup. [1] https://lore.kernel.org/linux-perf-users/[email protected]/ [2] https://lore.kernel.org/linux-perf-users/[email protected]/ Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics") Signed-off-by: Ilkka Koskinen <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Dave Kleikamp <[email protected]> Cc: John Garry <[email protected]> Cc: D Scott Phillips <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-21perf test lock_contention.sh: Skip test if not enough CPUsVeronika Molnarova1-0/+6
Machines with less then 4 CPUs weren't consistently triggering lock events required for the test. Skip the test on those machines. The limit of 4 CPUs is set as it generates around 100 lock events for a test. Signed-off-by: Veronika Molnarova <[email protected]> Acked-by: Michael Petlan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-21perf test stat+shadow_stat.sh: Add threshold for rounding errorsVeronika Molnarova1-6/+24
The test was failing in specific scenarios due to imperfection of FP arithmetics. The `bc` command wasn't correctly rounding the result of division causing the failure. Replace the `bc` with `awk` which should work with more decimal places and add a threshold to catch any possible rounding errors. The acceptable rounding error is set to 0.01 when the test passes with a warning message. Signed-off-by: Veronika Molnarova <[email protected]> Acked-by: Michael Petlan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-19perf jevents: fix no member named 'entries' issueXu Yang1-2/+2
The struct "pmu_events_table" has been changed after commit 2e255b4f9f41 (perf jevents: Group events by PMU, 2023-08-23). So there doesn't exist 'entries' in pmu_events_table anymore. This will align the members with that commit. Othewise, below errors will be printed when run jevent.py: pmu-events/pmu-events.c:5485:26: error: ‘struct pmu_metrics_table’ has no member named ‘entries’ 5485 | .entries = pmu_metrics__freescale_imx8dxl_sys, Signed-off-by: Xu Yang <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf parse-events: Fix tracepoint name memory leakIan Rogers1-0/+1
Fuzzing found that an invalid tracepoint name would create a memory leak with an address sanitizer build: ``` $ perf stat -e '*:o/' true event syntax error: '*:o/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ================================================================= ==59380==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4 byte(s) in 2 object(s) allocated from: #0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x55f2f41be73b in str util/parse-events.l:49 #2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338 #3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464 #4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822 #5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094 #6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279 #7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251 #8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351 #9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539 #10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654 #11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501 #12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322 #13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375 #14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419 #15 0x55f2f409412b in main tools/perf/perf.c:535 SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s). ``` Fix by adding the missing destructor. Fixes: 865582c3f48e ("perf tools: Adds the tracepoint name parsing support") Signed-off-by: Ian Rogers <[email protected]> Cc: He Kuang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf test: Detect off-cpu support from build optionsIan Rogers1-1/+1
Use perf version to detect whether BPF skeletons were enabled in a build rather than a failing perf record. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Patrice Duroux <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf test: Ensure EXTRA_TESTS is covered in build testIan Rogers1-0/+1
Add to run variable. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Patrice Duroux <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf test: Update build test for changed BPF skeleton defaultsIan Rogers1-3/+3
Fix a target name and set BUILD_BPF_SKEL to 0 rather than 1. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Patrice Duroux <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf build: Default BUILD_BPF_SKEL, warn/disable for missing depsIan Rogers2-33/+53
LIBBPF is dependent on zlib so move the NO_ZLIB and feature check early to avoid statically building when zlib is disabled. This avoids a linkage failure with perf and static libbpf when zlib isn't specified. Move BUILD_BPF_SKEL logic to one place and if not defined set BUILD_BPF_SKEL to 1. Detect dependencies of building with BPF skeletons and warn/disable if the dependencies aren't present. Change Makefile.perf to contain BPF skeleton logic dependent on the Makefile.config result and refresh the comment about BUILD_BPF_SKEL. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Patrice Duroux <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf version: Add status of bpf skeletonsIan Rogers1-0/+1
Add status for BPF skeletons, to see if a build has them enabled: ``` $ perf version --build-options perf version 6.6.rc1.g0381ae36d1a6 dwarf: [ OFF ] # HAVE_DWARF_SUPPORT dwarf_getlocations: [ OFF ] # HAVE_DWARF_GETLOCATIONS_SUPPORT syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT libbfd: [ OFF ] # HAVE_LIBBFD_SUPPORT debuginfod: [ OFF ] # HAVE_DEBUGINFOD_SUPPORT libelf: [ OFF ] # HAVE_LIBELF_SUPPORT libnuma: [ OFF ] # HAVE_LIBNUMA_SUPPORT numa_num_possible_cpus: [ OFF ] # HAVE_LIBNUMA_SUPPORT libperl: [ on ] # HAVE_LIBPERL_SUPPORT libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT libslang: [ on ] # HAVE_SLANG_SUPPORT libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT libdw-dwarf-unwind: [ OFF ] # HAVE_DWARF_SUPPORT zlib: [ on ] # HAVE_ZLIB_SUPPORT lzma: [ on ] # HAVE_LZMA_SUPPORT get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT bpf: [ OFF ] # HAVE_LIBBPF_SUPPORT aio: [ on ] # HAVE_AIO_SUPPORT zstd: [ on ] # HAVE_ZSTD_SUPPORT libpfm4: [ on ] # HAVE_LIBPFM libtraceevent: [ on ] # HAVE_LIBTRACEEVENT bpf_skeletons: [ OFF ] # HAVE_BPF_SKEL ``` Signed-off-by: Ian Rogers <[email protected]> Tested-by: Namhyung Kim <[email protected]> Cc: James Clark <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Patrice Duroux <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-18perf kwork top: Simplify bool conversionYang Li1-1/+1
./tools/perf/util/bpf_kwork_top.c:120:53-58: WARNING: conversion to bool not needed here Signed-off-by: Yang Li <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf jevent: fix core dump on software events on s390Thomas Richter1-1/+1
Running commands such as # ./perf stat -e cs -- true Segmentation fault (core dumped) # ./perf stat -e cpu-clock-- true Segmentation fault (core dumped) # dump core. This should not happen as these events are defined even when no hardware PMU is available. Debugging this reveals this call chain: perf_pmus__find_by_type(type=1) +--> pmu_read_sysfs(core_only=false) +--> perf_pmu__find2(dirfd=3, name=0x152a113 "software") +--> perf_pmu__lookup(pmus=0x14f0568 <other_pmus>, dirfd=3, lookup_name=0x152a113 "software") +--> perf_pmu__find_events_table (pmu=0x1532130) Now the pmu is "software" and it tries to find a proper table generated by the pmu-event generation process for s390: # cd pmu-events/ # ./jevents.py s390 all /root/linux/tools/perf/pmu-events/arch |\ grep -E '^const struct pmu_table_entry' const struct pmu_table_entry pmu_events__cf_z10[] = { const struct pmu_table_entry pmu_events__cf_z13[] = { const struct pmu_table_entry pmu_metrics__cf_z13[] = { const struct pmu_table_entry pmu_events__cf_z14[] = { const struct pmu_table_entry pmu_metrics__cf_z14[] = { const struct pmu_table_entry pmu_events__cf_z15[] = { const struct pmu_table_entry pmu_metrics__cf_z15[] = { const struct pmu_table_entry pmu_events__cf_z16[] = { const struct pmu_table_entry pmu_metrics__cf_z16[] = { const struct pmu_table_entry pmu_events__cf_z196[] = { const struct pmu_table_entry pmu_events__cf_zec12[] = { const struct pmu_table_entry pmu_metrics__cf_zec12[] = { const struct pmu_table_entry pmu_events__test_soc_cpu[] = { const struct pmu_table_entry pmu_metrics__test_soc_cpu[] = { const struct pmu_table_entry pmu_events__test_soc_sys[] = { # However event "software" is not listed, as can be seen in the generated const struct pmu_events_map pmu_events_map[]. So in function perf_pmu__find_events_table(), the variable table is initialized to NULL, but never set to a proper value. The function scans all generated &pmu_events_map[] tables, but no table matches, because the tables are s390 CPU Measurement unit specific: i = 0; for (;;) { const struct pmu_events_map *map = &pmu_events_map[i++]; if (!map->arch) break; --> the maps are there because the build generated them if (!strcmp_cpuid_str(map->cpuid, cpuid)) { table = &map->event_table; break; } --> Since no matching CPU string the table var remains 0x0 } free(cpuid); if (!pmu) return table; --> The pmu is "software" so it exists and no return --> and here perf dies because table is 0x0 for (i = 0; i < table->num_pmus; i++) { ... } return NULL; Fix this and do not access the table variable. Instead return 0x0 which is the same return code when the for-loop was not successful. Output after: # ./perf stat -e cs -- true Performance counter stats for 'true': 0 cs 0.000853105 seconds time elapsed 0.000061000 seconds user 0.000827000 seconds sys # ./perf stat -e cpu-clock -- true Performance counter stats for 'true': 0.25 msec cpu-clock # 0.341 CPUs utilized 0.000728383 seconds time elapsed 0.000055000 seconds user 0.000706000 seconds sys # ./perf stat -e cycles -- true Performance counter stats for 'true': <not supported> cycles 0.000767298 seconds time elapsed 0.000055000 seconds user 0.000739000 seconds sys # Fixes: 7c52f10c0d4d8 ("perf pmu: Cache JSON events table") Signed-off-by: Thomas Richter <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf pmu: Ensure all alias variables are initializedIan Rogers1-1/+1
Fix an error detected by memory sanitizer: ``` ==4033==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x55fb0fbedfc7 in read_alias_info tools/perf/util/pmu.c:457:6 #1 0x55fb0fbea339 in check_info_data tools/perf/util/pmu.c:1434:2 #2 0x55fb0fbea339 in perf_pmu__check_alias tools/perf/util/pmu.c:1504:9 #3 0x55fb0fbdca85 in parse_events_add_pmu tools/perf/util/parse-events.c:1429:32 #4 0x55fb0f965230 in parse_events_parse tools/perf/util/parse-events.y:299:6 #5 0x55fb0fbdf6b2 in parse_events__scanner tools/perf/util/parse-events.c:1822:8 #6 0x55fb0fbdf8c1 in __parse_events tools/perf/util/parse-events.c:2094:8 #7 0x55fb0fa8ffa9 in parse_events tools/perf/util/parse-events.h:41:9 #8 0x55fb0fa8ffa9 in test_event tools/perf/tests/parse-events.c:2393:8 #9 0x55fb0fa8f458 in test__pmu_events tools/perf/tests/parse-events.c:2551:15 #10 0x55fb0fa6d93f in run_test tools/perf/tests/builtin-test.c:242:9 #11 0x55fb0fa6d93f in test_and_print tools/perf/tests/builtin-test.c:271:8 #12 0x55fb0fa6d082 in __cmd_test tools/perf/tests/builtin-test.c:442:5 #13 0x55fb0fa6d082 in cmd_test tools/perf/tests/builtin-test.c:564:9 #14 0x55fb0f942720 in run_builtin tools/perf/perf.c:322:11 #15 0x55fb0f942486 in handle_internal_command tools/perf/perf.c:375:8 #16 0x55fb0f941dab in run_argv tools/perf/perf.c:419:2 #17 0x55fb0f941dab in main tools/perf/perf.c:535:3 ``` Fixes: 7b723dbb96e8 ("perf pmu: Be lazy about loading event info files from sysfs") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf jevents metric: Fix type of strcmp_cpuid_strIan Rogers1-2/+2
The parser wraps all strings as Events, so the input is an Event. Using a string would be bad as functions like Simplify are called on the arguments, which wouldn't be present on a string. Fixes: 9d5da30e4ae9 ("perf jevents: Add a new expression builtin strcmp_cpuid_str()") Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kajol Jain <[email protected]> Cc: John Garry <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf trace: Avoid compile error wrt redefining boolIan Rogers1-0/+2
Make part of an existing TODO conditional to avoid the following build error: ``` tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:14: error: cannot combine with previous 'char' declaration specifier 26 | typedef char bool; | ^ include/stdbool.h:20:14: note: expanded from macro 'bool' 20 | #define bool _Bool | ^ tools/perf/util/bpf_skel/augmented_raw_syscalls.bpf.c:26:1: error: typedef requires a name [-Werror,-Wmissing-declarations] 26 | typedef char bool; | ^~~~~~~~~~~~~~~~~ 2 errors generated. ``` Signed-off-by: Ian Rogers <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-17perf bpf-prologue: Remove unused fileIan Rogers1-508/+0
Commit 3d6dfae88917 ("perf parse-events: Remove BPF event support") removed building bpf-prologue.c but failed to remove the actual file. Fixes: 3d6dfae88917 ("perf parse-events: Remove BPF event support") Signed-off-by: Ian Rogers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-16perf test: Fix test-record-dummy-C0 failure for supported PERF_FORMAT_LOST ↵Yang Jihong1-1/+1
feature kernel For kernel that supports PERF_FORMAT_LOST, attr->read_format has PERF_FORMAT_LOST bit. Update expected value of attr->read_format of test-record-dummy-C0 for this scenario. Before: # ./perf test 17 -vv 17: Setup struct perf_event_attr : --- start --- test child forked, pid 1609441 <SNIP> running './tests/attr/test-record-dummy-C0' 'PERF_TEST_ATTR=/tmp/tmpm3s60aji ./perf record -o /tmp/tmpm3s60aji/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1' expected read_format=4, got 20 FAILED './tests/attr/test-record-dummy-C0' - match failure test child finished with -1 ---- end ---- Setup struct perf_event_attr: FAILED! After: # ./perf test 17 -vv 17: Setup struct perf_event_attr : --- start --- test child forked, pid 1609441 <SNIP> running './tests/attr/test-record-dummy-C0' 'PERF_TEST_ATTR=/tmp/tmppa9vxcb7 ./perf record -o /tmp/tmppa9vxcb7/perf.data --no-bpf-event -e dummy -C 0 kill >/dev/null 2>&1' ret '1', expected '1' <SNIP> test child finished with 0 ---- end ---- Setup struct perf_event_attr: Ok Reported-and-Tested-by: Namhyung Kim <[email protected]> Signed-off-by: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf kwork: Fix spelling mistake "COMMMAND" -> "COMMAND"Colin Ian King1-1/+1
There is a spelling mistake in a literal string. Fix it. Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf annotate: Add more x86 mov instruction casesNamhyung Kim1-3/+6
Instructions with sign- and zero- extention like movsbl and movzwq were not handled properly. As it can check different size suffix (-b, -w, -l or -q) we can omit that and add the common parts even though some combinations are not possible. Reviewed-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmu: Remove unused functionJames Clark3-16/+0
pmu_events_table__find() is no longer used so remove it and its Arm specific version. Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmus: Simplify perf_pmus__find_core_pmu()James Clark2-14/+8
Currently the while loop always either exits on the first iteration with a core PMU, or exits with NULL on heterogeneous systems or when not all CPUs are online. Both of the latter behaviors are undesirable for platforms other than Arm so simplify it to always return the first core PMU, or NULL if none exist. This behavior was depended on by the Arm version of pmu_metrics_table__find(), so the logic has been moved there instead. Signed-off-by: James Clark <[email protected]> Suggested-by: Ian Rogers <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf pmu: Move pmu__find_core_pmu() to pmus.cJames Clark6-24/+25
pmu__find_core_pmu() more logically belongs in pmus.c because it iterates over all PMUs, so move it to pmus.c At the same time rename it to perf_pmus__find_core_pmu() to match the naming convention in this file. list_prepare_entry() can't be used in perf_pmus__scan_core() anymore now that it's called from the same compilation unit. This is with -O2 (specifically -O1 -ftree-vrp -finline-functions -finline-small-functions) which allow the bounds of the array access to be determined at compile time. list_prepare_entry() subtracts the offset of the 'list' member in struct perf_pmu from &core_pmus, which isn't a struct perf_pmu. The compiler sees that pmu results in &core_pmus - 8 and refuses to compile. At runtime this works because list_for_each_entry_continue() always adds the offset back again before dereferencing ->next, but it's technically undefined behavior. With -fsanitize=undefined an additional warning is generated. Using list_first_entry_or_null() to get the first entry here avoids doing &core_pmus - 8 but has the same result and fixes both the compile warning and the undefined behavior warning. There are other uses of list_prepare_entry() in pmus.c, but the compiler doesn't seem to be able to see that they can also be called with &core_pmus, so I won't change any at this time. Signed-off-by: James Clark <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-15perf symbol: Avoid an undefined behavior warningIan Rogers1-2/+1
The node (nd) may be NULL and pointer arithmetic on NULL is undefined behavior. Move the computation of next below the NULL check on the node. Signed-off-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-09-13tools headers UAPI: Update tools's copy of drm.h headersArnaldo Carvalho de Melo1-15/+69
Picking the changes from: ad9ee11fdf113f96 ("drm/doc: document that PRIME import/export is always supported") 2ff4f6d410afa762 ("drm/doc: document drm_event and its types") 9a2eabf48ade4fba ("drm/doc: use proper cross-references for sections") c7a4722971691562 ("drm/syncobj: add IOCTL to register an eventfd") Addressing these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/drm/drm.h include/uapi/drm/drm.h Now 'perf trace' and other code that might use the tools/perf/trace/beauty autogenerated tables will be able to translate this new ioctl code into a string: $ tools/perf/trace/beauty/drm_ioctl.sh > before $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h $ tools/perf/trace/beauty/drm_ioctl.sh > after $ diff -u before after --- before 2023-09-13 08:54:45.170134002 -0300 +++ after 2023-09-13 08:55:06.612712776 -0300 @@ -108,6 +108,7 @@ [0xCC] = "SYNCOBJ_TRANSFER", [0xCD] = "SYNCOBJ_TIMELINE_SIGNAL", [0xCE] = "MODE_GETFB2", + [0xCF] = "SYNCOBJ_EVENTFD", [DRM_COMMAND_BASE + 0x00] = "I915_INIT", [DRM_COMMAND_BASE + 0x01] = "I915_FLUSH", [DRM_COMMAND_BASE + 0x02] = "I915_FLIP", $ Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Simon Ser <[email protected]> Link: https://lore.kernel.org/lkml/ZQGkh9qlhpKA%[email protected]/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-13tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo1-0/+12
To pick up the changes from these csets: 1b5277c0ea0b2473 ("x86/srso: Add SRSO_NO support") 8974eb588283b7d4 ("x86/speculation: Add Gather Data Sampling mitigation") That cause no changes to tooling: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after $ Just silences this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov (AMD) <[email protected]> Cc: Daniel Sneddon <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-13perf bench sched-seccomp-notify: Use the tools copy of seccomp.h UAPIArnaldo Carvalho de Melo1-1/+1
To keep perf building in systems where types and defines used in this new benchmark are not available, such as: 12 13.46 centos:stream : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC) bench/sched-seccomp-notify.c: In function 'user_notif_syscall': bench/sched-seccomp-notify.c:55:27: error: 'SECCOMP_RET_USER_NOTIF' undeclared (first use in this function); did you mean 'SECCOMP_RET_ERRNO'? BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF), ^~~~~~~~~~~~~~~~~~~~~~ /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT' #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k } ^ bench/sched-seccomp-notify.c:55:27: note: each undeclared identifier is reported only once for each function it appears in BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF), ^~~~~~~~~~~~~~~~~~~~~~ /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:49:59: note: in definition of macro 'BPF_STMT' #define BPF_STMT(code, k) { (unsigned short)(code), 0, 0, k } ^ bench/sched-seccomp-notify.c:55:3: error: missing initializer for field 'k' of 'struct sock_filter' [-Werror=missing-field-initializers] BPF_STMT(BPF_RET|BPF_K, SECCOMP_RET_USER_NOTIF), ^~~~~~~~ In file included from bench/sched-seccomp-notify.c:5: /git/perf-6.6.0-rc1/tools/include/uapi/linux/filter.h:28:8: note: 'k' declared here __u32 k; /* Generic multiuse field */ ^ bench/sched-seccomp-notify.c: In function 'user_notification_sync_loop': bench/sched-seccomp-notify.c:70:28: error: storage size of 'resp' isn't known struct seccomp_notif_resp resp; ^~~~ bench/sched-seccomp-notify.c:71:23: error: storage size of 'req' isn't known struct seccomp_notif req; ^~~ bench/sched-seccomp-notify.c:76:23: error: 'SECCOMP_IOCTL_NOTIF_RECV' undeclared (first use in this function); did you mean 'SECCOMP_MODE_STRICT'? if (ioctl(listener, SECCOMP_IOCTL_NOTIF_RECV, &req)) ^~~~~~~~~~~~~~~~~~~~~~~~ SECCOMP_MODE_STRICT bench/sched-seccomp-notify.c:86:23: error: 'SECCOMP_IOCTL_NOTIF_SEND' undeclared (first use in this function); did you mean 'SECCOMP_RET_ACTION'? if (ioctl(listener, SECCOMP_IOCTL_NOTIF_SEND, &resp)) ^~~~~~~~~~~~~~~~~~~~~~~~ SECCOMP_RET_ACTION bench/sched-seccomp-notify.c:71:23: error: unused variable 'req' [-Werror=unused-variable] struct seccomp_notif req; ^~~ bench/sched-seccomp-notify.c:70:28: error: unused variable 'resp' [-Werror=unused-variable] struct seccomp_notif_resp resp; ^~~~ 14 11.31 debian:10 : FAIL gcc version 8.3.0 (Debian 8.3.0-6) Cc: Adrian Hunter <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kees Kook <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-13tools headers UAPI: Copy seccomp.h to be able to build 'perf bench' in older ↵Arnaldo Carvalho de Melo2-0/+158
systems The new 'perf bench' for sched-seccomp-notify uses defines and types not available in older systems where we want to have perf available, so grab a copy of this UAPI from the kernel sources to allow that. This will be checked in the future for drift from the original when we build the perf tool, that will warn when that happens like: make: Entering directory '/var/home/acme/git/perf-tools/tools/perf' BUILD: Doing 'make -j32' parallel build Warning: Kernel ABI header differences: Cc: Adrian Hunter <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kees Kook <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-13tools headers UAPI: Sync files changed by new fchmodat2 and map_shadow_stack ↵Arnaldo Carvalho de Melo5-1/+9
syscalls with the kernel sources To pick the changes in these csets: c35559f94ebc3e3b ("x86/shstk: Introduce map_shadow_stack syscall") 78252deb023cf087 ("arch: Register fchmodat2, usually as syscall 452") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible: # perf trace -v -e fchmodat*,map_shadow_stack --max-events=4 Using CPUID AuthenticAMD-25-21-0 Reusing "openat" BPF sys_enter augmenter for "fchmodat" event qualifier tracepoint filter: (common_pid != 3499340 && common_pid != 11259) && (id == 268 || id == 452 || id == 453) ^C# And it'll work as with other syscalls, for instance openat: # perf trace -e openat* --max-events=4 0.000 ( 0.015 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 11 0.068 ( 0.019 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/[email protected]/memory.pressure", flags: RDONLY|CLOEXEC) = 11 0.119 ( 0.008 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/[email protected]/memory.current", flags: RDONLY|CLOEXEC) = 11 0.138 ( 0.006 ms): systemd-oomd/1150 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/[email protected]/memory.min", flags: RDONLY|CLOEXEC) = 11 # That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -E fchmodat\|sys_map_shadow_stack tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:258 n64 fchmodat sys_fchmodat tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:452 n64 fchmodat2 sys_fchmodat2 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:297 common fchmodat sys_fchmodat tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:452 common fchmodat2 sys_fchmodat2 tools/perf/arch/s390/entry/syscalls/syscall.tbl:299 common fchmodat sys_fchmodat sys_fchmodat tools/perf/arch/s390/entry/syscalls/syscall.tbl:452 common fchmodat2 sys_fchmodat2 sys_fchmodat2 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:268 common fchmodat sys_fchmodat tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:452 common fchmodat2 sys_fchmodat2 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:453 64 map_shadow_stack sys_map_shadow_stack $ $ grep -Ew map_shadow_stack\|fchmodat2 /tmp/build/perf-tools/arch/x86/include/generated/asm/syscalls_64.c [452] = "fchmodat2", [453] = "map_shadow_stack", $ This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl Cc: Adrian Hunter <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Rick Edgecombe <[email protected]> Link: https://lore.kernel.org/lkml/ZP8bE7aXDBu%[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf bpf-filter: Add YYDEBUGIan Rogers1-0/+4
YYDEBUG enables line numbers and other error helpers in the generated bpf-filter-bison.c. Conditionally enabled only for debug builds. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf pmu: Add YYDEBUGIan Rogers1-0/+4
YYDEBUG enables line numbers and other error helpers in the generated pmu-bison.c. Conditionally enabled only for debug builds. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf expr: Make YYDEBUG dependent on doing a debug buildIan Rogers1-0/+2
YYDEBUG enables line numbers and other error helpers in the generated expr-bison.c. These shouldn't be generated when debugging isn't enabled. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf parse-events: Make YYDEBUG dependent on doing a debug buildIan Rogers1-0/+2
YYDEBUG enables line numbers and other error helpers in the generated parse-events-bison.c. These shouldn't be generated when debugging isn't enabled. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf parse-events: Remove unused header filesIan Rogers1-3/+0
The fnmatch header is now used in the PMU matching logic in pmu.c. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf tools: Add includes for detected configs in Makefile.perfAthira Rajeev1-0/+3
Makefile.perf uses "CONFIG_*" checks in the code. Example the config for libtraceevent is used to set PYTHON_EXT_SRCS ifeq ($(CONFIG_LIBTRACEEVENT),y) PYTHON_EXT_SRCS := $(shell grep -v ^\# util/python-ext-sources) else PYTHON_EXT_SRCS := $(shell grep -v '^\#\|util/trace-event.c' util/python-ext-sources) endif But this is not picking the value for CONFIG_LIBTRACEEVENT that is set using the settings in Makefile.config. Include the file ".config-detected" so that make will use the system detected configuration in the CONFIG checks. This will fix isues that could arise when other "CONFIG_*" checks are added to Makefile.perf in future as well. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf test: Update cs_etm testcase for Arm ETERuidong Tian1-1/+14
Add ETE as one of the supported device types in perf cs_etm testcase. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ruidong Tian <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf vendor events arm64: Add V1 metrics using Arm telemetry repoJames Clark1-0/+233
Metrics for V1 weren't previously included in the Perf Jsons, so add them using the telemetry source [1]. After generation any parts identical to the default metrics in sbsa.json were manually removed. [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf vendor events arm64: Update V1 events using Arm telemetry repoJames Clark20-341/+502
The new data [1] includes descriptions that may have product specific details and new groupings that will be consistent with other products. The following command was used to generate the jsons: $ telemetry-solution/tools/perf_json_generator/generate.py \ linux/tools/perf/ --telemetry-files \ telemetry-solution/data/pmu/cpu/neoverse/neoverse-v1.json [1]: https://gitlab.arm.com/telemetry-solution/telemetry-solution/-/blob/main/data/pmu/cpu/neoverse/neoverse-v1.json Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf test: Add a test for strcmp_cpuid_str() expressionJames Clark1-4/+27
Test that the new expression builtin returns a match when the current escaped CPU ID is given, and that it doesn't match when "0x0" is given. The CPU ID in test__expr() has to be changed to perf_pmu__getcpuid() which returns the CPU ID string, rather than the raw CPU ID that get_cpuid() returns because that can't be used with strcmp_cpuid_str(). It doesn't affect the is_intel test because both versions contain "Intel". Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf util: Add a function for replacing characters in a stringJames Clark6-0/+83
It finds all occurrences of a single character and replaces them with a multi character string. This will be used in a test in a following commit. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf jevents: Remove unused keywordJames Clark1-2/+1
'cpuid_not_more_than' was the working title of the new 'strcmp_cpuid_str' keyword and was accidentally left in. It was never used so tidying it up has no effect. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf test: Check result of has_event(cycles) testJames Clark1-1/+1
Currently the function always returns 0, so even when the has_event() test fails, the test still passes. Fix it by returning ret instead. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chen Zhongjin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf list pfm: Retry supported test with exclude_kernelIan Rogers1-1/+14
With paranoia set at 2 evsel__open will fail with EACCES for non-root users. To avoid this stopping libpfm4 events from being printed, retry with exclude_kernel enabled - copying the regular is_event_supported test. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf list: Avoid a hardcoded cpu PMU nameIan Rogers1-11/+17
Use the first core PMU instead. On a Raspberry Pi, before: $ perf list ... cpu/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor] [(see 'man perf-list' on how to encode it)] ... After: $ perf list ... armv8_cortex_a72/t1=v1[,t2=v2,t3 ...]/modifier [Raw hardware event descriptor] [(see 'man perf-list' on how to encode it)] ... ``` Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf test shell lock_contention: Add cgroup aggregation and filter testsNamhyung Kim1-0/+45
Add cgroup aggregation and filter tests. $ sudo ./perf test -v contention 84: kernel lock contention analysis test : --- start --- test child forked, pid 222423 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --lock-cgroup Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation Testing perf lock contention --cgroup-filter Testing perf lock contention CSV output test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok Committer testing: [root@quaco ~]# uname -a Linux quaco 6.4.10-200.fc38.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 11 12:20:29 UTC 2023 x86_64 GNU/Linux [root@quaco ~]# perf test -v contention 84: kernel lock contention analysis test : --- start --- test child forked, pid 452625 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --lock-cgroup Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation Testing perf lock contention --cgroup-filter Testing perf lock contention CSV output test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok [root@quaco ~]# Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Add -G/--cgroup-filter optionNamhyung Kim7-2/+95
The -G/--cgroup-filter is to limit lock contention collection on the tasks in the specific cgroups only. $ sudo ./perf lock con -abt -G /user.slice/.../vte-spawn-52221fb8-b33f-4a52-b5c3-e35d1e6fc0e0.scope \ ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.174 [sec] contended total wait max wait avg wait pid comm 4 114.45 us 60.06 us 28.61 us 214847 sched-messaging 2 111.40 us 60.84 us 55.70 us 214848 sched-messaging 2 106.09 us 59.42 us 53.04 us 214837 sched-messaging 1 81.70 us 81.70 us 81.70 us 214709 sched-messaging 68 78.44 us 6.83 us 1.15 us 214633 sched-messaging 69 73.71 us 2.69 us 1.07 us 214632 sched-messaging 4 72.62 us 60.83 us 18.15 us 214850 sched-messaging 2 71.75 us 67.60 us 35.88 us 214840 sched-messaging 2 69.29 us 67.53 us 34.65 us 214804 sched-messaging 2 69.00 us 68.23 us 34.50 us 214826 sched-messaging ... Export cgroup__new() function as it's needed from outside. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Add --lock-cgroup optionNamhyung Kim6-10/+84
The --lock-cgroup option shows lock contention stats break down by cgroups. Add LOCK_AGGR_CGROUP mode and use it instead of use_cgroup field. $ sudo ./perf lock con -ab --lock-cgroup sleep 1 contended total wait max wait avg wait cgroup 8 15.70 us 6.34 us 1.96 us / 2 1.48 us 747 ns 738 ns /user.slice/.../app.slice/app-gnome-google\x2dchrome-6442.scope 1 848 ns 848 ns 848 ns /user.slice/.../session.slice/[email protected] 1 220 ns 220 ns 220 ns /user.slice/.../session.slice/pipewire-pulse.service For now, the cgroup mode only works with BPF (-b). Committer notes: Remove -g as it is used in the other tools with a clear meaning of collect/show callchains. As agreed with Namhyung off list. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-09-12perf lock contention: Prepare to handle cgroupsNamhyung Kim3-4/+34
Save cgroup info and display cgroup names if requested. This is a preparation for the next patch. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>