aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2023-08-29perf pmus: Sort pmus by name then suffixIan Rogers1-0/+49
Sort PMUs by name. If two PMUs have the same name but differ by suffix, sort the suffixes numerically. For example, "breakpoint" comes before "cpu", "uncore_imc_free_running_0" comes before "uncore_imc_free_running_1". Suffixes need to be treated specially as otherwise they will be ordered like 0, 1, 10, 11, .., 2, 20, 21, .., etc. Only PMUs starting 'uncore_' are considered to have a potential suffix. Sorting of PMUs is done so that later patches can skip duplicate uncore PMUs that differ only by there suffix. Committer notes: Used the more compact, intention revealing strstarts() function we got from the kernel sources: - if (strncmp(str, "uncore_", 7)) + if (!strstarts(str, "uncore_")) Also in pmus_cmp() the lhs_num and rhs_num variables may end up not being set for non "uncore_" prefixed PMUs in pmu_name_len_no_suffix(), or at least gcc 7.5 in some distros (opensuse 15.5, to be EOLed in Dec/2024) thins so, so initialize both to zero. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-29perf beauty mmap_flags: Use "test -f" instead of "[-f FILE]"Yanteng Si2-5/+9
"[" is part of the shell builtin test (and a synonym for it), not a link to the external command /usr/bin/test. Using the "test" is simpler because it avoids a lot of "[]". Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Yanteng Si <[email protected]> Acked-by: Huacai Chen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/c50bc0a92dce0ff0fa6504c1a52fb53e2ac007bf.1692962043.git.siyanteng@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-29perf beauty mmap_flags: Fix script for archs that use the generic mman.hYanteng Si1-1/+1
To address this error: grep: /root/linux-next/tools/arch/xxxxx/include/uapi/asm//mman.h: No such file or directory Signed-off-by: Yanteng Si <[email protected]> Acked-by: Huacai Chen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/42e8e3565d6035302907426c1e65483b2a4007f5.1692962043.git.siyanteng@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-29perf tools: Allow to use cpuinfo on LoongArchYanteng Si2-1/+3
Define these macros so that the CPU name can be displayed when running 'perf report' and 'perf timechart'. Committer notes: No need to have: if (strcasestr(buf, "Model Name")) { strlcpy(cpu_m, &buf[13], 255); break; } else if (strcasestr(buf, "model name")) { strlcpy(cpu_m, &buf[13], 255); break; } As the point of strcasestr() is to be case insensitive to both the haystack and the needle, so simplify the above to just: if (strcasestr(buf, "model name")) { strlcpy(cpu_m, &buf[13], 255); break; } Signed-off-by: Yanteng Si <[email protected]> Acked-by: Huacai Chen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/db968a186a10e4629fe10c26a1210f7126ad41ec.1692962043.git.siyanteng@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-28Merge tag 'seccomp-v6.6-rc1' of ↵Linus Torvalds4-0/+181
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp updates from Kees Cook: - Provide USER_NOTIFY flag for synchronous mode (Andrei Vagin, Peter Oskolkov). This touches the scheduler and perf but has been Acked by Peter Zijlstra. - Fix regression in syscall skipping and restart tracing on arm32. This touches arch/arm/ but has been Acked by Arnd Bergmann. * tag 'seccomp-v6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Add missing kerndoc notations ARM: ptrace: Restore syscall skipping for tracers ARM: ptrace: Restore syscall restart tracing selftests/seccomp: Handle arm32 corner cases better perf/benchmark: add a new benchmark for seccom_unotify selftest/seccomp: add a new test for the sync mode of seccomp_user_notify seccomp: add the synchronous mode for seccomp_unotify sched: add a few helpers to wake up tasks on the current cpu sched: add WF_CURRENT_CPU and externise ttwu seccomp: don't use semaphore and wait_queue together
2023-08-25perf lock contention: Fix typo in max-stack option descriptionKajol Jain1-1/+1
Fix typo in max-stack option description by changing lopck contention to lock contention. Signed-off-by: Kajol Jain <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf tui slang: Tidy castsIan Rogers7-34/+15
Casts were necessary for older versions of libslang, however, these are now 15 years old and so we no longer need to care about supporting them. Tidy the casts and remove unnecessary logic. Move the ENABLE_SLFUTURE_CONST to the libslang.h common include file, and also enable ENABLE_SLFUTURE_VOID. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wei Li <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf build-id: Simplify build_id_cache__cachedir()Ian Rogers1-4/+2
Initialize realname to NULL, rather than name. This avoids a cast and as realpath is either NULL or an allocated string, free can be called unconditionally. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wei Li <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf pmu: Make id const and add missing freeIan Rogers3-3/+4
The struct pmu id is initialized from pmu_id that is read into allocated memory from a file, as such it needs free-ing in pmu__delete(). Make the id value const so that we can remove casts in tests. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wei Li <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf parse-events: Make term's config constIan Rogers4-17/+17
This avoids casts in tests. Use zfree in a few places to avoid warnings about a freeing a const pointer. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wei Li <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf pmu: Remove logic for PMU name being NULLIan Rogers15-62/+51
The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: K Prateek Nayak <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Wei Li <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: [email protected] Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-25perf header: Fix missing PMU capsIan Rogers1-15/+16
PMU caps are written as HEADER_PMU_CAPS or for the special case of the PMU "cpu" as HEADER_CPU_PMU_CAPS. As the PMU "cpu" is special, and not any "core" PMU, the logic had become broken and core PMUs not called "cpu" were not having their caps written. This affects ARM and s390 non-hybrid PMUs. Simplify the PMU caps writing logic to scan one fewer time and to be more explicit in its behavior. Fixes: 178ddf3bad981380 ("perf header: Avoid hybrid PMU list in write_pmu_caps") Reported-by: Wei Li <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf jevents: Don't append Unit to descIan Rogers3-19/+23
Unit with the PMU name is appended to desc in jevents.py, but on hybrid platforms it causes the desc to differ from the regular non-hybrid system with a PMU of 'cpu'. Having differing descs means the events don't deduplicate. To make the perf list output not differ, append the Unit on again in the perf list printing code. On x86 reduces the binary size by 409,600 bytes or about 4%. Update pmu-events test expectations to match the differently generated pmu-events.c code. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf scripts python gecko: Launch the profiler UI on the default browser ↵Anup Sharma1-7/+63
with the appropriate URL All required libraries have been imported and make sure that none of them are external dependencies. To achieve this, created a virt env and verified. Modified usage information and added combined command. Modified the main() function to read the --save-only command-line option and set the output_file variable accordingly. Modified the trace_end() function to check for the output_file variable. If it is set, the profiler data is saved to a local file in Gecko Profile format, or the profiler.firefox.com is opened on the default browser. Included trace_begin() to initialize the Firefox Profiler and launch the default browser to display the profiler.firefox.com. Added a new function launchFirefox() to start a local server and launch the profiler UI on the default browser with the appropriate URL. Created the "CORSRequestHandler" class to enable Cross-Origin Resource Sharing. Summary: This integration now includes a exiting feature to conveniently host the Gecko Profile data on a local server and open it directly in the default web browser. This means that users can now effortlessly visualize and analyze the profiler results with just a single click. The addition of the --save-only command-line option allows users to save the profiler output to a local file in Gecko Profile format, but the real highlight lies in the capability to seamlessly launch a local server, making the data accessible to Firefox Profiler via a web browser. In addition, it's important to highlight that all data are hosted locally, eliminating any concerns about data privacy rules and regulations. Signed-off-by: Anup Sharma <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf scripts python: Add support for input args in gecko scriptAnup Sharma1-1/+5
Refines the argument handling mechanism in the "gecko-report" script to enable better compatibility and improved user experience. The script now differentiates between scenarios where arguments are provided for record and report cases where gecko.py arguments are passed. Signed-off-by: Anup Sharma <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf jevents: Sort strings in the big C string to reduce faultsIan Rogers1-8/+23
Sort the strings within the big C string based on whether they were for a metric and then by when they were added. This helps group related strings and reduce minor faults by approximately 10 in 1740, about 0.57%. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Lazily load sysfs aliasesIan Rogers3-39/+46
Don't load sysfs aliases for a PMU when the PMU is first created, defer until an alias needs to be found. For the pmu-scan benchmark, average core PMU scanning is reduced by 30.8%, and average PMU scanning by 12.6%. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Be lazy about loading event info files from sysfsIan Rogers1-45/+83
Event info is only needed when an event is parsed or when merging data from an JSON and sysfs event. Be lazy in its loading to reduce file accesses. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Scan type early to fail an invalid PMU quicklyIan Rogers1-7/+12
Scan sysfs PMU's type early so that format and aliases aren't attempted to be loaded if the PMU name is invalid. This is the case for event_pmu tokens in parse-events.y where a wildcard name is first assumed to be a PMU name. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Lazily add JSON eventsIan Rogers6-15/+85
Rather than scanning all JSON events and adding them when a PMU is created, add the alias when the JSON event is needed. Average core PMU scanning run time reduced by 60.2%. Average PMU scanning run time reduced by 15%. Page faults with no events reduced by 74 page faults, 4% of total. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Cache JSON events tableIan Rogers4-11/+25
Cache the JSON events table so that finding it isn't done per event/alias. Change the events table find so that when the PMU is given, if the PMU has no JSON events return null. Update usage to always use the PMU variable. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Merge JSON events with sysfs at load timeIan Rogers1-89/+88
Rather than load all sysfs events then parsing all JSON events and merging with ones that already exist. When a sysfs event is loaded, look for a corresponding JSON event and merge immediately. To simplify the logic, early exit the perf_pmu__new_alias function if an alias is attempted to be added twice - as merging has already been explicitly handled. Fix the copying of terms to a merged alias and some ENOMEM paths. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Prefer passing pmu to aliases listIan Rogers1-28/+16
The aliases list is part of the PMU. Rather than pass the aliases list, pass the full PMU simplifying some callbacks. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Parse sysfs events directly from a fileIan Rogers5-40/+33
Rather than read a sysfs events file into a 256 byte char buffer, pass the FILE* directly to the lex/yacc parser. This avoids there being a maximum events file size. While changing the API, constify some arguments to remove unnecessary casts. Allocating the read buffer decreases the performance of pmu-scan by around 3%. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu-events: Add pmu_events_table__find_event()Ian Rogers4-0/+90
jevents stores events sorted by name. Add a find function that will binary search event names avoiding the need to linearly search through events. Add a test in tests/pmu-events.c. If the PMU or event aren't found -1000 is returned. If the event is found but no callback function given, 0 is returned. This allows the find function also act as a test for existence. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu-events: Reduce processed events by passing PMUIan Rogers6-38/+41
Pass the PMU to pmu_events_table__for_each_event so that entries that don't match don't need to be processed by callback. If a NULL PMU is passed then all PMUs are processed. 'perf bench internals pmu-scan's "Average PMU scanning" performance is reduced by about 5% on an Intel tigerlake. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf s390 s390_cpumcfdg_dump: Don't scan all PMUsIan Rogers1-24/+26
Rather than scanning all PMUs for a counter name, scan the PMU associated with the evsel of the sample. This is done to remove a dependence on pmu-events.h. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf parse-events: Improve error message for double settingIan Rogers3-9/+29
Double setting information for an event would produce an error message associated with the PMU rather than the term that was double setting. Improve the error message to be on the term. Before: $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true event syntax error: 'cpu/inst_retired.any,inst_retired.any/' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'cpu' Run 'perf list' for a list of valid events $ After: $ perf stat -e 'cpu/inst_retired.any,inst_retired.any/' true event syntax error: '..etired.any,inst_retired.any/' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'cpu' Initial error: event syntax error: '..etired.any,inst_retired.any/' \___ Attempt to set event's scale twice Run 'perf list' for a list of valid events Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf jevents: Group events by PMUIan Rogers2-57/+154
Prior to this change a cpuid would map to a list of events where the PMU would be encoded alongside the event information. This change breaks apart each group of events so that there is a group per PMU. A new table is added with the PMU's name and the list of events, the original table now holding an array of these per PMU tables. These changes are to make it easier to get per PMU information about events, rather than the current approach of scanning all events. The perf binary size with BPF skeletons on x86 is reduced by about 1%. The unidentified PMU is now always expanded to "cpu". Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu-events: Add extra underscore to function namesIan Rogers7-22/+22
Add extra underscore before "for" of pmu_events_table_for_each_event and pmu_metrics_table_for_each_metric. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Abstract alias/event structIan Rogers6-323/+366
In order to be able to lazily compute aliases/events for a PMU, move the struct perf_pmu_alias into pmu.c. Add perf_pmu__find_event and perf_pmu__for_each_event that take a callback that is called for the found event or for each event. The layout of struct pmu and the event/alias list is unchanged but the API is altered so that aliases are no longer directly accessed, allowing for later changes. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-24perf pmu: Make the loading of formats lazyIan Rogers4-65/+106
The sysfs format files are loaded eagerly in a PMU. Add a flag so that we create the format but only load the contents when necessary. Reduce the size of the value in struct perf_pmu_format and avoid holes so there is no additional space requirement. For "perf stat -e cycles true" this reduces the number of openat calls from 648 to 573 (about 12%). The benchmark pmu scan speed is improved by roughly 5%. Before: $ perf bench internals pmu-scan Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 1061.100 usec (+- 9.965 usec) Average PMU scanning took: 4725.300 usec (+- 260.599 usec) After: $ perf bench internals pmu-scan Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 989.170 usec (+- 6.873 usec) Average PMU scanning took: 4520.960 usec (+- 251.272 usec) Committer testing: On a AMD Ryzen 5950x: Before: $ perf bench internals pmu-scan -i1000 # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 563.466 usec (+- 1.008 usec) Average PMU scanning took: 1619.174 usec (+- 23.627 usec) $ perf stat -r5 perf bench internals pmu-scan -i1000 # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 583.401 usec (+- 2.098 usec) Average PMU scanning took: 1677.352 usec (+- 24.636 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 553.254 usec (+- 0.825 usec) Average PMU scanning took: 1635.655 usec (+- 24.312 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 557.733 usec (+- 0.980 usec) Average PMU scanning took: 1600.659 usec (+- 23.344 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 554.906 usec (+- 0.774 usec) Average PMU scanning took: 1595.338 usec (+- 23.288 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 551.798 usec (+- 0.967 usec) Average PMU scanning took: 1623.213 usec (+- 23.998 usec) Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs): 3276.82 msec task-clock:u # 0.990 CPUs utilized ( +- 0.82% ) 0 context-switches:u # 0.000 /sec 0 cpu-migrations:u # 0.000 /sec 1008 page-faults:u # 307.615 /sec ( +- 0.04% ) 12049614778 cycles:u # 3.677 GHz ( +- 0.07% ) (83.34%) 117507478 stalled-cycles-frontend:u # 0.98% frontend cycles idle ( +- 0.33% ) (83.32%) 27106761 stalled-cycles-backend:u # 0.22% backend cycles idle ( +- 9.55% ) (83.36%) 33294953848 instructions:u # 2.76 insn per cycle # 0.00 stalled cycles per insn ( +- 0.03% ) (83.31%) 6849825049 branches:u # 2.090 G/sec ( +- 0.03% ) (83.37%) 71533903 branch-misses:u # 1.04% of all branches ( +- 0.20% ) (83.30%) 3.3088 +- 0.0302 seconds time elapsed ( +- 0.91% ) $ After: $ perf stat -r5 perf bench internals pmu-scan -i1000 # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 550.702 usec (+- 0.958 usec) Average PMU scanning took: 1566.577 usec (+- 22.747 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 548.315 usec (+- 0.555 usec) Average PMU scanning took: 1565.499 usec (+- 22.760 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 548.073 usec (+- 0.555 usec) Average PMU scanning took: 1586.097 usec (+- 23.299 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 561.184 usec (+- 2.709 usec) Average PMU scanning took: 1567.153 usec (+- 22.548 usec) # Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 546.987 usec (+- 0.553 usec) Average PMU scanning took: 1562.814 usec (+- 22.729 usec) Performance counter stats for 'perf bench internals pmu-scan -i1000' (5 runs): 3170.86 msec task-clock:u # 0.992 CPUs utilized ( +- 0.22% ) 0 context-switches:u # 0.000 /sec 0 cpu-migrations:u # 0.000 /sec 1010 page-faults:u # 318.526 /sec ( +- 0.04% ) 11890047674 cycles:u # 3.750 GHz ( +- 0.14% ) (83.27%) 119090499 stalled-cycles-frontend:u # 1.00% frontend cycles idle ( +- 0.46% ) (83.40%) 32502449 stalled-cycles-backend:u # 0.27% backend cycles idle ( +- 8.32% ) (83.30%) 33119141261 instructions:u # 2.79 insn per cycle # 0.00 stalled cycles per insn ( +- 0.01% ) (83.37%) 6812816561 branches:u # 2.149 G/sec ( +- 0.01% ) (83.29%) 70157855 branch-misses:u # 1.03% of all branches ( +- 0.28% ) (83.38%) 3.19710 +- 0.00826 seconds time elapsed ( +- 0.26% ) $ Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf build: Allow customization of clang options for BPF targetGuilherme Amadio1-1/+6
This also puts an unconditional -Werror under control of WERROR. The clang includes added during the build can lead to a warning that may be turned into an error. In addition, hardened clang produces a warning about lack of support for -fstack-protector* options for the BPF target: clang -g -O2 -target bpf -Wall -Werror -Ilinux/tools/perf/util/bpf_skel/.tmp/.. \ -I -idirafter /usr/lib/llvm/16/bin/../../../../lib/clang/16/include -idirafter /usr/local/include \ -idirafter /usr/include -Ilinux/tools/include/uapi -c util/bpf_skel/bperf_follower.bpf.c \ -o linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o && llvm-strip -g linux/tools/perf/util/bpf_skel/.tmp/bperf_follower.bpf.o clang-16: error: /usr/lib/llvm/16/bin/../../../../lib/clang/16/include: 'linker' input unused [-Werror,-Wunused-command-line-argument] clang-16: error: ignoring '-fstack-protector-strong' option as it is not currently supported for target 'bpf' [-Werror,-Woption-ignored] make[1]: *** [Makefile.perf:1082: linux/tools/perf/util/bpf_skel/.tmp/bpf_prog_profiler.bpf.o] Error 1 Signed-off-by: Guilherme Amadio <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Pass PMU rather than aliases and formatIan Rogers4-65/+62
Pass the pmu so the aliases and format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Avoid passing format list to perf_pmu__format_bits()Ian Rogers6-16/+15
Pass the PMU so the format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Did missing conversions in tools/perf/arch/arm*/util/cs-etm.c ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Avoid passing format list to perf_pmu__format_typeIan Rogers3-4/+4
Pass the pmu so the format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Avoid passing format list to perf_pmu__config_terms()Ian Rogers4-67/+61
Abstract the format list better, hiding it in the PMU, by changing perf_pmu__config_terms() the PMU rather than the format list in the PMU. Change the PMU test to pass a dummy PMU for this purpose. Changing the test allows perf_pmu__del_formats() to become static. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Reduce scope of perf_pmu_error()Ian Rogers2-2/+3
Move declaration from header file to pmu.y and make static. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Move perf_pmu__set_format to pmu.yIan Rogers3-13/+12
Avoid having the function in the C and header file, as it is only used locally by pmu.y. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf pmu: Avoid a path name copyIan Rogers1-5/+7
Rather than read a base path and append into a 2nd path, read the base path directly into output buffer and append to that. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf script ibs: Remove unused includeIan Rogers1-1/+0
Done to reduce dependencies on pmu-events.h. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-23perf bench breakpoint: Skip run if no breakpoints availableKajol Jain1-3/+21
Based on commit 7d54a4acd8c1de3e ("perf test: Skip watchpoint tests if no watchpoints available"), hardware breakpoints are not available for power9 platform and because of that 'perf bench breakpoint' run fails on power9 platform. Add code to check for the return value of perf_event_open() in the breakpoint run and skip the 'perf bench breakpoint' run, if hardware breakpoints are not available. Result on power9 system before patch changes: [command]# perf bench breakpoint thread perf_event_open: No such device Result on power9 system after patch changes: [command]# ./perf bench breakpoint thread Skipping perf bench breakpoint thread: No hardware support Reported-by: Disha Goel <[email protected]> Signed-off-by: Kajol Jain <[email protected]> Acked-by: Naveen N Rao <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-22perf lzma: Convert some pr_err() to pr_debug() as callers already use pr_debug()Arnaldo Carvalho de Melo1-7/+5
I noticed some error with: # perf list ex_ret_brn lzma: fopen failed on /usr/lib/modules/5.15.14-100.fc34.x86_64/kernel/net/bluetooth/bnep/bnep.ko.xz: 'No such file or directory' lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/gpu/drm/drm_kms_helper.ko.xz: 'No such file or directory' lzma: fopen failed on /usr/lib/modules/5.18.16-200.fc36.x86_64/kernel/arch/x86/crypto/crct10dif-pclmul.ko.xz: 'No such file or directory' lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/i2c/busses/i2c-piix4.ko.xz: 'No such file or directory' <BIG SNIP> Then using 'perf probe' + 'perf trace' to debug 'perf list', it seems its some inconsistency in the ~/.debug/ cache where broken build id symlinks that ends up making it try to uncompress some kernel modules using the lzma routines: 395.309 perf/3594447 probe_perf:lzma_decompress_to_file(__probe_ip: 6118448, input_string: "/usr/lib/modules/5.18.17-200.fc36.x86_64/kernel/drivers/nvme/host/nvme.ko.xz") lzma_decompress_to_file (/var/home/acme/bin/perf) filename__decompress (/var/home/acme/bin/perf) filename__read_build_id (/var/home/acme/bin/perf) filename__sprintf_build_id (inlined) build_id_cache__valid_id (inlined) build_id_cache__list_all (/var/home/acme/bin/perf) print_sdt_events (/var/home/acme/bin/perf) cmd_list (/var/home/acme/bin/perf) run_builtin (/var/home/acme/bin/perf) handle_internal_command (inlined) run_argv (inlined) main (/var/home/acme/bin/perf) __libc_start_call_main (/usr/lib64/libc.so.6) __libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6) _start (/var/home/acme/bin/perf) But callers of filename__decompress() already check its return and use pr_debug(), so be consistent and make functions it calls also use pr_debug(). Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf stat-display: Check if snprintf()'s fmt argument is NULLKaige Ye1-2/+2
It is undefined behavior to pass NULL as snprintf()'s fmt argument. Here is an example to trigger the problem: $ perf stat --metric-only -x, -e instructions -- sleep 1 insn per cycle, Segmentation fault (core dumped) With this patch: $ perf stat --metric-only -x, -e instructions -- sleep 1 insn per cycle, , Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Kaige Ye <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf bpf augmented_raw_syscalls: Add an assert to make sure ↵Arnaldo Carvalho de Melo1-0/+1
sizeof(augmented_arg->value) is a power of two. Similar to what was done in the previous cset for sizeof(saddr), we need to make sure sizeof(augmented_arg->value) is a power of two to do bounds checking using &=: augmented_len &= sizeof(augmented_arg->value) - 1; Suggested-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(saddr) is ↵Arnaldo Carvalho de Melo1-0/+11
a power of two. We're using the BPF verifier suggestion: 22: (85) call bpf_probe_read#4 R2 min value is negative, either use unsigned or 'var &= const' That works only when const is a (power of two - 1) so add an assert to make sure that that is the case. Suggested-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf vendor events arm64: AmpereOne: Remove unsupported eventsIlkka Koskinen1-120/+0
Some of the events included in the ampereone/core-imp-def are not supported on AmpereOne, remove them. Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dave Kleikamp <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf vendor events arm64: Add AmpereOne metricsIlkka Koskinen1-0/+362
This patch adds AmpereOne metrics. The metrics also work around the issue related to some of the events. Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dave Kleikamp <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf vendor events arm64: AmpereOne: Mark affected STALL_* events impacted ↵Ilkka Koskinen1-3/+9
by errata Per errata AC03_CPU_29, STALL_SLOT_FRONTEND, STALL_FRONTEND, and STALL events are not counting as expected. The follow up metrics patch will include correct way to calculate the impacted events. Reviewed-by: John Garry <[email protected]> Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dave Kleikamp <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21perf vendor events arm64: Remove L1D_CACHE_LMISS from AmpereOne listIlkka Koskinen1-3/+0
amperene/cache.json file tried to include L1D_CACHE_LMISS while it doesn't exist in common-and-microarch.json. While this bug doesn't seem to cause issue in newer kernels with jevents.py script, it prevents building older perf tools with the backported patch. Fixes: a9650b7f6fc09d16 ("perf vendor events arm64: Add AmpereOne core PMU events") Reported-by: Dave Kleikamp <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: John Garry <[email protected]> Signed-off-by: Ilkka Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Closes: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>