aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2021-08-31perf bpf: Fix memory leaks relating to BTF.Ian Rogers2-3/+3
BTF needs to be freed with btf__free(). Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf header: Fix spelling mistake "cant'" -> "can't"Colin Ian King1-1/+1
There is a spelling mistake in a warning message. Fix it. Signed-off-by: Colin King <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-30perf config: Fix caching and memory leak in perf_home_perfconfig()Arnaldo Carvalho de Melo1-1/+4
Acaict, perf_home_perfconfig() is supposed to cache the result of home_perfconfig, which returns the default location of perfconfig for the user, given the HOME environment variable. However, the current implementation calls home_perfconfig every time perf_home_perfconfig() is called (so no caching is actually performed), replacing the previous pointer, thus also causing a memory leak. This patch adds a check of whether either config or failed is set and, in that case, directly returns config without calling home_perfconfig at each invocation. Fixes: f5f03e19ce14fc31 ("perf config: Add perf_home_perfconfig function") Signed-off-by: Riccardo Mancini <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Removed needless double check for the 'failed' variable ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-30perf tools: Fixup get_current_dir_name() compilationAlexey Dobriyan1-1/+2
strdup() prototype doesn't live in stdlib.h . Add limits.h for PATH_MAX definition as well. This fixes the build on Android. Signed-off-by: Alexey Dobriyan (SK hynix) <[email protected]> Acked-by: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-24perf tools: Add missing newline at the end of header fileNghia Le1-1/+1
Add missing newline at the end of file parse-sublevel-options.h. Thus removing relevant warning reported by checkpatch. Signed-off-by: Nghia Le <[email protected]> Reviewed-by: Lukas Bulwahn <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tools: Enable on a list of CPUs for hybridJin Yao5-0/+114
The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. This option needs to be supported for hybrid as well. For hybrid support, it needs to check that the cpu list are available on hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11 is 'cpu_atom'. Before: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 Performance counter stats for 'CPU(s) 11': <not supported> cpu_core/cycles/ 1.006179431 seconds time elapsed The 'perf stat' command silently returned "<not supported>" without any helpful information. It should error out pointing out that that cpu11 was not 'cpu_core'. After: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) failed to use cpu list 11 We also need to support the events without pmu prefix specified. # perf stat -e cycles -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) Performance counter stats for 'CPU(s) 11': 1,067,373 cpu_atom/cycles/ 1.005544738 seconds time elapsed The perf tool creates two cycles events automatically, cpu_core/cycles/ and cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning for cpu_core/cycles/ and only count the cpu_atom/cycles/. If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', for example, # perf stat -e cycles -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 1,914,704 cpu_core/cycles/ 2,036,983 cpu_atom/cycles/ 1.005815641 seconds time elapsed It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for cpu_atom/cycles/, and output with some warnings. Some more complex examples, # perf stat -e cycles,instructions -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 2,780,387 cpu_core/cycles/ 1,583,432 cpu_atom/cycles/ 3,957,277 cpu_core/instructions/ 1,167,089 cpu_atom/instructions/ 1.006005124 seconds time elapsed # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 3,290,301 cpu_core/cycles/ 1,953,073 cpu_atom/cycles/ 1,407,869 cpu_atom/instructions/ 1.006260912 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tools: Create hybrid flag in targetJin Yao2-1/+2
The user may count or collect only on a cpu list via '-C/--cpus' option. Previously cpus for an evsel were retrieved from PMU's sysfs. But if the target cpu list is defined, the retrieved cpus are not kept and the target cpu list is used instead. But for hybrid system, we can't directly use target cpu list. The cpu list may not be available on hybrid pmu (e.g. cpu_core or cpu_atom). So we should not set the 'has_user_cpus' flag for hybrid system. The difficulity is that we can't call perf_pmu__has_hybrid() in evlist.c to check hybrid system otherwise 'perf test python' would be failed (undefined symbol for perf_pmu__has_hybrid). If we add pmu.c to python-ext-sources, too many symbol dependencies are hard to resolve. We use an alternative method by using a new 'hybrid' flag in target for hybrid system checking. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tests: Add dlfilter testAdrian Hunter2-2/+4
Add a perf test to test the dlfilter C API. A perf.data file is synthesized and then processed by perf script with a dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to provide a dso to match the synthesized perf.data file. Committer testing: [root@five ~]# perf test dlfilter 72: dlfilter C API : Ok [root@five ~]# perf test -v dlfilter 72: dlfilter C API : --- start --- test child forked, pid 3387712 Checking for gcc Command: gcc --version gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3) Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. dlfilters path: /var/home/acme/libexec/perf-core/dlfilters Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c Creating new host machine structure Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last start API filter_event_early API filter_event API stop API Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last start API filter_event_early API filter_event API stop API Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last start API filter_event_early API stop API test child finished with 0 ---- end ---- dlfilter C API: Ok [root@five ~]# Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf build: Move perf_dlfilters.h in the source treeAdrian Hunter2-151/+1
Move perf_dlfilters.h in the source tree so that it will be found when building dlfilters as part of the perf build. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-10perf pmu: Make pmu_add_sys_aliases() publicJohn Garry2-1/+2
Function pmu_add_sys_aliases() will be required for the PMU events test for system events aliases, so make it public. Signed-off-by: John Garry <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-10perf pmu: Check .is_uncore field in pmu_add_cpu_aliases_map()John Garry1-2/+1
Calling pmu_is_uncore() for fake PMUs does not work, as it checks sysfs for the PMU details (which won't exist). Check .is_uncore field instead, which makes sense anyway. Signed-off-by: John Garry <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf env: Track kernel 64-bit mode in environmentLeo Yan2-1/+26
It's useful to know that the kernel is running in 32-bit or 64-bit mode. E.g. We can decide if perf tool is running in compat mode based on the info. This patch adds an item "kernel_is_64_bit" into session's environment structure perf_env, its value is initialized based on the architecture string. Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Li Huafei <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: russell king <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORTLeo Yan1-5/+0
Since the __sync functions have been dropped, This patch removes unused build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()Leo Yan2-18/+5
Since the function auxtrace_mmap__read_snapshot_head() is exactly same with auxtrace_mmap__read_head(), whether the session is in snapshot mode or not, it's unified to use function auxtrace_mmap__read_head() for reading AUX buffer head. And the function auxtrace_mmap__read_snapshot_head() is unused so this patch removes it. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Drop legacy __sync functionsLeo Yan1-19/+0
The main purpose for using __sync built-in functions is to support compat mode for 32-bit perf with 64-bit kernel. But using these built-in functions might cause potential issues. __sync functions originally support Intel Itanium processoer [1] but it cannot promise to support all 32-bit archs. Now these functions have become the legacy functions. Considering __sync functions cannot really fix the 64-bit value atomicity on 32-bit archs, thus this patch drops __sync functions. Credits to Peter for detailed analysis. [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Use WRITE_ONCE() for updating aux_tailLeo Yan1-1/+1
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory behaviour. Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf cs-etm: Add warnings for missing DSOsJames Clark1-1/+10
Currently decode will silently fail if no binary data is available for the decode. This is made worse if only partial data is available because the decode will appear to work, but any trace from that missing DSO will silently not be generated. Add a UI popup once if there is any data missing, and then warn in the bottom left for each individual DSO that's missing. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-13/+42
Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") 0fac6aa098ed ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-03perf cs-etm: Improve Coresight zero timestamp warningJames Clark1-2/+5
Only show the warning if the user hasn't already set timeless mode and improve the text because there was ambiguity around the meaning of '...' Change the warning to a UI warning instead of printing straight to stderr because this corrupts the UI when perf report TUI is used. The UI warning function also handles printing to stderr when in perf script mode. Suggested-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-03perf tools: Add flag for tracking warnings of missing DSOsJames Clark1-0/+1
Auxtrace support may need DSOs for decoding (for example Arm Coresight). If one of these is missing it would make sense to warn once for each one that's missing, but not flood the output with every address as there could be thousands of lookups. This flag will allow tracking whether a warning was shown for each DSO. Signed-off-by: James Clark <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-03perf annotate: Add disassembly warnings for annotate --stdioJames Clark1-2/+18
Currently 'perf annotate --stdio' (and --stdio2) will exit without printing anything if there are disassembly errors. Apply the same error handler that's used for TUI and GTK modes. This makes comparing disassembly across the different modes more consistent. Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-03perf tools: Add WARN_ONCE equivalent for UI warningsJames Clark1-0/+14
Currently WARN_ONCE prints to stderr and corrupts the TUI. Add equivalent methods for UI warnings. Signed-off-by: James Clark <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf inject: Fix output from a file to a pipeNamhyung Kim2-1/+58
When the input is a regular file but the output is a pipe, it should write a pipe header. But just repiping would write a portion of the existing header which is different in 'size' value. So we need to prevent it and write a new pipe header along with other information like event attributes and features. This can handle something like this: # perf record -a -B sleep 1 # perf inject -b -i perf.data | perf report -i - Factor out perf_event__synthesize_for_pipe() to be shared between perf record and inject. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf tools: Pass a fd to perf_file_header__read_pipe()Namhyung Kim4-13/+13
Currently it unconditionally writes to stdout for repipe. But perf inject can direct its output to a regular file. Then it needs to write the header to the file as well. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf tools: Remove repipe argument from perf_session__new()Namhyung Kim4-6/+15
The repipe argument is only used by perf inject and the all others passes 'false'. Let's remove it from the function signature and add __perf_session__new() to be called from perf inject directly. This is a preparation of the change the pipe input/output. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf annotate: Add error log in symbol__annotate()Li Huafei1-1/+3
When users use 'perf annotate' on unsupported machines, error logs should be printed for user feedback. Signed-off-by: Li Huafei <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dengcheng Zhu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin Liška <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Zhang Jinhao <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf env: Normalize aarch64.* and arm64.* to arm64 in normalize_arch()Li Huafei1-1/+1
On my aarch64 big endian machine, the perf annotate does not work. # perf annotate Percent | Source code & Disassembly of [kernel.kallsyms] for cycles (253 samples, percent: local period) -------------------------------------------------------------------------------------------------------------- Percent | Source code & Disassembly of [kernel.kallsyms] for cycles (1 samples, percent: local period) ------------------------------------------------------------------------------------------------------------ Percent | Source code & Disassembly of [kernel.kallsyms] for cycles (47 samples, percent: local period) ------------------------------------------------------------------------------------------------------------- ... This is because the arch_find() function uses the normalized architecture name provided by normalize_arch(), and my machine's architecture name aarch64_be is not normalized to arm64. Like other architectures such as arm and powerpc, we can fuzzy match the architecture names associated with aarch64.* and normalize them. It seems that there is also arm64_be architecture name, which we also normalize to arm64. Signed-off-by: Li Huafei <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Dengcheng Zhu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin Liška <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Zhang Jinhao <[email protected]> Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Pass unformatted flag to decoderJames Clark2-15/+42
The TRBE (Trace Buffer Extension) feature allows a separate trace buffer for each trace source, therefore the trace wouldn't need to be formatted. The driver was introduced in commit 3fbf7f011f24 ("coresight: sink: Add TRBE driver"). The formatted/unformatted mode is encoded in one of the flags of the AUX record. The first AUX record encountered for each event is used to determine the mode, and this will persist for the remaining trace that is either decoded or dumped. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Use existing decoder instead of resetting itJames Clark1-30/+7
When dumping trace, the decoder is continually deleted and recreated to decode each buffer. To support both formatted and unformatted trace in a later commit, the decoder will be configured in advance. This commit removes the deletion of the decoder and allows the formatted/unformatted setting to persist. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Suppress printing when resetting decoderJames Clark1-3/+7
The decoder is quite noisy when being reset. In a future commit, dump-raw-trace will use a code path that resets the decoder rather than creating a new one, so printing has to be suppressed to not flood the output. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Only setup queues when they are modifiedJames Clark1-40/+14
Continually creating queues in cs_etm__process_event() is unnecessary. They only need to be created when a buffer for a new CPU or thread is encountered. This can be in two places, when building the queues in advance in cs_etm__process_auxtrace_info(), or in cs_etm__process_auxtrace_event() when data_queued is false and the index wasn't available (pipe mode). This change will allow the 'formatted' decoder setting to applied when iterating over aux records in a later commit. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Split setup and timestamp search functionsJames Clark1-12/+29
This refactoring has some benefits: * Decoding is done to find the timestamp. If we want to print errors when maps aren't available, then doing it from cs_etm__setup_queue() may cause warnings to be printed. * The cs_etm__setup_queue() flow is shared between timed and timeless modes, so it needs to be guarded by an if statement which can now be removed. * Allows moving the setup queues function earlier. * If data was piped in, then not all queues would be filled so it wouldn't have worked properly anyway. Now it waits for flush so data in all queues will be available. The motivation for this is to decouple setup functions with ones that involve decoding. That way we can move the setup function earlier when the formatted/unformatted trace information is available. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-02perf cs-etm: Refactor initialisation of kernel start addressJames Clark1-5/+1
The kernel start address is already cached in the machine struct once it is initialised, so storing it in the cs_etm struct is unnecessary. It also depends on kernel maps being available to be initialised. Therefore cs_etm__setup_queues() isn't an appropriate place to call it because it could be called before processing starts. It would be better to initialise it at the point when it is needed, then we can be sure that all the necessary maps are available. Also by calling machine__kernel_start() multiple times it can be initialised at some point, even if it failed to initialise previously due to missing maps. In a later commit cs_etm__setup_queues() will be moved which is the motivation for this change. Reviewed-by: Mathieu Poirier <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Al Grant <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-31Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski2-8/+15
Andrii Nakryiko says: ==================== bpf-next 2021-07-30 We've added 64 non-merge commits during the last 15 day(s) which contain a total of 83 files changed, 5027 insertions(+), 1808 deletions(-). The main changes are: 1) BTF-guided binary data dumping libbpf API, from Alan. 2) Internal factoring out of libbpf CO-RE relocation logic, from Alexei. 3) Ambient BPF run context and cgroup storage cleanup, from Andrii. 4) Few small API additions for libbpf 1.0 effort, from Evgeniy and Hengqi. 5) bpf_program__attach_kprobe_opts() fixes in libbpf, from Jiri. 6) bpf_{get,set}sockopt() support in BPF iterators, from Martin. 7) BPF map pinning improvements in libbpf, from Martynas. 8) Improved module BTF support in libbpf and bpftool, from Quentin. 9) Bpftool cleanups and documentation improvements, from Quentin. 10) Libbpf improvements for supporting CO-RE on old kernels, from Shuyi. 11) Increased maximum cgroup storage size, from Stanislav. 12) Small fixes and improvements to BPF tests and samples, from various folks. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits) tools: bpftool: Complete metrics list in "bpftool prog profile" doc tools: bpftool: Document and add bash completion for -L, -B options selftests/bpf: Update bpftool's consistency script for checking options tools: bpftool: Update and synchronise option list in doc and help msg tools: bpftool: Complete and synchronise attach or map types selftests/bpf: Check consistency between bpftool source, doc, completion tools: bpftool: Slightly ease bash completion updates unix_bpf: Fix a potential deadlock in unix_dgram_bpf_recvmsg() libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf tools: bpftool: Support dumping split BTF by id libbpf: Add split BTF support for btf__load_from_kernel_by_id() tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id() tools: Free BTF objects at various locations libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id() libbpf: Rename btf__load() as btf__load_into_kernel() libbpf: Return non-null error on failures in libbpf_find_prog_btf_id() bpf: Emit better log message if bpf_iter ctx arg btf_id == 0 tools/resolve_btfids: Emit warnings and patch zero id for missing symbols bpf: Increase supported cgroup storage value size libbpf: Fix race when pinning maps in parallel ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2021-07-30Revert "perf map: Fix dso->nsinfo refcounting"Arnaldo Carvalho de Melo1-2/+0
This makes 'perf top' abort in some cases, and the right fix will involve surgery that is too much to do at this stage, so revert for now and fix it in the next merge window. This reverts commit 2d6b74baa7147251c30a46c4996e8cc224aa2dc5. Cc: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Krister Johansen <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-29tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()Quentin Monnet2-5/+11
Replace the calls to function btf__get_from_id(), which we plan to deprecate before the library reaches v1.0, with calls to btf__load_from_kernel_by_id() in tools/ (bpftool, perf, selftests). Update the surrounding code accordingly (instead of passing a pointer to the btf struct, get it as a return value from the function). Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: John Fastabend <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-07-29tools: Free BTF objects at various locationsQuentin Monnet2-3/+4
Make sure to call btf__free() (and not simply free(), which does not free all pointers stored in the struct) on pointers to struct btf objects retrieved at various locations. These were found while updating the calls to btf__get_from_id(). Fixes: 999d82cbc044 ("tools/bpf: enhance test_btf file testing to test func info") Fixes: 254471e57a86 ("tools/bpf: bpftool: add support for func types") Fixes: 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs") Fixes: d56354dc4909 ("perf tools: Save bpf_prog_info and BTF of new BPF programs") Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command") Fixes: fa853c4b839e ("perf stat: Enable counting events for BPF programs") Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2021-07-27perf pmu: Fix alias matchingJohn Garry1-9/+24
Commit c47a5599eda324ba ("perf tools: Fix pattern matching for same substring in different PMU type"), may have fixed some alias matching, but has broken some others. Firstly it cannot handle the simple scenario of PMU name in form pmu_name{digits} - it can only handle pmu_name_{digits}. Secondly it cannot handle more complex matching in the case where we have multiple tokens. In this scenario, the code failed to realise that we may examine multiple substrings in the PMU name. Fix in two ways: - Change perf_pmu__valid_suffix() to accept a PMU name without '_' in the suffix - Only pay attention to perf_pmu__valid_suffix() for the final token Also add const qualifiers as necessary to avoid casting. Fixes: c47a5599eda324ba ("perf tools: Fix pattern matching for same substring in different PMU type") Signed-off-by: John Garry <[email protected]> Tested-by: Jin Yao <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-27perf cs-etm: Split --dump-raw-trace by AUX recordsJames Clark1-2/+18
Currently --dump-raw-trace skips queueing and splitting buffers because of an early exit condition in cs_etm__process_auxtrace_info(). Once that is removed we can print the split data by using the queues and searching for split buffers with the same reference as the one that is currently being processed. This keeps the same behaviour of dumping in file order when an AUXTRACE event appears, rather than moving trace dump to where AUX records are in the file. There will be a newline and size printout for each fragment. For example this buffer is comprised of two AUX records, but was printed as one: 0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0 offset: 0 ref: 0x491a4dfc52fc0e6e idx: 0 t . ... CoreSight ETM Trace data: size 160 bytes Idx:0; ID:10; I_ASYNC : Alignment Synchronisation. Idx:12; ID:10; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 } Idx:17; ID:10; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000; Idx:80; ID:10; I_ASYNC : Alignment Synchronisation. Idx:92; ID:10; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 } Idx:97; ID:10; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4; But is now printed as two fragments: 0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0 offset: 0 ref: 0x491a4dfc52fc0e6e idx: 0 t . ... CoreSight ETM Trace data: size 80 bytes Idx:0; ID:10; I_ASYNC : Alignment Synchronisation. Idx:12; ID:10; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 } Idx:17; ID:10; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000; . ... CoreSight ETM Trace data: size 80 bytes Idx:80; ID:10; I_ASYNC : Alignment Synchronisation. Idx:92; ID:10; I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 } Idx:97; ID:10; I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4; Decoding errors that appeared in problematic files are now not present, for example: Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC] ... PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822 Signed-off-by: James Clark <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Branislav Rankov <[email protected]> Cc: Denis Nikitin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-18perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernelYang Jihong6-42/+38
The "address" member of "struct probe_trace_point" uses long data type. If kernel is 64-bit and perf program is 32-bit, size of "address" variable is 32 bits. As a result, upper 32 bits of address read from kernel are truncated, an error occurs during address comparison in kprobe_warn_out_range(). Before: # perf probe -a schedule schedule is out of .text, skip it. Error: Failed to add events. Solution: Change data type of "address" variable to u64 and change corresponding address printing and value assignment. After: # perf.new.new probe -a schedule Added new event: probe:schedule (on schedule) You can now use it in all perf tools, such as: perf record -e probe:schedule -aR sleep 1 # perf probe -l probe:schedule (on schedule@kernel/sched/core.c) # perf record -e probe:schedule -aR sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ] # perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 1K of event 'probe:schedule' # Event count (approx.): 1366 # # Overhead Command Shared Object Symbol # ........ ............... ................. ............ # 6.22% migration/0 [kernel.kallsyms] [k] schedule 6.22% migration/1 [kernel.kallsyms] [k] schedule 6.22% migration/2 [kernel.kallsyms] [k] schedule 6.22% migration/3 [kernel.kallsyms] [k] schedule 6.15% migration/10 [kernel.kallsyms] [k] schedule 6.15% migration/11 [kernel.kallsyms] [k] schedule 6.15% migration/12 [kernel.kallsyms] [k] schedule 6.15% migration/13 [kernel.kallsyms] [k] schedule 6.15% migration/14 [kernel.kallsyms] [k] schedule 6.15% migration/15 [kernel.kallsyms] [k] schedule 6.15% migration/4 [kernel.kallsyms] [k] schedule 6.15% migration/5 [kernel.kallsyms] [k] schedule 6.15% migration/6 [kernel.kallsyms] [k] schedule 6.15% migration/7 [kernel.kallsyms] [k] schedule 6.15% migration/8 [kernel.kallsyms] [k] schedule 6.15% migration/9 [kernel.kallsyms] [k] schedule 0.22% rcu_sched [kernel.kallsyms] [k] schedule ... # # (Cannot load tips.txt file, please install perf!) # Signed-off-by: Yang Jihong <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jianlin Lv <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Huafei <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Srikar Dronamraju <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-18perf data: Close all files in close_dir()Riccardo Mancini1-1/+1
When using 'perf report' in directory mode, the first file is not closed on exit, causing a memory leak. The problem is caused by the iterating variable never reaching 0. Fixes: 145520631130bd64 ("perf data: Add perf_data__(create_dir|close_dir) functions") Signed-off-by: Riccardo Mancini <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zhen Lei <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-18perf probe-file: Delete namelist in del_events() on the error pathRiccardo Mancini1-2/+2
ASan reports some memory leaks when running: # perf test "42: BPF filter" This second leak is caused by a strlist not being dellocated on error inside probe_file__del_events. This patch adds a goto label before the deallocation and makes the error path jump to it. Signed-off-by: Riccardo Mancini <[email protected]> Fixes: e7895e422e4da63d ("perf probe: Split del_perf_probe_events()") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf lzma: Close lzma stream on exitRiccardo Mancini1-3/+5
ASan reports memory leaks when running: # perf test "88: Check open filename arg using perf trace + vfs_getname" One of these is caused by the lzma stream never being closed inside lzma_decompress_to_file(). This patch adds the missing lzma_end(). Signed-off-by: Riccardo Mancini <[email protected]> Fixes: 80a32e5b498a7547 ("perf tools: Add lzma decompression support for kernel module") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/aaf50bdce7afe996cfc06e1bbb36e4a2a9b9db93.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf session: Cleanup trace_eventRiccardo Mancini1-0/+1
ASan reports several memory leaks when running: # perf test "82: Use vfs_getname probe to get syscall args filenames" many of which are related to session->tevent. This patch will solve this problem, then next patch will fix the remaining memory leaks in 'perf script'. This bug is due to a missing deallocation of the trace_event data strutures. This patch adds the missing trace_event__cleanup() in perf_session__delete(). Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/fa2a3f221d90e47ce4e5b7e2d6e64c3509ddc96a.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf report: Free generated help strings for sort optionRiccardo Mancini2-2/+2
ASan reports the memory leak of the strings allocated by sort_help() when running perf report. This patch changes the returned pointer to char* (instead of const char*), saves it in a temporary variable, and finally deallocates it at function exit. Signed-off-by: Riccardo Mancini <[email protected]> Fixes: 702fb9b415e7c99b ("perf report: Show all sort keys in help output") Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/a38b13f02812a8a6759200b9063c6191337f44d4.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf env: Fix memory leak of cpu_pmu_capsRiccardo Mancini1-0/+1
ASan reports memory leaks while running: # perf test "83: Zstd perf.data compression/decompression" The first of the leaks is caused by env->cpu_pmu_caps not being freed. This patch adds the missing (z)free inside perf_env__exit. Signed-off-by: Riccardo Mancini <[email protected]> Fixes: 6f91ea283a1ed23e ("perf header: Support CPU PMU capabilities") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/6ba036a8220156ec1f3d6be3e5d25920f6145028.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf dso: Fix memory leak in dso__new_map()Riccardo Mancini1-1/+3
ASan reports a memory leak when running: # perf test "65: maps__merge_in". The causes of the leaks are two, this patch addresses only the first one, which is related to dso__new_map(). The bug is that dso__new_map() creates a new dso but never decreases the refcount it gets from creating it. This patch adds the missing dso__put(). Signed-off-by: Riccardo Mancini <[email protected]> Fixes: d3a7c489c7fd2463 ("perf tools: Reference count struct dso") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/60bfe0cd06e89e2ca33646eb8468d7f5de2ee597.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf env: Fix sibling_dies memory leakRiccardo Mancini1-0/+1
ASan reports a memory leak in perf_env while running: # perf test "41: Session topology" Caused by sibling_dies not being freed. This patch adds the required free. Fixes: acae8b36cded0ee6 ("perf header: Add die information in CPU topology") Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/2140d0b57656e4eb9021ca9772250c24c032924b.1626343282.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf probe: Fix dso->nsinfo refcountingRiccardo Mancini1-1/+3
ASan reports a memory leak of nsinfo during the execution of: # perf test "31: Lookup mmap thread". The leak is caused by a refcounted variable being replaced without dropping the refcount. This patch makes sure that the refcnt of nsinfo is decreased whenever a refcounted variable is replaced with a new value. Signed-off-by: Riccardo Mancini <[email protected]> Fixes: 544abd44c7064c8a ("perf probe: Allow placing uprobes in alternate namespaces.") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Krister Johansen <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-15perf map: Fix dso->nsinfo refcountingRiccardo Mancini1-0/+2
ASan reports a memory leak of nsinfo during the execution of # perf test "31: Lookup mmap thread" The leak is caused by a refcounted variable being replaced without dropping the refcount. This patch makes sure that the refcnt of nsinfo is decreased whenever a refcounted variable is replaced with a new value. Signed-off-by: Riccardo Mancini <[email protected]> Fixes: bf2e710b3cb8445c ("perf maps: Lookup maps in both intitial mountns and inner mountns.") Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Krister Johansen <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>