aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2023-04-04perf list: Use relative path for tracepoint scanNamhyung Kim1-7/+31
Committer notes: Added missing #include <unistd.h> for the close() prototype to fix this on Alma Linux 8: 1 21.54 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-16) (GCC) util/print-events.c: In function 'print_tracepoint_events': util/print-events.c:103:4: error: implicit declaration of function 'close'; did you mean 'clone'? [-Werror=implicit-function-declaration] close(evt_fd); ^~~~~ clone Also use the newly added scandirat feature test to check if that function is available, providing a HAVE_SCANDIRAT_SUPPORT conditional warning to the user if it isn't available. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf intel-pt: Fix CYC timestamps after standalone CBRAdrian Hunter1-0/+2
After a standalone CBR (not associated with TSC), update the cycles reference timestamp and reset the cycle count, so that CYC timestamps are calculated relative to that point with the new frequency. Fixes: cc33618619cefc6d ("perf tools: Add Intel PT support for decoding CYC packets") Signed-off-by: Adrian Hunter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf auxtrace: Fix address filter entire kernel sizeAdrian Hunter1-1/+4
kallsyms is not completely in address order. In find_entire_kern_cb(), calculate the kernel end from the maximum address not the last symbol. Example: Before: $ sudo cat /proc/kallsyms | grep ' [twTw] ' | tail -1 ffffffffc00b8bd0 t bpf_prog_6deef7357e7b4530 [bpf] $ sudo cat /proc/kallsyms | grep ' [twTw] ' | sort | tail -1 ffffffffc15e0cc0 t iwl_mvm_exit [iwlmvm] $ perf.d093603a05aa record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2ceba000 After: $ perf.8fb0f7a01f8e record -v --kcore -e intel_pt// --filter 'filter *' -- uname |& grep filter Address filter: filter 0xffffffff93200000/0x2e3e2000 Fixes: 1b36c03e356936d6 ("perf record: Add support for using symbols in address filters") Signed-off-by: Adrian Hunter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/storeRob Herring2-0/+12
Arm SPEv1.3 adds new load/store operation subclasses for Memory Tagging Extension (MTE) and memory operations (MOPS). The memory operations are memcpy and memset. Add support for decoding these new subclasses in the raw decoding. Reviewed-by: Leo Yan <[email protected] Signed-off-by: Rob Herring <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf cs-etm: Handle PERF_RECORD_AUX_OUTPUT_HW_ID packetMike Leach2-18/+235
When using dynamically assigned CoreSight trace IDs the drivers can output the ID / CPU association as a PERF_RECORD_AUX_OUTPUT_HW_ID packet. Update cs-etm decoder to handle this packet by setting the CPU/Trace ID mapping. Reviewed-by: James Clark <[email protected]> Signed-off-by: Mike Leach <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Darren Hart <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf cs-etm: Move mapping of Trace ID and cpu into helper functionMike Leach3-35/+74
The information to associate Trace ID and CPU will be changing. Drivers will start outputting this as a hardware ID packet in the data file which if present will be used in preference to the AUXINFO values. To prepare for this we provide a helper functions to do the individual ID mapping, and one to extract the IDs from the completed metadata blocks. Reviewed-by: James Clark <[email protected]> Signed-off-by: Mike Leach <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Darren Hart <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf lock contention: Show detail failure reason for BPFNamhyung Kim3-5/+20
It can fail to collect lock stat from BPF for various reasons. For example, I've got a report that sometimes time calculation seems wrong in case of contended spinlocks. I suspect the time delta went negative for some reason. Count them separately and show in the output like below: $ sudo perf lock contention -abE5 sleep 10 contended total wait max wait avg wait type caller 13 785.61 us 79.36 us 60.43 us spinlock remove_wait_queue+0x14 10 469.02 us 87.51 us 46.90 us spinlock prepare_to_wait+0x27 9 289.09 us 69.08 us 32.12 us spinlock finish_wait+0x36 114 251.05 us 8.56 us 2.20 us spinlock try_to_wake_up+0x1f5 132 188.63 us 5.01 us 1.43 us spinlock __wake_up_common_lock+0x62 === output for debug === bad: 1, total: 279 bad rate: 0.36 % histogram of failure reasons task: 1 stack: 0 time: 0 Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf symbol: Remove unused branch_callstackAdrian Hunter1-1/+0
branch_callstack was added by commit 8b7bad58efb7 ("perf callchain: Support handling complete branch stacks as histograms") but never used. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf block-range: Move debug code behind ifndef NDEBUGIan Rogers1-5/+1
Make good on a comment and avoid a unused-but-set-variable warning. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sean Christopherson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf symbol: Add command line support for addr2line pathIan Rogers3-10/+23
Allow addr2line to be set either on the command line or via the perfconfig file. This doesn't currently work with llvm-addr2line as the addr2line code emits two things: 1) the address to decode, 2) a bogus ',' value. The expectation is the bogus value will generate: ?? ??:0 that terminates the addr2line reading. However, the output from llvm-addr2line is a single line with just the input ',' locking up the addr2line reading that is expecting a second line. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andres Freund <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf annotate: Allow objdump to be set in perfconfigIan Rogers1-0/+6
Allow the setting of the objdump command in the perfconfig. Update man page for this new option. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andres Freund <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf annotate: Own objdump_path and disassembler_style stringsIan Rogers2-4/+10
Make struct annotation_options own the strings objdump_path and disassembler_style, freeing them on exit. Add missing strdup for disassembler_style when read from a config file. Committer notes: Converted free(obj->member) to zfree(&obj->member) in annotation_options__exit() Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andres Freund <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf annotate: Add init/exit to annotation_options remove defaultIan Rogers2-10/+20
The annotation__default_options global variable was used to initialize annotation_options. Switch to the init/exit pattern as later changes will give ownership over strings and this will be necessary to avoid memory leaks. Committer note: Fix the GTK2=1 build, hist_entry__gtk_annotate() needs to receive a 'struct annotation_options' pointer. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andres Freund <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf tools: Avoid warning in do_realloc_array_as_needed()Adrian Hunter1-1/+2
do_realloc_array_as_needed() used memcpy() of zero size with a NULL pointer. Check the size first to avoid sanitize warning. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Reported-by: kernel test robot <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/oe-lkp/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf symbols: Fix unaligned access in get_x86_64_plt_disp()Adrian Hunter1-1/+4
Use memcpy() to avoid unaligned access. Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/oe-lkp/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf symbols: Fix use-after-free in get_plt_got_name()Adrian Hunter1-1/+4
Fix use-after-free in get_plt_got_name(). Discovered using EXTRA_CFLAGS="-fsanitize=undefined -fsanitize=address". Fixes: ce4c8e7966f317ef ("perf symbols: Get symbols for .plt.got for x86-64") Reported-by: kernel test robot <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/oe-lkp/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf metrics: Add has_pmem literalIan Rogers1-0/+19
Add literal so that if nvdimms aren't installed we can record fewer events. The file detection mechanism was suggested by Dan Williams <[email protected]> in: https://lore.kernel.org/linux-perf-users/[email protected]/ Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Dan Williams <[email protected]> Cc: Edward Baker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf lock contention: Fix msan issue in lock_contention_read()Namhyung Kim1-1/+1
I got a report of a msan failure like below: $ sudo perf lock con -ab -- sleep 1 ... ==224416==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5651160d6c96 in lock_contention_read util/bpf_lock_contention.c:290:8 #1 0x565115f90870 in __cmd_contention builtin-lock.c:1919:3 #2 0x565115f90870 in cmd_lock builtin-lock.c:2385:8 #3 0x565115f03a83 in run_builtin perf.c:330:11 #4 0x565115f03756 in handle_internal_command perf.c:384:8 #5 0x565115f02d53 in run_argv perf.c:428:2 #6 0x565115f02d53 in main perf.c:562:3 #7 0x7f43553bc632 in __libc_start_main #8 0x565115e865a9 in _start It was because the 'key' variable is not initialized. Actually it'd be set by bpf_map_get_next_key() but msan didn't seem to understand it. Let's make msan happy by initializing the variable. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf hist: Improve srcfile sort key performance (really)Namhyung Kim1-6/+1
The earlier commit f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") updated the srcfile logic but missed to change the ->cmp() callback which is called for every sample. It should use the same logic like in the srcline to speed up the processing because it'd return the same information repeatedly for the same address. The real processing will be done in sort__srcfile_collapse(). Fixes: f0cdde28fecc0d7f ("perf hist: Improve srcfile sort key performance") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf report: Append inlines to non-DWARF callchainsArtem Savkov1-0/+5
Append information about inlined functions to FP and LBR callchains from DWARF debuginfo when available. Do so by calling append_inlines() from add_callchain_ip(). Testing it: Frame-pointer mode recorded with 'perf record --call-graph=fp --freq=max -- ./a.out' #include <stdio.h> #include <stdint.h> static __attribute__((noinline)) uint32_t func5(uint32_t i) { return i + 10; } static uint32_t func4(uint32_t i) { return func5(i + 5); } static inline uint32_t func3(uint32_t i) { return func4(i + 4); } static __attribute__((noinline)) uint32_t func2(uint32_t i) { return func3(i + 3); } static uint32_t func1(uint32_t i) { return func2(i + 2); } __attribute__((noinline)) uint64_t entry(void) { uint64_t ret = 0; uint32_t i = 0; for (i = 0; i < 1000000; i++) { ret += func1(i); ret -= func2(i); ret += func3(i); ret += func4(i); ret -= func5(i); } return ret; } int main(int argc, char **argv) { printf("%s\n", __func__); return entry(); } ====== Here is the output I get with '--call-graph callee --no-children' ====== # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 250 of event 'cycles:u' # Event count (approx.): 26819859 # # Overhead Command Shared Object Symbol # ........ ....... .................... ..................................... # 43.58% a.out a.out [.] func5 | |--28.93%--entry | main | __libc_start_call_main | --14.65%--func4 (inlined) | |--10.45%--entry | main | __libc_start_call_main | --4.20%--func3 (inlined) entry main __libc_start_call_main 38.80% a.out a.out [.] entry | |--23.27%--func4 (inlined) | | | |--20.28%--func3 (inlined) | | func2 | | main | | __libc_start_call_main | | | --2.99%--entry | main | __libc_start_call_main | |--8.17%--func5 | main | __libc_start_call_main | |--3.89%--func1 (inlined) | entry | main | __libc_start_call_main | --3.48%--entry main __libc_start_call_main 13.07% a.out a.out [.] func2 | ---func5 main __libc_start_call_main 1.54% a.out [unknown] [k] 0xffffffff81e011b7 1.16% a.out [unknown] [k] 0xffffffff81e00193 | --0.57%--__mmap64 (inlined) __mmap64 (inlined) 0.34% a.out ld-linux-x86-64.so.2 [.] __tunable_get_val 0.34% a.out ld-linux-x86-64.so.2 [.] strcmp 0.32% a.out libc.so.6 [.] strchr 0.31% a.out ld-linux-x86-64.so.2 [.] _dl_relocate_object 0.22% a.out ld-linux-x86-64.so.2 [.] _dl_init_paths 0.18% a.out ld-linux-x86-64.so.2 [.] get_common_cache_info.constprop.0 0.14% a.out ld-linux-x86-64.so.2 [.] __GI___tunables_init # # (Tip: Show individual samples with: perf script) # ====== It does not seem to be out of order, or at least it is consistent with what I get with dwarf unwinders. Committer notes: Adrian Hunter pointed out that this breaks --branch-history, so don't do it for branches, see the second Link below. Suggested-by: Andrii Nakryiko <[email protected]> Signed-off-by: <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-21perf tools: Add support for perf_event_attr::config3Rob Herring5-0/+12
perf_event_attr has gained a new field, config3, so add support for it extending the existing configN support. Signed-off-by: Rob Herring <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-21perf kvm: Reference count 'struct kvm_info'Leo Yan2-0/+45
hists__add_entry_ops() doesn't allocate a new histogram entry if it has an existing entry for a KVM event, in this case, find_create_kvm_event() allocates a 'struct kvm_info' but it's not used by any histograms and never freed. To fix the memory leak, this patch first introduces a refcnt and a set of functions for refcnt operations on 'struct kvm_info'. When the data structure is not anymore used (the refcnt hits zero) kvm_info__zput() will free the memory used. Committer: Provide a nop version of kvm_info__zput() to be used when HAVE_KVM_STAT_SUPPORT isn't defined as it is used unconditionally in hists__findnew_entry() and hist_entry__delete(). Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf report: Add 'simd' sort fieldGerman Gomez4-0/+51
Add 'simd' sort field to visualize SIMD ops in 'perf report'. Rows are labeled with the SIMD ISA, and the type of predicate (if any): - [p] partial predicate - [e] empty predicate (no elements in the vector being used) Example with Arm SPE and SVE (Scalable Vector Extension): #include <arm_sve.h> double src[1025], dst[1025]; int main(void) { svfloat64_t vc = svdup_f64(1); for(;;) for(int i = 0; i < 1025; i += svcntd()) { svbool_t pg = svwhilelt_b64(i, 1025); svfloat64_t vsrc = svld1(pg, &src[i]); svfloat64_t vdst = svadd_x(pg, vsrc, vc); svst1(pg, &dst[i], vdst); } return 0; } ... compiled using "gcc-11 -march=armv8-a+sve -O3" Profiling on a platform that implements FEAT_SVE and FEAT_SPEv1p1: $ perf record -e arm_spe_0// -- ./a.out $ perf report --itrace=i1i -s overhead,pid,simd,sym Overhead Pid:Command Simd Symbol ........ ................ ....... ...................... 53.76% 10758:program [.] main 46.14% 10758:program [.] SVE [.] main 0.09% 10758:program [p] SVE [.] main The report shows 0.09% of the sampled SVE operations use partial predicates due to src and dst arrays not being multiples of the vector register lengths. Signed-off-by: German Gomez <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf arm-spe: Add SVE flags to the SPE samplesGerman Gomez1-0/+20
Add flags from the Scalable Vector Extension (SVE) to the SPE samples which are available from Armv8.3 (FEAT_SPEv1p1). These will be displayed in a new SIMD sort field in a later commit. Signed-off-by: German Gomez <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: [email protected] Cc: John Garry <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf arm-spe: Refactor arm-spe to support operation packet typeGerman Gomez3-18/+67
Extend the decoder of Arm SPE records to support more fields from the operation packet type. Not all fields are being decoded by this commit. Only those needed to support the use-case SVE load/store/other operations. Suggested-by: Leo Yan <[email protected]> Signed-off-by: German Gomez <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf event: Add 'simd_flags' field to 'struct perf_sample'German Gomez1-0/+13
Add new field to 'struct perf_sample' to store flags related to SIMD ops. It will be used to store SIMD information from SVE and NEON when profiling using ARM SPE. Signed-off-by: German Gomez <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: [email protected] Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf intel-pt: Add support for new branch instructions ERETS and ERETUAdrian Hunter2-0/+20
Intel Flexible Return and Event Delivery (FRED) adds instructions ERETS (return to supervisor) and ERETU (return to user). Intel PT instruction decoder needs to know about these instructions because they are branch instructions. Similar to IRET instructions, when the decoder encounters one of these instructions it will match it to a TIP (target instruction pointer) packet that informs what the branch destination is. The existing "x86 instruction decoder - new instructions" test can be used to test the result e.g. $ perf test -v ins |& grep eret Decoded ok: f2 0f 01 ca erets Decoded ok: f3 0f 01 ca eretu Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf symbol: Sort names under write lockIan Rogers1-0/+7
If finding a name doesn't find the sorted names then they are allocated and sorted. This shouldn't be done under a read lock as another reader may access it. Release the read lock and acquire the write lock, then release the write lock and reacquire the read lock. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: André Almeida <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf bpf_counter: Use public cpumap accessorsIan Rogers1-3/+3
Avoid the use of internal apis via the cpumap accessor functions. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: André Almeida <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-20perf symbol: Avoid memory leak from abi::__cxa_demangleIan Rogers1-3/+2
Rather than allocate memory, allow abi::__cxa_demangle to do that. This avoids a problem where on error NULL was returned triggering a memory leak. Fixes: 3b4e4efe88f615f1 ("perf symbol: Add abi::__cxa_demangle C++ demangling support") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: André Almeida <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Add TUI mode for stat reportLeo Yan1-0/+1
Since we have supported histograms list and prepared the dimensions in the tool, this patch adds TUI mode for stat report. It also adds UI progress for sorting for better user experience. Committer notes: kvm_display() is only used by functions enclosed in: #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do it with this new function as well. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Polish sorting keyLeo Yan1-8/+0
Since histograms supports sorting, the tool doesn't need to maintain the mapping between the sorting keys and the corresponding comparison callbacks, therefore, this patch removes structure kvm_event_key. But we still need to validate the sorting key, this patch uses an array for sorting keys and renames function select_key() to is_valid_key() to validate the sorting key passed by user. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Use histograms list to replace cached listLeo Yan1-7/+0
perf kvm tool defines its own cached list which is managed with RB tree, histograms also provide RB tree to manage data entries. Since now we have introduced histograms in the tool, it's not necessary to use the self defined list and we can directly use histograms list to manage KVM events. This patch changes to use histograms list to track KVM events, and it invokes the common function hists__output_resort_cb() to sort result, this also give us flexibility to extend more sorting key words easily. After histograms list supported, the cached list is redundant so remove the relevant code for it. Committer notes: kvm_hists__reinit() is only used by functions enclosed in: #if defined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do it with this new function as well. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Add dimensions for KVM event statisticsLeo Yan1-0/+2
To support KVM event statistics, this patch firstly registers histograms columns and sorting fields; every column or field has its own format structure, the format structure is dereferenced to access the dimension, finally the dimension provides the comparison callback for sorting result. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf hist: Add 'kvm_info' field in histograms entryLeo Yan4-7/+20
__hists__add_entry() creates a temporary entry and compare it with existed histograms entries, if any existed entry equals to the temporary entry it skips to allocation to avoid duplication. The problem for support KVM event in histograms is it doesn't contain any info to identify KVM event and can be used for comparison entries. This patch adds 'kvm_info' field in the histograms entry which contains the KVM event's key, this identifier will be used for comparison histograms entries in later change. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Parse address location for samplesLeo Yan1-0/+4
Parse address location for samples and save it into the structure 'perf_kvm_stat', it is to be used by histograms entry. Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Introduce histograms data structuresLeo Yan1-0/+1
This is a preparation to support histograms in perf kvm tool. As first step, this patch defines histograms data structures and initialize them. Committer notes: Those are only used by functions enclosed in: #if efined(HAVE_KVM_STAT_SUPPORT) && defined(HAVE_LIBTRACEEVENT) So do this for these new functions and struct as well. Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Use macro to replace variable 'decode_str_len'Leo Yan1-1/+2
The variable 'decode_str_len' defines the string length for KVM event name and every arch defines its own values. This introduces complexity that the variable definition are spreading in multiple source files under arch folder. This patch refactors code to use a macro KVM_EVENT_NAME_LEN to define event name length and thus remove the definitions in arch files. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Use subtraction for comparison metricsLeo Yan1-1/+1
Currently the metrics comparison uses greater operator (>), it returns the boolean value (0 or 1). This patch changes to use subtraction as comparison result, which can be used by histograms sorting. Since the subtraction result is u64 type, we change key_cmp_fun's return type to int64_t to avoid overflow. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf kvm: Add pointer to 'perf_kvm_stat' in kvm eventLeo Yan1-2/+3
Sometimes, handling kvm events needs to base on global variables, e.g. when read event counts we need to know the target vcpu ID; the global variables are stored in structure perf_kvm_stat. This patch adds add a 'perf_kvm_stat' pointer in kvm event structure, it is to be used by later refactoring. Reviewed-by: James Clark <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Show warning for missing sample flagsNamhyung Kim1-0/+62
For a BPF filter to work properly, users need to provide appropriate options to enable the sample types. Otherwise the BPF program would see an invalid value (i.e. always 0) and filter won't work well. Show a warning message if sample types are missing like below. $ sudo ./perf record -e cycles --filter 'addr < 100' true Error: cycles event does not have PERF_SAMPLE_ADDR Hint: please add -d option to perf record. failed to set filter "BPF" on event cycles with 22 (Invalid argument) Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Add logical OR operatorNamhyung Kim6-17/+79
It supports two or more expressions connected as a group and the group result is considered true when one of them returns true. The new group operators (GROUP_BEGIN and GROUP_END) are added to setup and check the condition. As it doesn't allow nested groups, the condition is saved in local variables. For example, the following is to get samples only if the data source memory level is L2 cache or the weight value is greater than 30. $ sudo ./perf record -adW -e cpu/mem-loads/pp \ > --filter 'mem_lvl == l2 || weight > 30' -- sleep 1 $ sudo ./perf script -F data_src,weight 10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 47 11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 57 10668100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 56 10650100842 |OP LOAD|LVL L3 or L3 hit|SNP None|TLB L2 miss|LCK No|BLK N/A 144 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 16 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 20 11868100242 |OP LOAD|LVL LFB/MAB or LFB/MAB hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 189 1026a100142 |OP LOAD|LVL L1 or L1 hit|SNP None|TLB L1 or L2 hit|LCK Yes|BLK N/A 193 10468100442 |OP LOAD|LVL L2 or L2 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 18 ... Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Add data_src sample data supportNamhyung Kim2-0/+84
The data_src has many entries to express memory behaviors. Add each term separately so that users can combine them for their purpose. I didn't add prefix for the constants for simplicity as they are mostly distinguishable but I had to use l1_miss and l2_hit for mem_dtlb since mem_lvl has different values for the same names. Note that I decided mem_lvl to be used as an alias of mem_lvlnum as it's deprecated now. According to the comment in the UAPI header, users should use the mix of mem_lvlnum, mem_remote and mem_snoop. Also the SNOOPX bits are concatenated to mem_snoop for simplicity. The following terms are used for data_src and the corresponding perf sample data fields: * mem_op : { load, store, pfetch, exec } * mem_lvl: { l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem } * mem_snoop: { none, hit, miss, hitm, fwd, peer } * mem_remote: { remote } * mem_lock: { locked } * mem_dtlb { l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault } * mem_blk { by_data, by_addr } * mem_hops { hops0, hops1, hops2, hops3 } We can now use a filter expression like below: 'mem_op == load, mem_lvl <= l2, mem_dtlb == l1_hit' 'mem_dtlb == l2_miss, mem_hops > hops1' 'mem_lvl == ram, mem_remote == 1' Note that 'na' is shared among the terms as it has the same value except for mem_lvl. I don't have a good idea to handle that for now. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Add more weight sample data supportNamhyung Kim2-0/+14
The weight data consists of a couple of fields with the PERF_SAMPLE_WEIGHT_STRUCT. Add weight{1,2,3} term to select them separately. Also add their aliases like 'ins_lat', 'p_stage_cyc' and 'retire_lat'. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Add 'pid' sample data supportNamhyung Kim6-7/+26
The pid is special because it's saved in the PERF_SAMPLE_TID together. So it needs to differenciate tid and pid using the 'part' field in the perf bpf filter entry struct. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf record: Record dropped sample countNamhyung Kim3-1/+14
When it uses bpf filters, event might drop some samples. It'd be nice if it can report how many samples it lost. As LOST_SAMPLES event can carry the similar information, let's use it for bpf filters. To indicate it's from BPF filters, add a new misc flag for that and do not display cpu load warnings. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf record: Add BPF event filter supportNamhyung Kim5-14/+38
Use --filter option to set BPF filter for generic events other than the tracepoints or Intel PT. The BPF program will check the sample data and filter according to the expression. For example, the below is the typical perf record for frequency mode. The sample period started from 1 and increased gradually. $ sudo ./perf record -e cycles true $ sudo ./perf script perf-exec 2272336 546683.916875: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916892: 1 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916899: 3 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916905: 17 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916911: 100 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916917: 589 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916924: 3470 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) perf-exec 2272336 546683.916930: 20465 cycles: ffffffff828499b8 perf_event_exec+0x298 ([kernel.kallsyms]) true 2272336 546683.916940: 119873 cycles: ffffffff8283afdd perf_iterate_ctx+0x2d ([kernel.kallsyms]) true 2272336 546683.917003: 461349 cycles: ffffffff82892517 vma_interval_tree_insert+0x37 ([kernel.kallsyms]) true 2272336 546683.917237: 635778 cycles: ffffffff82a11400 security_mmap_file+0x20 ([kernel.kallsyms]) When you add a BPF filter to get samples having periods greater than 1000, the output would look like below: $ sudo ./perf record -e cycles --filter 'period > 1000' true $ sudo ./perf script perf-exec 2273949 546850.708501: 5029 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708508: 32409 cycles: ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms]) perf-exec 2273949 546850.708526: 143369 cycles: ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms]) perf-exec 2273949 546850.708600: 372650 cycles: ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms]) perf-exec 2273949 546850.708791: 482953 cycles: ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms]) true 2273949 546850.709036: 501985 cycles: ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms]) true 2273949 546850.709292: 503065 cycles: 7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Committer notes: Add stubs for perf_bpf_filter__prepare() and perf_bpf_filter__destroy() to tools/perf/util/python.c to keep it building. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Implement event sample filteringNamhyung Kim5-12/+235
The BPF program will be attached to a perf_event and be triggered when it overflows. It'd iterate the filters map and compare the sample value according to the expression. If any of them fails, the sample would be dropped. Also it needs to have the corresponding sample data for the expression so it compares data->sample_flags with the given value. To access the sample data, it uses the bpf_cast_to_kern_ctx() kfunc which was added in v6.2 kernel. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf bpf filter: Introduce basic BPF filter expressionNamhyung Kim5-0/+225
This implements a tiny parser for the filter expressions used for BPF. Each expression will be converted to struct perf_bpf_filter_expr and be passed to a BPF map. For now, I'd like to start with the very basic comparisons like EQ or GT. The LHS should be a term for sample data and the RHS is a number. The expressions are connected by a comma. For example, period > 10000 ip < 0x1000000000000, cpu == 3 Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-15perf top: Fix rare segfault in thread__comm_len()liuwenyu1-6/+19
In thread__comm_len(),strlen() is called outside of the thread->comm_lock critical section,which may cause a UAF problems if comm__free() is called by the process_thread concurrently. backtrace of the core file is as follows: (gdb) bt #0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77 #1 0x000055ad15d31de5 in thread__comm_len (thread=0x7f627d20e300) at util/thread.c:320 #2 0x000055ad15d4fade in hists__calc_col_len (h=0x7f627d295940, hists=0x55ad1772bfe0) at util/hist.c:103 #3 hists__calc_col_len (hists=0x55ad1772bfe0, h=0x7f627d295940) at util/hist.c:79 #4 0x000055ad15d52c8c in output_resort (hists=hists@entry=0x55ad1772bfe0, prog=0x0, use_callchain=false, cb=cb@entry=0x0, cb_arg=0x0) at util/hist.c:1926 #5 0x000055ad15d530a4 in evsel__output_resort_cb (evsel=evsel@entry=0x55ad1772bde0, prog=prog@entry=0x0, cb=cb@entry=0x0, cb_arg=cb_arg@entry=0x0) at util/hist.c:1945 #6 0x000055ad15d53110 in evsel__output_resort (evsel=evsel@entry=0x55ad1772bde0, prog=prog@entry=0x0) at util/hist.c:1950 #7 0x000055ad15c6ae9a in perf_top__resort_hists (t=t@entry=0x7ffcd9cbf4f0) at builtin-top.c:311 #8 0x000055ad15c6cc6d in perf_top__print_sym_table (top=0x7ffcd9cbf4f0) at builtin-top.c:346 #9 display_thread (arg=0x7ffcd9cbf4f0) at builtin-top.c:700 #10 0x00007f6282fab4fa in start_thread (arg=<optimized out>) at pthread_create.c:443 #11 0x00007f628302e200 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81 The reason is that strlen() get a pointer to a memory that has been freed. The string pointer is stored in the structure comm_str, which corresponds to a rb_tree node,when the node is erased, the memory of the string is also freed. In thread__comm_len(),it gets the pointer within the thread->comm_lock critical section, but passed to strlen() outside of the thread->comm_lock critical section, and the perf process_thread may called comm__free() concurrently, cause this segfault problem. The process is as follows: display_thread process_thread -------------- -------------- thread__comm_len -> thread__comm_str # held the comm read lock -> __thread__comm_str(thread) # release the comm read lock thread__delete # held the comm write lock -> comm__free -> comm_str__put(comm->comm_str) -> zfree(&cs->str) # release the comm write lock # The memory of the string pointed to by comm has been free. -> thread->comm_len = strlen(comm); This patch expand the critical section range of thread->comm_lock in thread__comm_len(), to make strlen() called safe. Signed-off-by: Wenyu Liu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Christian Brauner <[email protected]> Cc: Feilong Lin <[email protected]> Cc: Hewenliang <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Yunfeng Ye <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>