path: root/tools/perf/util
Age  Commit message  Author  Files  Lines
2023-08-22  perf lzma: Convert some pr_err() to pr_debug() as callers already use pr_debug()  Arnaldo Carvalho de Melo  1  -7/+5
I noticed some errors with:

  # perf list ex_ret_brn
  lzma: fopen failed on /usr/lib/modules/5.15.14-100.fc34.x86_64/kernel/net/bluetooth/bnep/bnep.ko.xz: 'No such file or directory'
  lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/gpu/drm/drm_kms_helper.ko.xz: 'No such file or directory'
  lzma: fopen failed on /usr/lib/modules/5.18.16-200.fc36.x86_64/kernel/arch/x86/crypto/crct10dif-pclmul.ko.xz: 'No such file or directory'
  lzma: fopen failed on /usr/lib/modules/5.16.16-200.fc35.x86_64/kernel/drivers/i2c/busses/i2c-piix4.ko.xz: 'No such file or directory'
  <BIG SNIP>

Then, using 'perf probe' + 'perf trace' to debug 'perf list', it seems there is some inconsistency in the ~/.debug/ cache, where broken build id symlinks end up making it try to uncompress some kernel modules using the lzma routines:

  395.309 perf/3594447 probe_perf:lzma_decompress_to_file(__probe_ip: 6118448, input_string: "/usr/lib/modules/5.18.17-200.fc36.x86_64/kernel/drivers/nvme/host/nvme.ko.xz")
      lzma_decompress_to_file (/var/home/acme/bin/perf)
      filename__decompress (/var/home/acme/bin/perf)
      filename__read_build_id (/var/home/acme/bin/perf)
      filename__sprintf_build_id (inlined)
      build_id_cache__valid_id (inlined)
      build_id_cache__list_all (/var/home/acme/bin/perf)
      print_sdt_events (/var/home/acme/bin/perf)
      cmd_list (/var/home/acme/bin/perf)
      run_builtin (/var/home/acme/bin/perf)
      handle_internal_command (inlined)
      run_argv (inlined)
      main (/var/home/acme/bin/perf)
      __libc_start_call_main (/usr/lib64/libc.so.6)
      __libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
      _start (/var/home/acme/bin/perf)

But callers of filename__decompress() already check its return and use pr_debug(), so be consistent and make the functions it calls also use pr_debug().
Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21  perf stat-display: Check if snprintf()'s fmt argument is NULL  Kaige Ye  1  -2/+2
It is undefined behavior to pass NULL as snprintf()'s fmt argument. Here is an example to trigger the problem:

  $ perf stat --metric-only -x, -e instructions -- sleep 1
  insn per cycle,
  Segmentation fault (core dumped)

With this patch:

  $ perf stat --metric-only -x, -e instructions -- sleep 1
  insn per cycle,
  ,

Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Kaige Ye <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
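The kind of guard described in the stat-display commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual perf code: the helper name `format_metric()` and the choice to emit an empty field when the format is NULL are hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not taken from perf): format a metric value into
 * buf, tolerating a NULL format string instead of handing it to
 * snprintf(), which would be undefined behavior.
 */
static int format_metric(char *buf, size_t size, const char *fmt, double val)
{
	if (fmt == NULL) {
		/* Nothing to print: produce an empty field. */
		if (size > 0)
			buf[0] = '\0';
		return 0;
	}
	return snprintf(buf, size, fmt, val);
}
```

With a NULL `fmt` the caller gets an empty string back, which would correspond to the bare `,` field in the CSV output quoted in the commit message.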
2023-08-21  perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(augmented_arg->value) is a power of two.  Arnaldo Carvalho de Melo  1  -0/+1
Similar to what was done in the previous cset for sizeof(saddr), we need to make sure sizeof(augmented_arg->value) is a power of two to do bounds checking using &=:

  augmented_len &= sizeof(augmented_arg->value) - 1;

Suggested-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-21  perf bpf augmented_raw_syscalls: Add an assert to make sure sizeof(saddr) is a power of two.  Arnaldo Carvalho de Melo  1  -0/+11
We're using the BPF verifier suggestion:

  22: (85) call bpf_probe_read#4
  R2 min value is negative, either use unsigned or 'var &= const'

That works only when const is a (power of two - 1) so add an assert to make sure that that is the case.

Suggested-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
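The two assert commits above rely on the same idea: masking with `size - 1` only bounds a value correctly when `size` is a power of two. A minimal user-space sketch of that idea follows. The struct, macro, and function names are hypothetical, and the 128-byte size is taken from the `sizeof(saddr) == 128` remark in this log, not from the BPF source.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-in; only the 128-byte size comes from the log. */
struct example_args {
	char saddr[128];
};

/* True only for non-zero powers of two. */
#define IS_POWER_OF_TWO(n) ((n) != 0 && (((n) & ((n) - 1)) == 0))

/* Compile-time check, so that the mask used below is a valid bound. */
static_assert(IS_POWER_OF_TWO(sizeof(((struct example_args *)0)->saddr)),
	      "sizeof(saddr) must be a power of two");

/* Bound an untrusted length with a mask instead of a compare-and-branch. */
static unsigned int cap_len(unsigned int len)
{
	len &= sizeof(((struct example_args *)0)->saddr) - 1;
	return len;
}
```

Note that the mask wraps rather than clamps: in this sketch a length of 128 becomes 0 and 130 becomes 2. That is enough to keep the value provably inside the buffer, which is what a verifier-style bounds check needs.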
2023-08-17  perf jevents: Add a new expression builtin strcmp_cpuid_str()  James Clark  6  -1/+45
This will allow writing formulas that are conditional on a specific CPU type or CPU version. It calls through to the existing strcmp_cpuid_str() function in Perf which has a default weak version, and an arch specific version for x86 and arm64. The function takes an 'ID' type value, which is a string. But in this case Arm CPU IDs are hex numbers prefixed with '0x'. metric.py assumes strings are only used by event names, and that they can't start with a number ('0'), so an additional change has to be made to the regex to convert hex numbers back to 'ID' types. Signed-off-by: James Clark <[email protected]> Reviewed-by: John Garry <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Haixin Yu <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16  perf bpf_skel augmented_raw_syscalls: Cap the socklen parameter using &= sizeof(saddr)  Arnaldo Carvalho de Melo  1  -4/+2
This works with:

  $ clang -v
  clang version 14.0.5 (Fedora 14.0.5-2.fc36)
  $

But not with:

  $ clang -v
  clang version 16.0.6 (Fedora 16.0.6-2.fc38)
  $

  [root@quaco ~]# perf trace -e connect*,sendto* ping -c 10 localhost
  libbpf: prog 'sys_enter_sendto': BPF program load failed: Permission denied
  libbpf: prog 'sys_enter_sendto': -- BEGIN PROG LOAD LOG --
  reg type unsupported for arg#0 function sys_enter_sendto#59
  0: R1=ctx(off=0,imm=0) R10=fp0
  ; int sys_enter_sendto(struct syscall_enter_args *args)
  0: (bf) r6 = r1 ; R1=ctx(off=0,imm=0) R6_w=ctx(off=0,imm=0)
  1: (b7) r1 = 0 ; R1_w=0
  ; int key = 0;
  2: (63) *(u32 *)(r10 -4) = r1 ; R1_w=0 R10=fp0 fp-8=0000????
  3: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
  ;
  4: (07) r2 += -4 ; R2_w=fp-4
  ; return bpf_map_lookup_elem(&augmented_args_tmp, &key);
  5: (18) r1 = 0xffff8de5a5b8bc00 ; R1_w=map_ptr(off=0,ks=4,vs=8272,imm=0)
  7: (85) call bpf_map_lookup_elem#1 ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=8272,imm=0)
  8: (bf) r7 = r0 ; R0_w=map_value_or_null(id=1,off=0,ks=4,vs=8272,imm=0) R7_w=map_value_or_null(id=1,off=0,ks=4,vs=8272,imm=0)
  9: (b7) r0 = 1 ; R0_w=1
  ; if (augmented_args == NULL)
  10: (15) if r7 == 0x0 goto pc+25 ; R7_w=map_value(off=0,ks=4,vs=8272,imm=0)
  ; unsigned int socklen = args->args[5];
  11: (79) r1 = *(u64 *)(r6 +56) ; R1_w=scalar() R6_w=ctx(off=0,imm=0)
  ;
  12: (bf) r2 = r1 ; R1_w=scalar(id=2) R2_w=scalar(id=2)
  13: (67) r2 <<= 32 ; R2_w=scalar(smax=9223372032559808512,umax=18446744069414584320,var_off=(0x0; 0xffffffff00000000),s32_min=0,s32_max=0,u32_max=0)
  14: (77) r2 >>= 32 ; R2_w=scalar(umax=4294967295,var_off=(0x0; 0xffffffff))
  15: (b7) r8 = 128 ; R8=128
  ; if (socklen > sizeof(augmented_args->saddr))
  16: (25) if r2 > 0x80 goto pc+1 ; R2=scalar(umax=128,var_off=(0x0; 0xff))
  17: (bf) r8 = r1 ; R1=scalar(id=2) R8_w=scalar(id=2)
  ; const void *sockaddr_arg = (const void *)args->args[4];
  18: (79) r3 = *(u64 *)(r6 +48) ; R3_w=scalar() R6=ctx(off=0,imm=0)
  ; bpf_probe_read(&augmented_args->saddr, socklen, sockaddr_arg);
  19: (bf) r1 = r7 ; R1_w=map_value(off=0,ks=4,vs=8272,imm=0) R7=map_value(off=0,ks=4,vs=8272,imm=0)
  20: (07) r1 += 64 ; R1_w=map_value(off=64,ks=4,vs=8272,imm=0)
  ; bpf_probe_read(&augmented_args->saddr, socklen, sockaddr_arg);
  21: (bf) r2 = r8 ; R2_w=scalar(id=2) R8_w=scalar(id=2)
  22: (85) call bpf_probe_read#4
  R2 min value is negative, either use unsigned or 'var &= const'
  processed 22 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
  -- END PROG LOAD LOG --
  libbpf: prog 'sys_enter_sendto': failed to load: -13
  libbpf: failed to load object 'augmented_raw_syscalls_bpf'
  libbpf: failed to load BPF skeleton 'augmented_raw_syscalls_bpf': -13

So use the suggested &= variant since sizeof(saddr) == 128 bytes.

Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16  perf parse-regs: Move out arch specific header from util/perf_regs.h  Leo Yan  1  -2/+0
util/perf_regs.h includes another perf_regs.h: #include <perf_regs.h> Here it includes architecture specific header, for example, if we build arm64 target, the header tools/perf/arch/arm64/include/perf_regs.h is included. We use this implicit way to include architecture specific header, which is not directive; furthermore, util/perf_regs.c is coupled with the architecture specific definitions. This patch moves out arch specific header from util/perf_regs.h for generalizing the 'util' folder, as a result, the source files in 'arch' folder explicitly include architecture's perf_regs.h. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16  perf parse-regs: Remove PERF_REGS_{MAX|MASK} from common code  Leo Yan  3  -6/+4
The macros PERF_REGS_MAX and PERF_REGS_MASK are architecture specific, let's remove them from the common file util/perf_regs.c. As a side effect, the weak functions arch__intr_reg_mask() and arch__user_reg_mask() just return zeros, every arch defines its own functions in the 'arch' folder for returning right values. Note, we don't need to return intr/user register masks dynamically, this is because these two functions are invoked during recording phase but not decoding phase, they are always invoked on the native environment, thus we don't need to parse them dynamically. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
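The weak-default pattern described in the "Remove PERF_REGS_{MAX|MASK}" commit above can be sketched as below. The function name is the one mentioned in the commit message; the return type and body are assumptions for illustration, and `__attribute__((weak))` is a GCC/Clang extension rather than standard C.

```c
#include <stdint.h>

/*
 * Default (weak) definition: the linker uses it only when no other
 * object file provides a strong definition of the same symbol.
 * Returning 0 means "no registers available" in this sketch.
 */
__attribute__((weak)) uint64_t arch__user_reg_mask(void)
{
	return 0;
}
```

An architecture-specific source file would then define `arch__user_reg_mask()` without the attribute and return its real mask, and that strong definition would take precedence at link time.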
2023-08-16  perf unwind: Use perf_arch_reg_{ip|sp}() to substitute macros  Leo Yan  5  -16/+10
We use perf_arch_reg_ip() and perf_arch_reg_sp() to substitute macros for obtaining the register numbers of SP and IP. This modification enables cross analysis in the unwinding, therefore, the unwinding is not restricted to the predefined values by the macros. Consequently, the macros LIBUNWIND__ARCH_REG_{IP|SP} are removed since they are no longer used. Committer notes: Add missing "util/env.h" header to make sure we have the definition for perf_env__arch(), that when built with NO_LIBUNWIND=1 isn't available, i.e. it was being included by sheer luck. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16  perf parse-regs: Introduce functions perf_arch_reg_{ip|sp}()  Leo Yan  12  -5/+186
The current code uses macros PERF_REG_IP and PERF_REG_SP for parsing registers and we build perf with these macros statically, which means it only can correctly analyze CPU registers for the native architecture and fails to support cross analysis (e.g. we build perf on x86 and cannot analyze Arm64's registers). We need to generalize util/perf_regs.c for support multi architectures, as a first step, this commit introduces new functions perf_arch_reg_ip() and perf_arch_reg_sp(), these two functions dynamically return IP and SP register index respectively according to the parameter "arch". Every architecture has its own functions (like __perf_reg_ip_arm64 and __perf_reg_sp_arm64), these architecture specific functions are defined in each arch source file under folder util/perf-regs-arch; at the end all of them are built into the tool for cross analysis. Committer notes: Make DWARF_MINIMAL_REGS() an inline function, so that we can use the __maybe_unused attribute for the 'arch' parameter, as this will avoid a build failure when that variable is unused in the callers. That happens when building on unsupported architectures, the ones without HAVE_PERF_REGS_SUPPORT defined. 
Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
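The run-time dispatch introduced by the perf_arch_reg_{ip|sp}() commit above might look roughly like the sketch below. All names prefixed with `example_` are hypothetical, the register indices are illustrative assumptions, and the fallback value for an unknown architecture is a choice made only for this example.

```c
#include <stdint.h>
#include <string.h>

/* Per-architecture helpers; the index values are assumptions. */
static uint64_t example_reg_ip_x86(void)   { return 8; }
static uint64_t example_reg_ip_arm64(void) { return 32; }

/*
 * Hypothetical stand-in for a perf_arch_reg_ip()-style lookup: choose
 * the IP register index from the architecture name found in the data
 * being analyzed, instead of from a macro fixed at build time.
 */
static uint64_t example_arch_reg_ip(const char *arch)
{
	if (strcmp(arch, "x86") == 0)
		return example_reg_ip_x86();
	if (strcmp(arch, "arm64") == 0)
		return example_reg_ip_arm64();
	return 0; /* unknown architecture: sketch-only fallback */
}
```

Because every per-architecture helper is compiled into the same binary, a tool built on one architecture can still resolve the register index for data recorded on another.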
2023-08-16  perf parse-regs: Refactor arch register parsing functions  Leo Yan  14  -716/+803
Every architecture has a specific register parsing function for returning register name based on register index, to support cross analysis (e.g. we use perf x86 binary to parse Arm64's perf data), we build all these register parsing functions into the tool, this is why we place all related functions into util/perf_regs.c. Unfortunately, since util/perf_regs.c needs to include every arch's perf_regs.h, this easily introduces duplicated definitions coming from multiple headers, finally it's fragile for building and difficult for maintenance. We cannot simply move these register parsing functions into the corresponding 'arch' folder, the folder is only conditionally built based on the target architecture. Therefore, this commit creates a new folder util/perf-regs-arch/ and uses a dedicated source file to keep every architecture's register parsing function to avoid definition conflicts. This is only a refactoring, no functionality change is expected. Committer notes: Had to add util/perf-regs-arch/*.c to tools/perf/util/python-ext-sources to keep 'perf test python' passing. 
Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf cs-etm: Don't duplicate FIELD_GET()  James Clark  1  -12/+2
linux/bitfield.h can be included as long as linux/kernel.h is included first, so change the order of the includes and drop the duplicate macro. Reviewed-by: John Garry <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Forrington <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sohom Datta <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
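For readers unfamiliar with what a FIELD_GET()-style macro does, here is a simplified stand-in. It is not the kernel's definition from linux/bitfield.h, which also performs compile-time checks on the mask; the `EXAMPLE_` name is hypothetical and the macro assumes a non-zero, contiguous mask.

```c
#include <stdint.h>

/*
 * Extract the bits selected by mask and shift them down to bit 0.
 * (mask & -mask) isolates the lowest set bit of the mask, so dividing
 * by it is equivalent to shifting right by the mask's bit offset.
 */
#define EXAMPLE_FIELD_GET(mask, reg) \
	(((uint64_t)(reg) & (uint64_t)(mask)) / \
	 ((uint64_t)(mask) & -(uint64_t)(mask)))
```

For instance, with a mask of 0xF0 and a register value of 0xAB, the masked value is 0xA0 and the lowest set mask bit is 0x10, so the extracted field is 0xA.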
2023-08-15  perf dlfilter: Add al_cleanup()  Adrian Hunter  1  -0/+29
Add perf_dlfilter_fns.al_cleanup() to do addr_location__exit() on data passed via perf_dlfilter_fns.resolve_address(). Add dlfilter-test-api-v2 to the "dlfilter C API" test to test it. Update documentation, clarifying that data returned by APIs should not be dereferenced after filter_event() and filter_event_early() return. Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions") Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf dlfilter: Initialize addr_location before passing it to thread__find_symbol_fb()  Arnaldo Carvalho de Melo  1  -0/+1
As thread__find_symbol_fb() will end up calling thread__find_map() and it in turn will call these on uninitialized memory:

  maps__zput(al->maps);
  map__zput(al->map);
  thread__zput(al->thread);

Fixes: 0dd5041c9a0eaf8c ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf evsel: Remove duplicate check for `field` in evsel__intval()  Yang Jihong  1  -3/+0
The `field` parameter in evsel__intval() is checked twice; remove the duplicate check. No functional change. Signed-off-by: Yang Jihong <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf bpf-filter: Fix sample flag check with ||  Namhyung Kim  1  -0/+10
For the logical OR operator, the actual sample_flags are in the 'groups' list, so it needs to check the entries in that list instead. Otherwise it would show the following error message.

  $ sudo perf record -a -e cycles:p --filter 'period > 100 || weight > 0' sleep 1
  Error: cycles:p event does not have sample flags 0
  failed to set filter "BPF" on event cycles:p with 2 (No such file or directory)

Actually it should warn that 'weight' is used without the WEIGHT flag.

  Error: cycles:p event does not have PERF_SAMPLE_WEIGHT
   Hint: please add -W option to perf record
  failed to set filter "BPF" on event cycles:p with 2 (No such file or directory)

Fixes: 4310551b76e0d676 ("perf bpf filter: Show warning for missing sample flags")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf trace: Tidy comments related to BPF + syscall augmentation  Ian Rogers  1  -8/+0
Now tools/perf/examples/bpf/augmented_syscalls.c is tools/perf/util/bpf_skel/augmented_syscalls.bpf.c and not enabled as a BPF event, tidy the comments to reflect this. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Carsten Haitzler <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Fangrui Song <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yonghong Song <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15  perf trace: Migrate BPF augmentation to use a skeleton  Ian Rogers  1  -0/+418
Previously a BPF event of augmented_raw_syscalls.c could be used to enable augmentation of syscalls by perf trace. As BPF events are no longer supported, switch to using a BPF skeleton which when attached explicitly opens the sysenter and sysexit tracepoints. The dump map is removed as debugging wasn't supported by the augmentation and bpf_printk can be used when necessary. Remove tools/perf/examples/bpf/augmented_raw_syscalls.c so that the rename/migration to a BPF skeleton captures that this was the source. Committer notes: Some minor stylistic changes to help visualizing the diff. Use libbpf_strerror when failing to load the augmented raw syscalls BPF. Use bpf_object__for_each_program(prog, trace.skel->obj) to disable auto attachment for all but the sys_enter, sys_exit tracepoints, to avoid having to add extra lines as we go adding support for more pointer receiving syscalls. Committer testing: # perf trace -e open* --max-events=10 0.000 ( 0.022 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 11 208.833 ( ): gnome-terminal/3223 openat(dfd: CWD, filename: "/proc/51250/cmdline") ... 
249.993 ( 0.024 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 11 250.118 ( 0.030 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.pressure", flags: RDONLY|CLOEXEC) = 11 250.205 ( 0.016 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.current", flags: RDONLY|CLOEXEC) = 11 250.244 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.min", flags: RDONLY|CLOEXEC) = 11 250.282 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.low", flags: RDONLY|CLOEXEC) = 11 250.320 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.swap.current", flags: RDONLY|CLOEXEC) = 11 250.355 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.stat", flags: RDONLY|CLOEXEC) = 11 250.717 ( 0.016 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/[email protected]/memory.pressure", flags: RDONLY|CLOEXEC) = 11 # # perf trace -e *nanosleep* --max-events=10 ? ( ): SCTP timer/28304 ... [continued]: clock_nanosleep()) = 0 0.007 (10.058 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0 10.069 ( ): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) ... 10.069 (10.056 ms): SCTP timer/28304 ... [continued]: clock_nanosleep()) = 0 17.059 ( ): podman/3572 nanosleep(rqtp: 0x7fc4f4d75be0) ... 17.059 (10.061 ms): podman/3572 ... 
[continued]: nanosleep()) = 0 20.131 (10.059 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0 30.195 (10.038 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0 40.238 (10.057 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0 50.301 ( ): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) ... # # perf trace -e perf_event* -- perf stat -e instructions,cycles,cache-misses sleep 0.1 0.000 ( 0.011 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3 0.013 ( 0.003 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4 0.017 ( 0.002 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x3 (PERF_COUNT_HW_CACHE_MISSES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5 Performance counter stats for 'sleep 0.1': 1,495,051 instructions # 1.11 insn per cycle 1,347,641 cycles 35,424 cache-misses 0.100935279 seconds time elapsed 0.000924000 seconds user 0.000000000 seconds sys # # perf trace -e connect* ssh localhost 0.000 ( 0.012 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 
110) = -1 ECONNREFUSED (Connection refused) 0.118 ( 0.004 ms): ssh/51346 connect(fd: 6, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.399 ( 0.007 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.426 ( 0.003 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.754 ( 0.009 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET, port: 22, addr: 127.0.0.1 }, addrlen: 16) = 0 0.771 ( 0.010 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0 0.798 ( 0.053 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0 0.870 ( 0.004 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.904 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.930 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.957 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 0.981 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 1.006 ( 0.004 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 1.036 ( 0.005 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused) 65.077 ( 0.022 ms): ssh/51346 
connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0 66.608 ( 0.014 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0 root@localhost's password: # # perf trace -e sendto* ping -c 2 localhost PING localhost(localhost (::1)) 56 data bytes 64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.024 ms 0.000 ( 0.011 ms): ping/51357 sendto(fd: 5, buff: 0x7ffcca35e620, len: 20, addr: { .family: NETLINK }, addr_len: 0xc) = 20 0.135 ( 0.026 ms): ping/51357 sendto(fd: 4, buff: 0x5601398f7b20, len: 64, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 0x1c) = 64 1014.929 ( 0.050 ms): ping/51357 sendto(fd: 4, buff: 0x5601398f7b20, len: 64, flags: CONFIRM, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 0x1c) = 64 64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.046 ms --- localhost ping statistics --- 2 packets transmitted, 2 received, 0% packet loss, time 1015ms rtt min/avg/max/mdev = 0.024/0.035/0.046/0.011 ms # Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Carsten Haitzler <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Fangrui Song <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Naveen N. 
Rao <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yonghong Song <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15perf parse-events: Remove BPF event supportIan Rogers10-3262/+2
New features like the BPF --filter support in perf record have made the BPF event functionality somewhat redundant. As shown by commit fcb027c1a4f6 ("perf tools: Revert enable indices setting syntax for BPF map") and commit 14e4b9f4289a ("perf trace: Raw augmented syscalls fix libbpf 1.0+ compatibility") the BPF event support hasn't been well maintained and it adds considerable complexity in areas like event parsing, not least as '/' is a separator for event modifiers as well as in paths. This patch removes support in the event parser for BPF events and then the associated functions are removed. This leads to the removal of whole source files like bpf-loader.c. Removing support means that augmented syscalls in perf trace is broken, this will be fixed in a later commit adding support using BPF skeletons. The removal of BPF events causes an unused label warning from flex generated code, so update build to ignore it: ``` util/parse-events-flex.c:2704:1: error: label ‘find_rule’ defined but not used [-Werror=unused-label] 2704 | find_rule: /* we branch to this label when backing up */ ``` Committer notes: Extracted from a larger patch that was also removing the support for linking with libllvm and libclang, that were an alternative to using an external clang execution to compile the .c event source code into BPF bytecode. Testing it: # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Bad event or PMU Unabled to find PMU or event on a PMU of 'home' Initial error: event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Cannot find PMU `home'. Missing kernel support? 
Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events # Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Carsten Haitzler <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Fangrui Song <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yonghong Song <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-15perf bpf: Remove support for embedding clang for compiling BPF events (-e foo.c)Ian Rogers7-380/+4
This never was in the default build for perf, is difficult to maintain as it uses clang/llvm internals so ditch it, keeping, for now, the external compilation of .c BPF into .o bytecode and its subsequent loading, that is also going to be removed, do it separately to help bisection and to properly document what is being removed and why. Committer notes: Extracted from a larger patch and removed some leftovers, namely deleting these now unused feature tests: tools/build/feature/test-clang.cpp tools/build/feature/test-cxx.cpp tools/build/feature/test-llvm-version.cpp tools/build/feature/test-llvm.cpp Testing the use of BPF events after applying this patch: To use the external clang/llvm toolchain to compile a .c event and then use libbpf to load it, to get the syscalls:sys_enter_open* tracepoints and read the filename pointer, putting it into the ring buffer right after the usual tracepoint payload for 'perf trace' to then print it: [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.c,open* --max-events=10 0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.063 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.082 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 250.124 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 250.521 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/app.slice/memory.pressure", flags: RDONLY|CLOEXEC) = 12 251.047 systemd-oomd/959 openat(dfd: CWD, filename: 
"/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/app.slice/memory.current", flags: RDONLY|CLOEXEC) = 12 251.162 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/app.slice/memory.min", flags: RDONLY|CLOEXEC) = 12 251.242 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/app.slice/memory.low", flags: RDONLY|CLOEXEC) = 12 251.353 systemd-oomd/959 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/app.slice/memory.swap.current", flags: RDONLY|CLOEXEC) = 12 [root@quaco ~]# Same thing, but with a prebuilt .o BPF bytecode: [root@quaco ~]# perf trace -e /home/acme/git/perf-tools-next/tools/perf/examples/bpf/augmented_raw_syscalls.o,open* --max-events=10 0.000 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 0.083 abrt-dump-jour/1453 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.083 abrt-dump-jour/1455 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 0.062 abrt-dump-jour/1454 openat(dfd: CWD, filename: "/var/log/journal/d6a97235307247e09f13f326fb607e3c/system.journal", flags: RDONLY|CLOEXEC|NONBLOCK) = 4 249.985 systemd-oomd/959 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC) = 12 466.763 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:2/energy_uj") = 13 467.145 thermald/1234 openat(dfd: CWD, filename: "/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj") = 13 467.311 thermald/1234 openat(dfd: CWD, filename: "/sys/class/thermal/thermal_zone2/temp") = 13 500.040 cgroupify/24006 openat(dfd: 4, filename: ".", flags: RDONLY|CLOEXEC|DIRECTORY|NONBLOCK) = 5 500.295 cgroupify/24006 openat(dfd: 4, filename: "24616/cgroup.procs") = 5 [root@quaco ~]# 
Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Carsten Haitzler <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Fangrui Song <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: "Naveen N. Rao" <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Tom Rix <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yonghong Song <[email protected]> Cc: YueHaibing <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-14x86/retpoline,kprobes: Skip optprobe check for indirect jumps with ↵Petr Pavlu1-3/+1
retpolines and IBT The kprobes optimization check can_optimize() calls insn_is_indirect_jump() to detect indirect jump instructions in a target function. If one is found, creating an optprobe is disallowed in the function because the jump could be from a jump table and could potentially land in the middle of the target optprobe. With retpolines, insn_is_indirect_jump() additionally looks for calls to indirect thunks which the compiler potentially used to replace original jumps. This extra check is, however, unnecessary because jump tables are disabled when the kernel is built with retpolines. The same is currently the case with IBT. Based on this observation, remove the logic to look for calls to indirect thunks and skip the check for indirect jumps altogether if the kernel is built with retpolines or IBT. Subsequently remove the symbols __indirect_thunk_start and __indirect_thunk_end, which are no longer needed. Dropping this logic indirectly fixes a problem where the range [__indirect_thunk_start, __indirect_thunk_end] wrongly also included the return thunk. As a result, machines that used the return thunk as a mitigation and didn't have it patched by any alternative were unable to use optprobes in any regular function. Fixes: 0b53c374b9ef ("x86/retpoline: Use -mfunction-return") Suggested-by: Peter Zijlstra (Intel) <[email protected]> Suggested-by: Masami Hiramatsu (Google) <[email protected]> Signed-off-by: Petr Pavlu <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Acked-by: Masami Hiramatsu (Google) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
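The decision described in the commit message above can be sketched roughly as follows. This is only an illustration, not the kernel code: the function name and the three boolean parameters are hypothetical stand-ins for the build-time configuration and for the result of insn_is_indirect_jump().

```c
#include <stdbool.h>

/*
 * Hypothetical sketch: does this instruction prevent creating an optprobe?
 * built_with_retpoline / built_with_ibt stand in for the kernel build
 * configuration; is_indirect_jump stands in for insn_is_indirect_jump().
 */
static bool insn_blocks_optprobe_sketch(bool built_with_retpoline,
					bool built_with_ibt,
					bool is_indirect_jump)
{
	/* Jump tables are disabled in these builds, so skip the check. */
	if (built_with_retpoline || built_with_ibt)
		return false;

	/* Otherwise an indirect jump may come from a jump table. */
	return is_indirect_jump;
}
```

In this sketch an indirect jump only blocks optimization when neither retpolines nor IBT are in use, which is the case where jump tables may exist.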
2023-08-11perf script python: Add stub for PMU symbol to the python bindingIan Rogers1-0/+5
Fix missing symbol seen in: ``` 19: 'import perf' in python : --- start --- test child forked, pid 2640936 python usage test: "echo "import sys ; sys.path.insert(0, 'python'); import perf" | '/usr/bin/python3' " Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: tools/perf/python/perf.cpython-311-x86_64-linux-gnu.so: undefined symbol: perf_pmus__supports_extended_type test child finished with -1 ---- end ---- 'import perf' in python: FAILED! ``` Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-11perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to ↵Athira Rajeev2-5/+12
its long_name, type and adjust_symbols Test "object code reading" fails sometimes for kernel address as below: Reading object code for memory address: 0xc000000000004c3c File is: [kernel.kallsyms] On file address is: 0x14c3c dso__data_read_offset failed test child finished with -1 ---- end ---- Object code reading: FAILED! Here dso__data_read_offset() fails for symbol address 0xc000000000004c3c. This is because the DSO long_name here is "[kernel.kallsyms]" and hence open_dso() fails to open this file. There is an incorrect DSO to map handling here. The key points here are: - The DSO long_name is set to "[kernel.kallsyms]". This file is not present and hence returns error - The DSO binary type is set to DSO_BINARY_TYPE__NOT_FOUND - The DSO adjust_symbols member is set to zero In the end dso__data_read_offset() returns -1 and the address 0x14c3c can not be resolved. Hence the test fails. But the address actually maps to the kernel DSO # objdump -z -d --start-address=0xc000000000004c3c --stop-address=0xc000000000004cbc /home/athira/linux/vmlinux /home/athira/linux/vmlinux: file format elf64-powerpcle Disassembly of section .head.text: c000000000004c3c <exc_virt_0x4c00_system_call+0x3c>: c000000000004c3c: a6 02 9b 7d mfsrr1 r12 c000000000004c40: 78 13 42 7c mr r2,r2 c000000000004c44: 18 00 4d e9 ld r10,24(r13) c000000000004c48: 60 c6 4a 61 ori r10,r10,50784 c000000000004c4c: a6 03 49 7d mtctr r10 Fix dso__process_kernel_symbol() to set the binary_type and adjust_symbols members. dso->adjust_symbols is used by map__rip_2objdump() which converts the symbol start address to the objdump address. Also set dso->long_name in dso__load_vmlinux(). 
Suggested-by: Adrian Hunter <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Disha Goel <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-11perf build: Remove -Wno-unused-but-set-variable from the flex flags when ↵Arnaldo Carvalho de Melo1-5/+16
building with clang < 13.0.0 clang < 13.0.0 doesn't grok -Wno-unused-but-set-variable, so just remove it to avoid: error: unknown warning option '-Wno-unused-but-set-variable'; did you mean '-Wno-unused-const-variable'? [-Werror,-Wunknown-warning-option] make[4]: *** [/git/perf-6.5.0-rc4/tools/build/Makefile.build:128: /tmp/build/perf/util/pmu-flex.o] Error 1 make[4]: *** Waiting for unfinished jobs.... Fixes: ddc8e4c966923ad1 ("perf build: Disable fewer bison warnings") Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/ZNUSWr52jUnVaaa%[email protected]/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-10Merge remote-tracking branch 'torvalds/master' into perf-tools-nextArnaldo Carvalho de Melo2-5/+5
To pick up some more fixes that went upstream via the perf-tools fixes branch. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-08perf stat: Don't display zero tool countsIan Rogers1-0/+5
Andi reported (see link below) a regression when printing the 'duration_time' tool event, where it gets printed as "not counted" for most of the CPUs, fix it by skipping zero counts for tool events. Reported-by: Andi Kleen <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Andi Kleen <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Claire Jensen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/all/ZMlrzcVrVi1lTDmn@tassilo/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
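A minimal sketch of the idea of skipping zero counts for tool events, assuming a simplified per-CPU reading. The struct and both functions are hypothetical and are not taken from perf's stat display code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified per-CPU reading of one event. */
struct counter_sketch {
	bool is_tool_event;	/* e.g. duration_time */
	uint64_t value;
};

/* A tool event that reads zero on this CPU is not worth a line. */
static bool should_skip_sketch(const struct counter_sketch *c)
{
	return c->is_tool_event && c->value == 0;
}

/* Count how many readings would still be printed. */
static size_t count_printed_sketch(const struct counter_sketch *c, size_t n)
{
	size_t printed = 0;

	for (size_t i = 0; i < n; i++)
		if (!should_skip_sketch(&c[i]))
			printed++;
	return printed;
}
```

Non-tool events are deliberately left alone here, so a hardware counter that reads zero would still be reported.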
2023-08-08Revert "perf report: Append inlines to non-DWARF callchains"Arnaldo Carvalho de Melo1-5/+0
This reverts commit 46d21ec067490ab9cdcc89b9de5aae28786a8b8e. The tests were made with a specific workload; further tests on a recently updated Fedora 38 system with a system-wide perf.data file show 'perf report' taking excessive time resolving inlines in vmlinux, so let's revert this until a full investigation and improvement of the addr2line support code is made. Reported-by: Jesper Dangaard Brouer <[email protected]> Acked-by: Artem Savkov <[email protected]> Tested-by: Jesper Dangaard Brouer <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-07perf probe: Make synthesize_perf_probe_point() private to probe-event.cArnaldo Carvalho de Melo2-2/+3
Not used in any other place, so just make it static. Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/ZM0pjfOe6R4X%[email protected]/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-07perf probe: Free string returned by synthesize_perf_probe_point() on failure ↵Arnaldo Carvalho de Melo1-2/+6
in synthesize_perf_probe_command() Building perf with EXTRA_CFLAGS="-fsanitize=address", a leak was detected elsewhere and led to an audit, where we found that synthesize_perf_probe_command() may leak the string returned by synthesize_perf_probe_point() on failure. Fix it. Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-07perf probe: Free string returned by synthesize_perf_probe_point() on failure ↵Arnaldo Carvalho de Melo1-2/+3
to add a probe Building perf with EXTRA_CFLAGS="-fsanitize=address", a leak is detected when trying to add a probe to a non-existent function: # perf probe -x ~/bin/perf dso__neW Probe point 'dso__neW' not found. Error: Failed to add events. ================================================================= ==296634==ERROR: LeakSanitizer: detected memory leaks Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 0x7f67642ba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x7f67641a76f1 in allocate_cfi (/lib64/libdw.so.1+0x3f6f1) Direct leak of 65 byte(s) in 1 object(s) allocated from: #0 0x7f67642b95b5 in __interceptor_realloc.part.0 (/lib64/libasan.so.8+0xb95b5) #1 0x6cac75 in strbuf_grow util/strbuf.c:64 #2 0x6ca934 in strbuf_init util/strbuf.c:25 #3 0x9337d2 in synthesize_perf_probe_point util/probe-event.c:2018 #4 0x92be51 in try_to_find_probe_trace_events util/probe-event.c:964 #5 0x93d5c6 in convert_to_probe_trace_events util/probe-event.c:3512 #6 0x93d6d5 in convert_perf_probe_events util/probe-event.c:3529 #7 0x56f37f in perf_add_probe_events /var/home/acme/git/perf-tools-next/tools/perf/builtin-probe.c:354 #8 0x572fbc in __cmd_probe /var/home/acme/git/perf-tools-next/tools/perf/builtin-probe.c:738 #9 0x5730f2 in cmd_probe /var/home/acme/git/perf-tools-next/tools/perf/builtin-probe.c:766 #10 0x635d81 in run_builtin /var/home/acme/git/perf-tools-next/tools/perf/perf.c:323 #11 0x6362c1 in handle_internal_command /var/home/acme/git/perf-tools-next/tools/perf/perf.c:377 #12 0x63667a in run_argv /var/home/acme/git/perf-tools-next/tools/perf/perf.c:421 #13 0x636b8d in main /var/home/acme/git/perf-tools-next/tools/perf/perf.c:537 #14 0x7f676302950f in __libc_start_call_main (/lib64/libc.so.6+0x2950f) SUMMARY: AddressSanitizer: 193 byte(s) leaked in 2 allocation(s). # synthesize_perf_probe_point() returns a "detached" strbuf, i.e. a malloc'ed string that needs to be free'd. An audit will be performed to find other such cases. 
Acked-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
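The ownership rule behind this fix (a function hands back a malloc'ed string, and the caller must free it on every path, including failure) can be sketched as below. All three functions are hypothetical stand-ins written for this illustration; they are not the code in util/probe-event.c.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for synthesize_perf_probe_point(): returns a
 * heap-allocated string owned by the caller, or NULL on failure.
 */
static char *synthesize_point_sketch(const char *function)
{
	size_t len = strlen(function);
	char *buf = malloc(len + 1);

	if (buf)
		memcpy(buf, function, len + 1);
	return buf;
}

/* Hypothetical lookup that only knows one function name. */
static int find_probe_sketch(const char *point)
{
	return strcmp(point, "dso__new") == 0 ? 0 : -1;
}

/* Returns 0 on success, -1 on failure; never leaks 'point'. */
static int add_probe_sketch(const char *function)
{
	char *point = synthesize_point_sketch(function);
	int ret = -1;

	if (!point)
		return ret;

	if (find_probe_sketch(point) < 0)
		goto out;	/* failure path: still falls through to free() */

	ret = 0;
out:
	free(point);
	return ret;
}
```

Routing both outcomes through a single `out:` label is one common way to keep the free() from being forgotten when a new early exit is added later.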
2023-08-04Merge remote-tracking branch 'torvalds/master' into perf-tools-nextArnaldo Carvalho de Melo4-31/+55
To pick up the fixes that were just merged from perf-tools/perf-tools for v6.5. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf annotate bpf: Don't enclose non-debug code with an assert()Arnaldo Carvalho de Melo1-3/+7
In 616b14b47a86d880 ("perf build: Conditionally define NDEBUG") we started using NDEBUG=1 when DEBUG=1 isn't present, so code that is enclosed with assert() is not called. In dd317df072071903 ("perf build: Make binutil libraries opt in") we stopped linking against binutils-devel, for licensing reasons. Recently people asked me why annotation of BPF programs wasn't working, i.e. this: $ perf annotate bpf_prog_5280546344e3f45c_kfree_skb was returning: case SYMBOL_ANNOTATE_ERRNO__NO_LIBOPCODES_FOR_BPF: scnprintf(buf, buflen, "Please link with binutils's libopcode to enable BPF annotation"); This was on a Fedora rpm, so it's new enough that I had to test by rebuilding using BUILD_NONDISTRO=1, only to get it segfaulting on me. The combination of these two changes meant that this binutils function was never called: assert(bfd_check_format(bfdf, bfd_object)); Changing it to: if (!bfd_check_format(bfdf, bfd_object)) abort(); made it work. Looking at this "check" function made me realize it changes the 'bfdf' internal state, i.e. we had better call it. So stop using assert() on it, just call it and abort if it fails. It would probably be better to propagate the error, but it seems unlikely to fail given the usage so far and we really need to stop using libopcodes, so do the quick fix above and move on. 
With it we have BPF annotation back working when built with BUILD_NONDISTRO=1: ⬢[acme@toolbox perf-tools-next]$ perf annotate --stdio2 bpf_prog_5280546344e3f45c_kfree_skb | head No kallsyms or vmlinux with build-id 939bc71a1a51cdc434e60af93c7e734f7d5c0e7e was found Samples: 12 of event 'cpu-clock:ppp', 4000 Hz, Event count (approx.): 3000000, [percent: local period] bpf_prog_5280546344e3f45c_kfree_skb() bpf_prog_5280546344e3f45c_kfree_skb Percent int kfree_skb(struct trace_event_raw_kfree_skb *args) { nop 33.33 xchg %ax,%ax push %rbp mov %rsp,%rbp sub $0x180,%rsp push %rbx push %r13 ⬢[acme@toolbox perf-tools-next]$ Fixes: 6987561c9e86eace ("perf annotate: Enable annotation of BPF programs") Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mohamed Mahmoud <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Dave Tucker <[email protected]> Cc: Derek Barbosa <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
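The pitfall described above can be reproduced in a few lines of standard C. In this sketch `struct fake_bfd` and `fake_check_format()` are hypothetical stand-ins for the binutils object and bfd_check_format(); only assert(), abort() and NDEBUG are real.

```c
/* Compile this part as a release build would: NDEBUG disables assert(). */
#define NDEBUG
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object whose state the check updates. */
struct fake_bfd {
	bool format_checked;
};

/* Hypothetical stand-in for bfd_check_format(): validates AND mutates. */
static bool fake_check_format(struct fake_bfd *bfd)
{
	bfd->format_checked = true;	/* side effect later code relies on */
	return true;
}

/* Broken: with NDEBUG the argument is discarded, so the call never runs. */
static bool open_with_assert(struct fake_bfd *bfd)
{
	assert(fake_check_format(bfd));
	return bfd->format_checked;
}

/* Fixed: always make the call, then act on its result. */
static bool open_with_explicit_check(struct fake_bfd *bfd)
{
	if (!fake_check_format(bfd))
		abort();
	return bfd->format_checked;
}

/* Re-enable assert() for anything that follows in this file. */
#undef NDEBUG
#include <assert.h>
```

With NDEBUG defined, `open_with_assert()` leaves `format_checked` untouched while `open_with_explicit_check()` sets it, which mirrors why the explicit `if (...) abort();` form was needed.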
2023-08-03perf script python: Cope with declarations after statements found in Python.hArnaldo Carvalho de Melo1-1/+2
With -Werror the build was failing on fedora rawhide: [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-13.2.1-20230728/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-offload-defaulted --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20230728 (Red Hat 13.2.1-1) (GCC) [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ In file included from /usr/include/python3.12/Python.h:44, from scripts/python/Perf-Trace-Util/Context.c:14: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids 
mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ <SNIP> In file included from /usr/include/python3.12/Python.h:44, from util/scripting-engines/trace-event-python.c:22: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ CC /tmp/build/perf/util/units.o CC /tmp/build/perf/util/time-utils.o In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ So add -Wno-declaration-after-statement to the python scripting CFLAGS. Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/ZMpdKeO8gU%[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf python: Cope with declarations after statements found in Python.hArnaldo Carvalho de Melo1-0/+3
With -Werror the build was failing on fedora rawhide: [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-13.2.1-20230728/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-offload-defaulted --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20230728 (Red Hat 13.2.1-1) (GCC) [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ In file included from /usr/include/python3.12/Python.h:44, from /git/perf-6.5.0-rc2/tools/perf/util/python.c:2: /usr/include/python3.12/object.h: In function ‘Py_SIZE’: /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ LD /tmp/build/perf/arch/perf-in.o In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function ‘_PyLong_CompactValue’: 
/usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ So add -Wno-declaration-after-statement to the python binding CFLAGS. Reviewed-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
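For readers unfamiliar with the diagnostic, the sketch below shows the construct that -Wdeclaration-after-statement reports. Both functions are hypothetical examples and are unrelated to the Python headers; they compute the same result and differ only in where the declarations sit.

```c
#include <stddef.h>

/* C90 style: all declarations come before the first statement. */
static size_t count_nonzero_c90(const int *values, size_t n)
{
	size_t i;
	size_t count = 0;

	for (i = 0; i < n; i++)
		if (values[i])
			count++;
	return count;
}

/*
 * C99 style: 'count' is declared after the 'if' statement. This is valid
 * C99, but it is what -Wdeclaration-after-statement warns about, and
 * with -Werror that warning becomes a build failure.
 */
static size_t count_nonzero_c99(const int *values, size_t n)
{
	if (!values)
		return 0;

	size_t count = 0;	/* declaration after a statement */

	for (size_t i = 0; i < n; i++)
		if (values[i])
			count++;
	return count;
}
```

Since the mixed declarations live in headers perf does not control, relaxing the warning for the affected objects, as the commits above do, avoids having to patch them.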
2023-08-03perf probe: Show correct error message about @symbol usage for uprobeMasami Hiramatsu1-4/+9
Since @symbol variable access is not supported by uprobe events, perf must warn the user about that instead of suggesting a kernel version update. Committer testing: With/without the patch: [root@quaco ~]# perf probe -x ~/bin/perf -L sigtrap_handler <sigtrap_handler@/home/acme/git/perf-tools-next/tools/perf/tests/sigtrap.c:0> 0 sigtrap_handler(int signum __maybe_unused, siginfo_t *info, void *ucontext __maybe_unused) 1 { 2 if (!__atomic_fetch_add(&ctx.signal_count, 1, __ATOMIC_RELAXED)) 3 ctx.first_siginfo = *info; 4 __atomic_fetch_sub(&ctx.tids_want_signal, syscall(SYS_gettid), __ATOMIC_RELAXED); 5 } static void *test_thread(void *arg) { [root@quaco ~]# perf probe -x ~/bin/perf sigtrap_handler:4 "ctx.signal_count" Without the patch: [root@quaco ~]# perf probe -x ~/bin/perf sigtrap_handler:4 "ctx.signal_count" Failed to write event: Invalid argument Please upgrade your kernel to at least 3.14 to have access to feature @ctx Error: Failed to add events. [root@quaco ~]# With the patch: [root@quaco ~]# Failed to write event: Invalid argument @ctx accesses a variable by symbol name, but that is not supported for user application probe. Error: Failed to add events. [root@quaco ~]# Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Closes: https://lore.kernel.org/all/[email protected]/ Tested-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/169055397023.67089.12693645664676964310.stgit@devnote2 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf parse-events: Remove array remnantsIan Rogers3-119/+0
parse_events_array was set up by event term parsing, which no longer exists. Remove this struct and references to it. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf tools: Revert enable indices setting syntax for BPF mapIan Rogers3-140/+1
This reverts commit e571e029bdbf ("perf tools: Enable indices setting syntax for BPF map"). The reverted commit added a notion of arrays that could be set as event terms for BPF events. The parsing hasn't worked over multiple Linux releases. Given the broken nature of the parsing it appears the code isn't in use, nor could I find a way for it to be used to add a test. The original commit contains a test in the commit message, however, running it yields: ``` $ perf record -e './test_bpf_map_3.c/map:channel.value[0,1,2,3...5]=101/' usleep 2 event syntax error: '..pf_map_3.c/map:channel.value[0,1,2,3...5]=101/' \___ parser error Run 'perf list' for a list of valid events Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>] -e, --event <event> event selector. use 'perf list' to list available events ``` Given the code can't be used this commit reverts and removes it. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf parse-event: Avoid BPF test SEGVIan Rogers1-1/+1
loc is passed as NULL in tools/perf/tests/bpf.c do_test, meaning errors trigger a SEGV when trying to access. Add the missing NULL check. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: He Kuang <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wang ShaoBo <[email protected]> Cc: YueHaibing <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
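The kind of guard described above can be sketched roughly as follows. This is only an illustration: `struct location` and `format_error()` are hypothetical names invented for this sketch, not perf's actual types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a parser location; not perf's actual type. */
struct location {
	int first_column;
};

/*
 * Format an error message, using the column only when a location was
 * supplied. Callers such as tests may pass loc == NULL.
 * Returns the number of characters snprintf() would have written.
 */
int format_error(char *buf, size_t len, const struct location *loc,
		 const char *msg)
{
	assert(buf && msg);

	/* The NULL check: never dereference loc unless it is set. */
	if (loc)
		return snprintf(buf, len, "column %d: %s",
				loc->first_column, msg);
	return snprintf(buf, len, "%s", msg);
}
```

With a NULL location the message is emitted without position information instead of crashing.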
2023-08-03perf build: Include generated header files properlyNamhyung Kim4-8/+8
Flex and bison generate header files from the source. When the user specifies a build directory with the O= option, the files are generated under that directory. The build command has a -I option to specify the header include directory. But the -I option only affects the files included like <...>. Let's change the flex and bison headers to use it instead of "...". Fixes: 80eeb67fe577aa76 ("perf jevents: Program to convert JSON file") Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anup Sharma <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf build: Remove -Wno-redundant-decls in 2 casesIan Rogers2-3/+0
Properly fix a warning and remove the -Wno-redundant-decls C flag. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf build: Disable fewer bison warningsIan Rogers5-5/+17
If bison is version 3.8.2, reduce the number of bison C warnings disabled. Earlier bison versions have all C warnings disabled. Avoid implicit declarations of yylex by adding the declaration in the C file. A header can't be included as a circular dependency would occur due to the lexer using the bison defined tokens. Committer notes: Some recent versions of gcc and clang (noticed on Alpine Linux 3.17, edge, clearlinux, fedora 37, etc. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-03perf build: Disable fewer flex warningsIan Rogers1-6/+18
If flex is version 2.6.4, reduce the number of flex C warnings disabled. Earlier flex versions have all C warnings disabled. Committer notes: Added this to the list of ignored warnings to get it building on a Fedora 36 machine with flex 2.6.4: -Wno-misleading-indentation Noticed when building with: $ make LLVM=1 -C tools/perf NO_BPF_SKEL=1 DEBUG=1 Take two: We can't just try to canonicalize flex versions by just removing the dots, as we end up with: 2.6.4 >= 2.5.37 becoming: 264 >= 2537 Failing the build on flex 2.5.37, so instead use the back to the past added $(call version_ge3,$(FLEX_VERSION),2.6.4) variant to check for that. Making sure $(FLEX_VERSION) keeps the dots as we may want to use 'sort -V' or something nicer when available everywhere. Some other tweaks for other flex versions and combinations with gcc and clang versions were added, notes on the patch. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
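The version comparison pitfall described above (dropping the dots so that 2.6.4 becomes 264 and 2.5.37 becomes 2537) can be illustrated with a small C sketch. The commit itself uses a Makefile macro; `version_ge()` and `strip_dots()` below are hypothetical helpers written only for this illustration, and they assume purely numeric dotted versions.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not the Makefile macro from the commit: compare two
 * dotted version strings component by component.
 * Returns 1 if a >= b, 0 otherwise. Missing components count as 0.
 */
int version_ge(const char *a, const char *b)
{
	assert(a && b);

	while (*a || *b) {
		char *end_a, *end_b;
		long na = strtol(a, &end_a, 10);
		long nb = strtol(b, &end_b, 10);

		if (na != nb)
			return na > nb;

		/* Neither string advanced: stop rather than loop forever. */
		if (end_a == a && end_b == b)
			break;

		/* Skip the separating dot, if any. */
		a = (*end_a == '.') ? end_a + 1 : end_a;
		b = (*end_b == '.') ? end_b + 1 : end_b;
	}
	return 1;
}

/* The broken approach: drop the dots and read the digits as one integer. */
long strip_dots(const char *v)
{
	long n = 0;

	for (; *v; v++)
		if (*v >= '0' && *v <= '9')
			n = n * 10 + (*v - '0');
	return n;
}
```

Here `strip_dots("2.5.37")` is larger than `strip_dots("2.6.4")`, so an integer comparison would wrongly treat flex 2.5.37 as at least 2.6.4, while the component-wise `version_ge("2.5.37", "2.6.4")` is false.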
2023-08-01perf test parse-events: Test complex name has required event formatIan Rogers2-0/+12
test__checkevent_complex_name will use an "event" format which if not present, such as with a placeholder PMU, will cause test failures. Skip the test in this case to avoid failures in restricted environments. Add perf_pmu__has_format utility as a general PMU utility. Fixes: 628eaa4e877af823 ("perf pmus: Add placeholder core PMU") Signed-off-by: Ian Rogers <[email protected]> Tested-by: Thomas Richter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-01perf pmus: Create placholder regardless of scanning core_onlyIan Rogers1-9/+7
If scanning all PMUs the placeholder is still necessary if no core PMU is found. This situation occurs in perf test's parse-events test, when uncore events appear before core. Fixes: 628eaa4e877af823 ("perf pmus: Add placeholder core PMU") Signed-off-by: Ian Rogers <[email protected]> Tested-by: Thomas Richter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-07-28perf build: Add Wextra for C++ compilationIan Rogers1-0/+3
Commit d58ac0bf8d1e ("perf build: Add clang and llvm compile and linking support") added -Wall and -Wno-strict-aliasing for CXXFLAGS, but not -Wextra. -Wno-strict-aliasing is no longer necessary. Adding -Wextra for CXXFLAGS requires adding -Wno-unused-parameter for clang.cpp and clang-test.cpp for LIBCLANGLLVM=1 to build. Signed-off-by: Ian Rogers <[email protected]> Acked-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-07-28perf bpf-loader: Remove unneeded diagnostic pragmaIan Rogers1-3/+0
The pragma was added during the transition to libbpf 1.0. The deprecated functions are no longer used, so the pragma can be removed. Signed-off-by: Ian Rogers <[email protected]> Acked-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Eduard Zingerman <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-07-28perf machine: Include data symbols in the kernel mapNamhyung Kim1-1/+3
When 'perf record -d' is used, it needs data mmaps to symbolize global data. But it failed to collect kernel data maps, so it could not symbolize them. Instead of having a separate map, just increase the kernel map size to include the data section. Probably we can have a separate kernel map for data, but the current code assumes a single kernel map. So it'd require more changes in other places and looks error-prone. I decided not to go that way for now. Also it seems the kernel module size already includes the data section. For example, my system has the following. $ grep -e _stext -e _etext -e _edata /proc/kallsyms ffffffff99800000 T _stext ffffffff9a601ac8 T _etext ffffffff9b446a00 D _edata Size of the text section is (0x9a601ac8 - 0x99800000 = 0xe01ac8) and size including data section is (0x9b446a00 - 0x99800000 = 0x1c46a00). Before: $ perf record -d true $ perf report -D | grep MMAP | head -1 0 0 0x460 [0x60]: PERF_RECORD_MMAP -1/0: [0xffffffff99800000(0xe01ac8) @ 0xffffffff99800000]: x [kernel.kallsyms]_text ^^^^^^^^ here After: $ perf report -D | grep MMAP | head -1 0 0 0x460 [0x60]: PERF_RECORD_MMAP -1/0: [0xffffffff99800000(0x1c46a00) @ 0xffffffff99800000]: x [kernel.kallsyms]_text ^^^^^^^^^ Instead of just replacing it with _edata, try _edata first and then fall back to _etext just in case. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
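The size arithmetic and the "_edata first, then _etext" fallback from the message above can be sketched as below. The function names are hypothetical, and representing a missing symbol as address 0 is an assumption of this sketch, not necessarily how perf signals it.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the end of the kernel map. Prefer _edata so the
 * map covers the data section; fall back to _etext when _edata is unknown
 * (passed as 0 here, an assumption of this sketch).
 */
uint64_t kernel_map_end(uint64_t etext, uint64_t edata)
{
	return edata ? edata : etext;
}

/* Size of the kernel map starting at _stext. */
uint64_t kernel_map_size(uint64_t stext, uint64_t etext, uint64_t edata)
{
	uint64_t end = kernel_map_end(etext, edata);

	assert(end >= stext);
	return end - stext;
}
```

With the addresses quoted in the message, `kernel_map_size(0xffffffff99800000, 0xffffffff9a601ac8, 0xffffffff9b446a00)` gives 0x1c46a00, and passing 0 for `edata` gives the text-only size 0xe01ac8.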
2023-07-28perf symbols: Add kallsyms__get_symbol_start()Namhyung Kim2-3/+29
Add kallsyms__get_symbol_start() to get any symbol address from kallsyms. The existing kallsyms__get_function_start() only allows text symbols, so create this to allow data symbols too. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
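The difference between a text-only lookup and an any-symbol lookup can be sketched as follows. This is not perf's implementation: `find_symbol_start()` is a hypothetical function that scans kallsyms-style text held in memory, and the `text_only` flag is an assumption made to show both behaviours in one place.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch: scan kallsyms-style text ("<hex addr> <type> <name>"
 * per line) for a symbol. When text_only is non-zero, only 'T'/'t' symbols
 * match.
 * Returns 0 and stores the address on success, -1 if not found.
 */
int find_symbol_start(const char *kallsyms, const char *name,
		      int text_only, uint64_t *addr)
{
	const char *line = kallsyms;

	assert(kallsyms && name && addr);

	while (*line) {
		unsigned long long start;
		char type;
		char sym[128];

		if (sscanf(line, "%llx %c %127s", &start, &type, sym) == 3 &&
		    strcmp(sym, name) == 0 &&
		    (!text_only || type == 'T' || type == 't')) {
			*addr = start;
			return 0;
		}

		/* Advance to the next line. */
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;
	}
	return -1;
}
```

Looking up a data symbol such as `_edata` fails with `text_only` set and succeeds without it, which is the gap the new function is meant to close.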
2023-07-28perf parse-events: Remove ABORT_ONIan Rogers1-8/+14
Prefer informative messages rather than none with ABORT_ON. Document one failure mode and add an error message for another. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-07-28perf parse-events: Improve location for add pmuIan Rogers3-11/+13
Improve the location for add PMU for cases when PMUs aren't found. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>