aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2018-11-21perf script: Share code and output format for uregs and iregs outputMilian Wolff1-23/+17
The iregs output was missing the newline at end as well as the leading ABI output. This made it hard to compare the iregs and uregs values. Instead, use a single function to output the register values and use it for both, iregs and uregs, to ensure the output is consistent. Before: perf 7049 [-01] 1343.354347: 1 cycles:ppp: ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) AX:0x80000000 BX:0x0 CX:0x0 DX:0x7 SI:0xf DI:0x286 BP:0xffff95bc8213a460 SP:0xffffacbf0ba97d18 IP:0xffffffffa7bc21cd FLAGS:0x28e CS:0x10 SS:0x18 R8:0x2 R9:0x21440 R10:0x33816fb3b8c R11:0x1 R12:0xffff95bc8213a460 R13:0xffff95bc8213a400 R14:0xffff95bc8213a400 R15:0x1 ABI:2 AX:0xffffffffffffffda BX:0xffffffffffffffff CX:0x7f84ad85798b DX:0x560209699d50 SI:0x7ffe2c7a6820 DI:0x7ffe2c7a8c9b BP:0x7ffe2c7a20d0 SP:0x7ffe2c7a2058 IP:0x7f84ad85798b FLAGS:0x206 CS:0x33 SS:0x2b R8:0x7ffe2c7a2030 R9:0x7f84ae55f010 R10:0x8 R11:0x206 R12:0xffffffffffffffff R13:0xffffffffffffffff R14:0xffffffffffffffff R15:0xffffffffffffffff perf 7049 [-01] 1343.354363: 1 cycles:ppp: ... After: perf 7049 [-01] 1343.354347: 1 cycles:ppp: ffffffffa7bc21ce perf_event_exec+0x18e (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7ead3 setup_new_exec+0xf3 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7cd7be5 load_elf_binary+0x395 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7e540 search_binary_handler+0x80 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f1aa __do_execve_file.isra.13+0x58a (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f561 do_execve+0x21 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7c7f596 __x64_sys_execve+0x26 (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa7a041cb do_syscall_64+0x5b (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ffffffffa840008c entry_SYSCALL_64+0x7c (/lib/modules/4.20.0-rc1perf-devel-05115-gc0bc98f76e39-dirty/build/vmlinux) ABI:2 AX:0x80000000 BX:0x0 CX:0x0 DX:0x7 SI:0xf DI:0x286 BP:0xffff95bc8213a460 SP:0xffffacbf0ba97d18 IP:0xffffffffa7bc21cd FLAGS:0x28e CS:0x10 SS:0x18 R8:0x2 R9:0x21440 R10:0x33816fb3b8c R11:0x1 R12:0xffff95bc8213a460 R13:0xffff95bc8213a400 R14:0xffff95bc8213a400 R15:0x1 ABI:2 AX:0xffffffffffffffda BX:0xffffffffffffffff CX:0x7f84ad85798b DX:0x560209699d50 SI:0x7ffe2c7a6820 DI:0x7ffe2c7a8c9b BP:0x7ffe2c7a20d0 SP:0x7ffe2c7a2058 IP:0x7f84ad85798b FLAGS:0x206 CS:0x33 SS:0x2b R8:0x7ffe2c7a2030 R9:0x7f84ae55f010 R10:0x8 R11:0x206 R12:0xffffffffffffffff R13:0xffffffffffffffff R14:0xffffffffffffffff R15:0xffffffffffffffff perf 7049 [-01] 1343.354363: 1 cycles:ppp: ... Signed-off-by: Milian Wolff <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf bpf: Reduce the hardcoded .max_entries for pid_mapsArnaldo Carvalho de Melo1-1/+9
While working on augmented syscalls I got into this error: # trace -vv --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1 <SNIP> libbpf: map 0 is "__augmented_syscalls__" libbpf: map 1 is "__bpf_stdout__" libbpf: map 2 is "pids_filtered" libbpf: map 3 is "syscalls" libbpf: collecting relocating info for: '.text' libbpf: relo for 13 value 84 name 133 libbpf: relocation: insn_idx=3 libbpf: relocation: find map 3 (pids_filtered) for insn 3 libbpf: collecting relocating info for: 'raw_syscalls:sys_enter' libbpf: relo for 8 value 0 name 0 libbpf: relocation: insn_idx=1 libbpf: relo for 8 value 0 name 0 libbpf: relocation: insn_idx=3 libbpf: relo for 9 value 28 name 178 libbpf: relocation: insn_idx=36 libbpf: relocation: find map 1 (__augmented_syscalls__) for insn 36 libbpf: collecting relocating info for: 'raw_syscalls:sys_exit' libbpf: relo for 8 value 0 name 0 libbpf: relocation: insn_idx=0 libbpf: relo for 8 value 0 name 0 libbpf: relocation: insn_idx=2 bpf: config program 'raw_syscalls:sys_enter' bpf: config program 'raw_syscalls:sys_exit' libbpf: create map __bpf_stdout__: fd=3 libbpf: create map __augmented_syscalls__: fd=4 libbpf: create map syscalls: fd=5 libbpf: create map pids_filtered: fd=6 libbpf: added 13 insn from .text to prog raw_syscalls:sys_enter libbpf: added 13 insn from .text to prog raw_syscalls:sys_exit libbpf: load bpf program failed: Operation not permitted libbpf: failed to load program 'raw_syscalls:sys_exit' libbpf: failed to load object 'tools/perf/examples/bpf/augmented_raw_syscalls.c' bpf: load objects failed: err=-4009: (Incorrect kernel version) event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Failed to load program for unknown reason (add -v to see detail) Run 'perf list' for a list of valid events Usage: perf trace [<options>] [<command>] or: perf trace [<options>] -- <command> [<options>] or: perf trace record [<options>] [<command>] or: perf trace record [<options>] -- <command> [<options>] -e, --event <event> event/syscall selector. use 'perf list' to list available events If I then try to use strace (perf trace'ing 'perf trace' needs some more work before its possible) to get a bit more info I get: # strace -e bpf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__bpf_stdout__", map_ifindex=0}, 72) = 3 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERF_EVENT_ARRAY, key_size=4, value_size=4, max_entries=4, map_flags=0, inner_map_fd=0, map_name="__augmented_sys", map_ifindex=0}, 72) = 4 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=1, max_entries=500, map_flags=0, inner_map_fd=0, map_name="syscalls", map_ifindex=0}, 72) = 5 bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=512, map_flags=0, inner_map_fd=0, map_name="pids_filtered", map_ifindex=0}, 72) = 6 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=57, insns=0x1223f50, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_enter", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = 7 bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted) bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=18, insns=0x1224120, license="GPL", log_level=1, log_size=262144, log_buf="", kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted) bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x1224120, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(4, 18, 10), prog_flags=0, prog_name="sys_exit", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS}, 72) = -1 EPERM (Operation not permitted) event syntax error: 'tools/perf/examples/bpf/augmented_raw_syscalls.c' \___ Failed to load program for unknown reason <SNIP similar output as without 'strace'> # I managed to create the maps, etc, but then installing the "sys_exit" hook into the "raw_syscalls:sys_exit" tracepoint somehow gets -EPERMed... I then go and try reducing the size of this new table: +++ b/tools/perf/examples/bpf/augmented_raw_syscalls.c @@ -47,6 +47,17 @@ struct augmented_filename { #define SYS_OPEN 2 #define SYS_OPENAT 257 +struct syscall { + bool filtered; +}; + +struct bpf_map SEC("maps") syscalls = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(int), + .value_size = sizeof(struct syscall), + .max_entries = 500, +}; And after reducing that .max_entries a tad, it works. So yeah, the "unknown reason" should be related to the number of bytes all this is taking, reduce the default for pid_map()s so that we can have a "syscalls" map with enough slots for all syscalls in most arches. And take notes about this error message, improve it :-) Cc: Adrian Hunter <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David Ahern <[email protected]> Cc: Edward Cree <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Cc: Yonghong Song <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf script: Add newline after uregs outputMilian Wolff1-0/+2
This change makes it much easier to easily distinguish between consecutive samples by keeping the empty line between them, like we see when we do not enable uregs output. Before: cpp-inlining 28298 [-01] 54837.342780: 3068085 cycles:pp: 7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so) ... ABI:2 AX:0x0 BX:0x40f56cf6 CX:0x294a3ae7 ... cpp-inlining 28298 [-01] 54837.344493: 2881929 cycles:pp: 7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so) ... ABI:2 AX:0x40d440c7 BX:0x40d440c7 CX:0x4d45e5da ... After: cpp-inlining 28298 [-01] 54837.342780: 3068085 cycles:pp: 7ffff7c96709 __hypot_finite+0xa9 (/usr/lib/libm-2.28.so) ... ABI:2 AX:0x0 BX:0x40f56cf6 CX:0x294a3ae7 ... cpp-inlining 28298 [-01] 54837.344493: 2881929 cycles:pp: 7ffff7c96696 __hypot_finite+0x36 (/usr/lib/libm-2.28.so) ... ABI:2 AX:0x40d440c7 BX:0x40d440c7 CX:0x4d45e5da ... Signed-off-by: Milian Wolff <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21Revert "perf augmented_syscalls: Drop 'write', 'poll' for testing without ↵Arnaldo Carvalho de Melo1-4/+0
self pid filter" Now that we have the "filtered_pids" logic in place, no need to do this rough filter to avoid the feedback loop from 'perf trace's own syscalls, revert it. This reverts commit 7ed71f124284359676b6496ae7db724fee9da753. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf augmented_syscalls: Remove example hardcoded set of filtered pidsArnaldo Carvalho de Melo1-27/+0
Now that 'perf trace' fills in that "filtered_pids" BPF map, remove the set of filtered pids used as an example to test that feature. That feature works like this: Starting a system wide 'strace' like 'perf trace' augmented session we noticed that lots of events take place for a pid, which ends up being the feedback loop of perf trace's syscalls being processed by the 'gnome-terminal' process: # perf trace -e tools/perf/examples/bpf/augmented_raw_syscalls.c 0.391 ( 0.002 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f750bc, count: 8176) = 453 0.394 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = -1 EAGAIN Resource temporarily unavailable 0.438 ( 0.001 ms): gnome-terminal/2469 read(fd: 4<anon_inode:[eventfd]>, buf: 0x7fffc696aeb0, count: 16) = 8 0.519 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f75280, count: 7724) = 114 0.522 ( 0.001 ms): gnome-terminal/2469 read(fd: 17</dev/ptmx>, buf: 0x564b79f752f1, count: 7611) = -1 EAGAIN Resource temporarily unavailable ^C So we can use --filter-pids to get rid of that one, and in this case what is being used to implement that functionality is that "filtered_pids" BPF map that the tools/perf/examples/bpf/augmented_raw_syscalls.c created and that 'perf trace' bpf loader noticed and created a "struct bpf_map" associated that then got populated by 'perf trace': # perf trace --filter-pids 2469 -e tools/perf/examples/bpf/augmented_raw_syscalls.c 0.020 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 12<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1 0.025 ( 0.002 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = 48 0.029 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8258, count: 8088) = -1 EAGAIN Resource temporarily unavailable 0.032 ( 0.001 ms): gnome-shell/1663 read(fd: 24</dev/input/event4>, buf: 0x560c01bb8240, count: 8112) = -1 EAGAIN Resource temporarily unavailable 0.040 ( 0.003 ms): gnome-shell/1663 recvmsg(fd: 46<socket:[35893]>, msg: 0x7ffd8f3ef950) = -1 EAGAIN Resource temporarily unavailable 21.529 ( 0.002 ms): gnome-shell/1663 epoll_pwait(epfd: 5<anon_inode:[eventpoll]>, events: 0x7ffd8f3ef960, maxevents: 32, sigsetsize: 8) = 1 21.533 ( 0.004 ms): gnome-shell/1663 recvmsg(fd: 82<socket:[42826]>, msg: 0x7ffd8f3ef7b0, flags: DONTWAIT|CMSG_CLOEXEC) = 236 21.581 ( 0.006 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_BUSY, arg: 0x7ffd8f3ef060) = 0 21.605 ( 0.020 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_CREATE, arg: 0x7ffd8f3eeea0) = 0 21.626 ( 0.119 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_SET_DOMAIN, arg: 0x7ffd8f3eee94) = 0 21.746 ( 0.081 ms): gnome-shell/1663 ioctl(fd: 8</dev/dri/card0>, cmd: DRM_I915_GEM_PWRITE, arg: 0x7ffd8f3eeea0) = 0 ^C Oops, yet another gnome process that is involved with the output that 'perf trace' generates, lets filter that out too: # perf trace --filter-pids 2469,1663 -e tools/perf/examples/bpf/augmented_raw_syscalls.c ? ( ): wpa_supplicant/1366 ... [continued]: select()) = 0 Timeout 0.006 ( 0.002 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0 0.011 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e3e0) = 0 0.014 ( 0.001 ms): wpa_supplicant/1366 clock_gettime(which_clock: BOOTTIME, tp: 0x7fffe5b1e430) = 0 ? ( ): gmain/1791 ... [continued]: poll()) = 0 Timeout 0.017 ( ): wpa_supplicant/1366 select(n: 6, inp: 0x55646fed3ad0, outp: 0x55646fed3b60, exp: 0x55646fed3bf0, tvp: 0x7fffe5b1e4a0) ... 157.879 ( 0.019 ms): gmain/1791 inotify_add_watch(fd: 8<anon_inode:inotify>, pathname: , mask: 16789454) = -1 ENOENT No such file or directory ? ( ): cupsd/1001 ... [continued]: epoll_pwait()) = 0 ? ( ): gsd-color/1908 ... [continued]: poll()) = 0 Timeout 499.615 ( ): cupsd/1001 epoll_pwait(epfd: 4<anon_inode:[eventpoll]>, events: 0x557a21166500, maxevents: 4096, timeout: 1000, sigsetsize: 8) ... 586.593 ( 0.004 ms): gsd-color/1908 recvmsg(fd: 3<socket:[38074]>, msg: 0x7ffdef34e800) = -1 EAGAIN Resource temporarily unavailable ? ( ): fwupd/2230 ... [continued]: poll()) = 0 Timeout ? ( ): rtkit-daemon/906 ... [continued]: poll()) = 0 Timeout ? ( ): rtkit-daemon/907 ... [continued]: poll()) = 1 724.603 ( 0.007 ms): rtkit-daemon/907 read(fd: 6<anon_inode:[eventfd]>, buf: 0x7f05ff768d08, count: 8) = 8 ? ( ): ssh/5461 ... [continued]: select()) = 1 810.431 ( 0.002 ms): ssh/5461 clock_gettime(which_clock: BOOTTIME, tp: 0x7ffd7f39f870) = 0 ^C Several syscall exit events for syscalls in flight when 'perf trace' started, etc. Saner :-) Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf trace: Fill in BPF "filtered_pids" map when presentArnaldo Carvalho de Melo1-13/+48
This makes the augmented_syscalls support the --filter-pids and auto-filtered feedback loop pids just like when working without BPF, i.e. with just raw_syscalls:sys_{enter,exit} and tracepoint filters. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf trace: See if there is a map named "filtered_pids"Arnaldo Carvalho de Melo1-1/+24
Lookup for the first map named "filtered_pids" and, if augmenting syscalls, i.e. if a BPF event is present and the "__augmented_syscalls__" is present, then fill in that map with the pids to filter, be it feedback loop ones (perf trace's pid, its father if it is "sshd", more auto-filtered in the future) or the ones explicitely stated in the tool command line via --filter-pids. The code to actually fill in the map comes next. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf trace: Add "_from_option" suffix to trace__set_filter()Arnaldo Carvalho de Melo1-3/+3
As we'll need that name for a new function to set filters for both tracepoints and BPF maps for filtering pids. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf evlist: Rename perf_evlist__set_filter* to perf_evlist__set_tp_filter*Arnaldo Carvalho de Melo3-10/+10
To better reflect that this is a tracepoint filter, as opposed, for instance to map based BPF filters. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf augmented_syscalls: Use pid_filterArnaldo Carvalho de Melo1-2/+32
Just to test filtering a bunch of pids, now its time to go and get that hooked up in 'perf trace', right after we load the bpf program, if we find a "pids_filtered" map defined, we'll populate it with the filtered pids. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf augmented_syscalls: Drop 'write', 'poll' for testing without self pid ↵Arnaldo Carvalho de Melo1-0/+4
filter When testing system wide tracing without filtering the syscalls called by 'perf trace' itself we get into a feedback loop, drop for now those two syscalls, that are the ones that 'perf trace' does in its loop for writing the syscalls it intercepts, to help with testing till we get that filtering in place. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf bpf: Add simple pid_filter class accessible to BPF proggiesArnaldo Carvalho de Melo1-0/+21
Will be used in the augmented_raw_syscalls.c to implement 'perf trace --filter-pids'. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf bpf: Add defines for map insertion/lookupArnaldo Carvalho de Melo1-0/+11
Starting with a helper for a basic pid_map(), a hash using a pid as a key. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf augmented_syscalls: Remove needless linux/socket.h includeArnaldo Carvalho de Melo1-1/+0
Leftover from when we started augmented_raw_syscalls.c from tools/perf/examples/bpf/augmented_syscalls. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: e58a0322dbac ("perf examples bpf: Start augmenting raw_syscalls:sys_{start,exit}") Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf augmented_syscalls: Filter on a hard coded pidArnaldo Carvalho de Melo1-1/+5
Just to show where we'll hook pid based filters, and what we use to obtain the current pid, using a BPF getpid() equivalent. Now we need to remove that hardcoded PID with a BPF hash map, so that we start by filtering 'perf trace's own PID, implement the --filter-pid functionality, etc. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21perf bpf: Add unistd.h to the headers accessible to bpf proggiesArnaldo Carvalho de Melo1-0/+10
Start with a getpid() function wrapping BPF_FUNC_get_current_pid_tgid, idea is to mimic the system headers. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-11-21Merge tag 'perf-urgent-for-mingo-4.20-20181121' of ↵Ingo Molnar16-4/+93
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes: - Update kernel ABI headers, one of them lead to a small change in the ioctl 'cmd' beautifier in 'perf trace' to support the new ISO7816 commands. (Arnaldo Carvalho de Melo) - Restore proper cwd on return from mnt namespace (Jiri Olsa) - Add feature check for the get_current_dir_name() function used in the namespace fix from Jiri, that is not available in systems such as Alpine Linux, which uses the musl libc (Arnaldo Carvalho de Melo) - Fix crash in 'perf record' when synthesizing the unit for events such as 'cpu-clock' (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2018-11-21Merge tag 'linux-cpupower-4.20-rc4' of ↵Rafael J. Wysocki7-14/+14
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for 4.20-rc4 from Shuah Khan: "This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow use of outside build flags and override of CFLAGS from Jiri Olsa, and fix to compilation with STATIC=true from Konstantin Khlebnikov." * tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true
2018-11-20tools/bpf: bpftool: add support for func typesYonghong Song5-0/+230
This patch added support to print function signature if btf func_info is available. Note that ksym now uses function name instead of prog_name as prog_name has a limit of 16 bytes including ending '\0'. The following is a sample output for selftests test_btf with file test_btf_haskv.o for translated insns and jited insns respectively. $ bpftool prog dump xlated id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): 0: (85) call pc+2#bpf_prog_2dcecc18072623fc_test_long_fname_1 1: (b7) r0 = 0 2: (95) exit int test_long_fname_1(struct dummy_tracepoint_args * arg): 3: (85) call pc+1#bpf_prog_89d64e4abf0f0126_test_long_fname_2 4: (95) exit int test_long_fname_2(struct dummy_tracepoint_args * arg): 5: (b7) r2 = 0 6: (63) *(u32 *)(r10 -4) = r2 7: (79) r1 = *(u64 *)(r1 +8) ... 22: (07) r1 += 1 23: (63) *(u32 *)(r0 +4) = r1 24: (95) exit $ bpftool prog dump jited id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: 0: push %rbp 1: mov %rsp,%rbp ...... 3c: add $0x28,%rbp 40: leaveq 41: retq int test_long_fname_1(struct dummy_tracepoint_args * arg): bpf_prog_2dcecc18072623fc_test_long_fname_1: 0: push %rbp 1: mov %rsp,%rbp ...... 3a: add $0x28,%rbp 3e: leaveq 3f: retq int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_89d64e4abf0f0126_test_long_fname_2: 0: push %rbp 1: mov %rsp,%rbp ...... 80: add $0x28,%rbp 84: leaveq 85: retq Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: enhance test_btf file testing to test func infoYonghong Song3-13/+136
Change the bpf programs test_btf_haskv.c and test_btf_nokv.c to have two sections, and enhance test_btf.c test_file feature to test btf func_info returned by the kernel. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: refactor to implement btf_get_from_id() in lib/bpfYonghong Song3-66/+72
The function get_btf() is implemented in tools/bpf/bpftool/map.c to get a btf structure given a map_info. This patch refactored this function to be function btf_get_from_id() in tools/lib/bpf so that it can be used later. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: do not use pahole if clang/llvm can generate BTF sectionsYonghong Song1-0/+8
Add additional checks in tools/testing/selftests/bpf and samples/bpf such that if clang/llvm compiler can generate BTF sections, do not use pahole. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: add support to read .BTF.ext sectionsYonghong Song4-15/+442
The .BTF section is already available to encode types. These types can be used for map pretty print. The whole .BTF will be passed to the kernel as well for which kernel can verify and return to the user space for pretty print etc. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information and line information. Note that this patch set only supports function type info. The functionality is implemented in libbpf. The .BTF section can be directly loaded into the kernel, and the .BTF.ext section cannot. The loader may need to do some relocation and merging, similar to merging multiple code sections, before loading into the kernel. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: extends test_btf to test load/retrieve func_type infoYonghong Song1-3/+329
A two function bpf program is loaded with btf and func_info. After successful prog load, the bpf_get_info syscall is called to retrieve prog info to ensure the types returned from the kernel matches the types passed to the kernel from the user space. Several negative tests are also added to test loading/retriving of func_type info. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: add new fields for program load in lib/bpfYonghong Song2-0/+8
The new fields are added for program load in lib/bpf so application uses api bpf_load_program_xattr() is able to load program with btf and func_info data. This functionality will be used in next patch by bpf selftest test_btf. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: sync kernel uapi bpf.h header to tools directoryYonghong Song1-0/+13
The kernel uapi bpf.h is synced to tools directory. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: Add tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNCMartin KaFai Lau2-2/+476
This patch adds unit tests for BTF_KIND_FUNC_PROTO and BTF_KIND_FUNC to test_btf. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20tools/bpf: Sync kernel btf.h headerMartin KaFai Lau1-3/+15
The kernel uapi btf.h is synced to the tools directory. Signed-off-by: Martin KaFai Lau <[email protected]> Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]>
2018-11-20objtool: Fix segfault in .cold detection with -ffunction-sectionsArtem Savkov1-3/+14
Because find_symbol_by_name() traverses the same lists as read_symbols(), changing sym->name in place without copying it affects the result of find_symbol_by_name(). In the case where a ".cold" function precedes its parent in sec->symbol_list, it can result in a function being considered a parent of itself. This leads to function length being set to 0 and other consequent side-effects including a segfault in add_switch_table(). The effects of this bug are only visible when building with -ffunction-sections in KCFLAGS. Fix by copying the search string instead of modifying it in place. Signed-off-by: Artem Savkov <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/910abd6b5a4945130fd44f787c24e07b9e07c8da.1542736240.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2018-11-20objtool: Fix double-free in .cold detection error pathArtem Savkov1-1/+1
If read_symbols() fails during second list traversal (the one dealing with ".cold" subfunctions) it frees the symbol, but never deletes it from the list/hash_table resulting in symbol being freed again in elf_close(). Fix it by just returning an error, leaving cleanup to elf_close(). Signed-off-by: Artem Savkov <[email protected]> Signed-off-by: Josh Poimboeuf <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 13810435b9a7 ("objtool: Support GCC 8's cold subfunctions") Link: http://lkml.kernel.org/r/beac5a9b7da9e8be90223459dcbe07766ae437dd.1542736240.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <[email protected]>
2018-11-19selftests: mlxsw: Add a test for VxLAN floodingIdo Schimmel1-0/+309
The device stores flood records in a singly linked list where each record stores up to three IPv4 addresses of remote VTEPs. The test verifies that packets are correctly flooded in various cases such as deletion of a record in the middle of the list. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: mlxsw: Add a test for VxLAN configurationIdo Schimmel1-0/+664
Test various aspects of VxLAN offloading which are specific to mlxsw, such as sanitization of invalid configurations and offload indication. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d_port_8472: New testPetr Machata1-0/+10
This simple wrapper reruns the VXLAN ping test with a port number of 8472. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add an ECN decap testPetr Machata1-0/+116
Test that when decapsulating from VXLAN, the values of inner and outer TOS are handled appropriately. Because VXLAN driver on its own won't produce the arbitrary TOS combinations necessary to test this feature, simply open-code a single ICMP packet and have mausezahn assemble it. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add an ECN encap testPetr Machata1-0/+26
Test that ECN bits in the VXLAN envelope are correctly deduced from the overlay packet. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add a TOS testPetr Machata1-0/+14
Test that TOS is inherited from the tunneled packet into the envelope as configured at the VXLAN device. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add a TTL testPetr Machata1-0/+33
This tests whether TTL of VXLAN envelope packets is properly set based on the device configuration. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Reconfigure & rerun testsPetr Machata1-0/+26
The ordering of the topology creation can have impact on whether a driver is successful in offloading VXLAN. Therefore add a pseudo-test that reshuffles bits of the topology, and then reruns the same suite of tests again to make sure that the new setup is supported as well. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add unicast testPetr Machata1-0/+54
Test that when sending traffic to a learned MAC address, the traffic is forwarded accurately only to the right endpoint. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add flood testPetr Machata1-0/+105
Test that when sending traffic to an unlearned MAC address, the traffic is flooded to both remote VXLAN endpoints. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: vxlan_bridge_1d: Add ping testPetr Machata1-0/+8
Test end-to-end reachability between local and remote endpoints. Note that because learning is disabled on the VXLAN device, the ICMP requests will end up being flooded to all remotes. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: Add a skeleton of vxlan_bridge_1dPetr Machata1-0/+296
This skeleton sets up a topology with three VXLAN endpoints: one "local", possibly offloaded, and two "remote", formed using veth pairs and likely purely software bridges. The "local" endpoint is connected to host systems by a VLAN-unaware bridge. Since VXLAN tunnels must be unique per namespace, each of the "remote" endpoints is in its own namespace. H3 forms the bridge between the three domains. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: lib: Add link_stats_rx_errors_get()Petr Machata1-2/+15
Such a function will be useful for counting malformed packets in the ECN decap test. To that end, introduce a common handler for handling stat-fetching, and reuse it in link_stats_tx_packets_get() and link_stats_rx_errors_get(). Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: ping{6, }_do(): Allow passing ping argumentsPetr Machata1-2/+4
Make the ping routine more generic by allowing passing arbitrary ping command-line arguments. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: ping{6, }_test(): Add description argumentPetr Machata1-2/+2
Have ping_test() recognize an optional argument with a description of the test. This is handy if there are several ping test, to make it clear which is which. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: lib: Add in_ns()Petr Machata1-0/+11
In order to run a certain command inside another network namespace, it's possible to use "ip netns exec ns command". However then one can't use functions defined in lib.sh or a test suite. One option is to do "ip netns exec ns bash -c command", provided that all functions that one wishes to use (and their dependencies) are published using "export -f". That may not be practical. Therefore, introduce a helper in_ns(), which wraps a given command in a boilerplate of "ip netns exec" and "source lib.sh", thus making all library functions available. (Custom functions that a script wishes to run within a namespace still need to be exported.) Because quotes in "$@" aren't recognized in heredoc, hand-expand the array in an explicit for loop, leveraging printf %q to handle proper quoting. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-19selftests: forwarding: lib: Support NUM_NETIFS of 0Petr Machata1-2/+2
So far the case of NUM_NETIFS of 0 has not been interesting. However if one wishes to reuse the lib.sh routines in a setup of a separate namespace, being able to import like this is handy. Therefore replace the {1..$NUM_NETIFS} references, which cause iteration over 1 and 0, with an explicit for loop like we do in setup_wait() and tc_offload_check(), so that for NUM_NETIFS of 0 no iteration is done. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-11-20tools: add selftest for BPF_F_ZERO_SEEDLorenz Bauer1-9/+55
Check that iterating two separate hash maps produces the same order of keys if BPF_F_ZERO_SEED is used. Signed-off-by: Lorenz Bauer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-11-20tools: sync linux/bpf.hLorenz Bauer1-3/+10
Synchronize changes to linux/bpf.h from * "bpf: allow zero-initializing hash map seed" * "bpf: move BPF_F_QUERY_EFFECTIVE after map flags" Signed-off-by: Lorenz Bauer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-11-20bpf: libbpf: retry map creation without the nameStanislav Fomichev1-1/+10
Since commit 88cda1c9da02 ("bpf: libbpf: Provide basic API support to specify BPF obj name"), libbpf unconditionally sets bpf_attr->name for maps. Pre v4.14 kernels don't know about map names and return an error about unexpected non-zero data. Retry sys_bpf without a map name to cover older kernels. v2 changes: * check for errno == EINVAL as suggested by Daniel Borkmann Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>