aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2019-10-15perf evlist: Fix fix for freed id arraysAndi Kleen1-1/+1
In the earlier fix for the memory overrun of id arrays I managed to typo the wrong event in the fix. Of course we need to close the current event in the loop, not the original failing event. The same test case as in the original patch still passes. Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays") Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()Thomas Richter1-1/+5
The build of file libperf-jvmti.so succeeds but the resulting object fails to load: # ~/linux/tools/perf/perf record -k mono -- java \ -XX:+PreserveFramePointer \ -agentpath:/root/linux/tools/perf/libperf-jvmti.so \ hog 100000 123450 Error occurred during initialization of VM Could not find agent library /root/linux/tools/perf/libperf-jvmti.so in absolute path, with error: /root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype Add the missing _ctype symbol into the build script. Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()") Signed-off-by: Thomas Richter <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf trace: Add syscall failure stats to -s/--summary and -S/--with-summaryArnaldo Carvalho de Melo1-24/+34
Just like strace has: # trace -s sleep 1 Summary of events: sleep (32370), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.402 1000.402 1000.402 1000.402 0.00% mmap 8 0 0.023 0.002 0.003 0.004 8.49% close 5 0 0.015 0.001 0.003 0.009 51.39% mprotect 4 0 0.014 0.002 0.003 0.005 16.95% openat 3 0 0.013 0.003 0.004 0.005 14.29% munmap 1 0 0.010 0.010 0.010 0.010 0.00% read 4 0 0.005 0.001 0.001 0.002 16.83% brk 4 0 0.004 0.001 0.001 0.002 20.82% access 1 1 0.004 0.004 0.004 0.004 0.00% fstat 3 0 0.003 0.001 0.001 0.001 12.17% lseek 3 0 0.003 0.001 0.001 0.001 11.45% arch_prctl 2 1 0.002 0.001 0.001 0.001 2.30% execve 1 0 0.000 0.000 0.000 0.000 0.00% # # perf trace -S sleep 1 ? ... [continued]: execve()) = 0 0.028 brk(brk: NULL) = 0x559f5bd96000 0.033 arch_prctl(option: 0x3001, arg2: 0x7ffda8b715a0) = -1 EINVAL (Invalid argument) 0.046 access(filename: "/etc/ld.so.preload", mode: R) = -1 ENOENT (No such file or directory) 0.055 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3 0.060 fstat(fd: 3, statbuf: 0x7ffda8b707a0) = 0 0.062 mmap(addr: NULL, len: 134346, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3aedfc4000 0.066 close(fd: 3) = 0 0.079 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) = 3 0.085 read(fd: 3, buf: 0x7ffda8b70948, count: 832) = 832 0.088 lseek(fd: 3, offset: 792, whence: SET) = 792 0.090 read(fd: 3, buf: 0x7ffda8b70810, count: 68) = 68 0.093 fstat(fd: 3, statbuf: 0x7ffda8b707f0) = 0 0.095 mmap(addr: NULL, len: 8192, prot: READ|WRITE, flags: PRIVATE|ANONYMOUS) = 0x7f3aedfc2000 0.101 lseek(fd: 3, offset: 792, whence: SET) = 792 0.103 read(fd: 3, buf: 0x7ffda8b70450, count: 68) = 68 0.105 lseek(fd: 3, offset: 864, whence: SET) = 864 0.107 read(fd: 3, buf: 0x7ffda8b70470, count: 32) = 32 0.110 mmap(addr: NULL, len: 1857472, prot: READ, flags: PRIVATE|DENYWRITE, fd: 3, off: 0) = 0x7f3aeddfc000 0.114 mprotect(start: 0x7f3aede1e000, len: 1679360, prot: NONE) = 0 0.121 mmap(addr: 0x7f3aede1e000, len: 1363968, prot: READ|EXEC, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x22000) = 0x7f3aede1e000 0.127 mmap(addr: 0x7f3aedf6b000, len: 311296, prot: READ, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x16f000) = 0x7f3aedf6b000 0.131 mmap(addr: 0x7f3aedfb8000, len: 24576, prot: READ|WRITE, flags: PRIVATE|FIXED|DENYWRITE, fd: 3, off: 0x1bb000) = 0x7f3aedfb8000 0.138 mmap(addr: 0x7f3aedfbe000, len: 14272, prot: READ|WRITE, flags: PRIVATE|FIXED|ANONYMOUS) = 0x7f3aedfbe000 0.147 close(fd: 3) = 0 0.158 arch_prctl(option: SET_FS, arg2: 0x7f3aedfc3580) = 0 0.210 mprotect(start: 0x7f3aedfb8000, len: 16384, prot: READ) = 0 0.230 mprotect(start: 0x559f5b27d000, len: 4096, prot: READ) = 0 0.236 mprotect(start: 0x7f3aee00f000, len: 4096, prot: READ) = 0 0.240 munmap(addr: 0x7f3aedfc4000, len: 134346) = 0 0.300 brk(brk: NULL) = 0x559f5bd96000 0.302 brk(brk: 0x559f5bdb7000) = 0x559f5bdb7000 0.305 brk(brk: NULL) = 0x559f5bdb7000 0.310 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) = 3 0.315 fstat(fd: 3, statbuf: 0x7f3aedfbdac0) = 0 0.318 mmap(addr: NULL, len: 217750512, prot: READ, flags: PRIVATE, fd: 3, off: 0) = 0x7f3ae0e52000 0.325 close(fd: 3) = 0 0.358 nanosleep(rqtp: 0x7ffda8b714b0, rmtp: NULL) = 0 1000.622 close(fd: 1) = 0 1000.641 close(fd: 2) = 0 1000.664 exit_group(error_code: 0) = ? Summary of events: sleep (722), 80 events, 93.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1000.194 1000.194 1000.194 1000.194 0.00% mmap 8 0 0.025 0.002 0.003 0.005 10.17% close 5 0 0.018 0.001 0.004 0.010 50.18% mprotect 4 0 0.016 0.003 0.004 0.006 16.81% openat 3 0 0.011 0.003 0.004 0.004 6.57% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.005 0.001 0.001 0.002 20.72% read 4 0 0.005 0.001 0.001 0.002 16.71% access 1 1 0.005 0.005 0.005 0.005 0.00% fstat 3 0 0.004 0.001 0.001 0.002 14.82% lseek 3 0 0.003 0.001 0.001 0.001 11.66% arch_prctl 2 1 0.002 0.001 0.001 0.001 3.59% execve 1 0 0.000 0.000 0.000 0.000 0.00% # Works for system wide, e.g. for 1ms: # perf trace -s -a sleep 0.001 Summary of events: sleep (768), 94 events, 37.9% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ nanosleep 1 0 1.133 1.133 1.133 1.133 0.00% execve 7 6 0.351 0.003 0.050 0.316 88.53% mmap 8 0 0.024 0.002 0.003 0.004 8.86% mprotect 4 0 0.017 0.003 0.004 0.006 16.02% openat 3 0 0.013 0.004 0.004 0.005 8.34% munmap 1 0 0.010 0.010 0.010 0.010 0.00% brk 4 0 0.007 0.001 0.002 0.002 10.99% close 5 0 0.005 0.001 0.001 0.002 11.69% read 5 0 0.005 0.000 0.001 0.002 30.53% access 1 1 0.004 0.004 0.004 0.004 0.00% fstat 3 0 0.004 0.001 0.001 0.002 10.74% lseek 3 0 0.003 0.001 0.001 0.001 10.20% arch_prctl 2 1 0.002 0.001 0.001 0.001 3.34% Web Content (21258), 46 events, 18.5% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 12 12 0.015 0.001 0.001 0.002 8.50% futex 2 0 0.008 0.003 0.004 0.005 27.08% poll 6 0 0.006 0.000 0.001 0.002 22.14% read 2 0 0.006 0.002 0.003 0.003 26.08% write 1 0 0.002 0.002 0.002 0.002 0.00% Web Content (4365), 36 events, 14.5% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 10 10 0.015 0.001 0.002 0.003 11.83% poll 5 0 0.006 0.000 0.001 0.002 28.44% futex 2 0 0.005 0.001 0.003 0.004 48.29% read 1 0 0.003 0.003 0.003 0.003 0.00% Timer (21275), 14 events, 5.6% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 6 1 0.240 0.000 0.040 0.149 64.58% write 1 0 0.008 0.008 0.008 0.008 0.00% Timer (4383), 14 events, 5.6% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 6 2 0.186 0.000 0.031 0.181 96.45% write 1 0 0.010 0.010 0.010 0.010 0.00% Web Content (20354), 28 events, 11.3% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ recvmsg 8 8 0.010 0.001 0.001 0.002 15.24% poll 4 0 0.004 0.000 0.001 0.002 35.68% futex 1 0 0.003 0.003 0.003 0.003 0.00% read 1 0 0.003 0.003 0.003 0.003 0.00% Timer (20371), 10 events, 4.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ futex 4 1 0.077 0.000 0.019 0.075 95.46% write 1 0 0.005 0.005 0.005 0.005 0.00% [root@quaco ~]# Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf stat: Support --all-kernel/--all-userJin Yao4-0/+24
'perf record' has supported --all-kernel / --all-user to configure all used events to run in kernel space or run in user space. But 'perf stat' doesn't support these options. It would be useful to support these options in 'perf stat' too to keep the same semantics available in both tools. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf jvmti: Link against tools/lib/ctype.h to have weak strlcpy()Thomas Richter1-1/+5
The build of file libperf-jvmti.so succeeds but the resulting object fails to load: # ~/linux/tools/perf/perf record -k mono -- java \ -XX:+PreserveFramePointer \ -agentpath:/root/linux/tools/perf/libperf-jvmti.so \ hog 100000 123450 Error occurred during initialization of VM Could not find agent library /root/linux/tools/perf/libperf-jvmti.so in absolute path, with error: /root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype Add the missing _ctype symbol into the build script. Fixes: 79743bc927f6 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()") Signed-off-by: Thomas Richter <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf annotate: Fix objdump --no-show-raw-insn flagIan Rogers1-2/+2
Remove redirection of objdump's stderr to /dev/null to help diagnose failures. Fix the '--no-show-raw' flag to be '--no-show-raw-insn' which binutils is permissive and allows, but fails with LLVM objdump. Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf annotate: Don't pipe objdump output through 'expand' commandIan Rogers1-19/+76
Avoiding a pipe allows objdump command failures to surface. Move to the caller of symbol__parse_objdump_line the call to strim that removes leading and trailing tabs. Add a new expand_tabs function that if a tab is present allocate a new line in which tabs are expanded. In symbol__parse_objdump_line the line had no leading spaces, so simplify the line_ip processing. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf annotate: Don't pipe objdump output through 'grep' commandIan Rogers1-1/+8
Simplify the objdump command by not piping the output of objdump through grep. Instead, drop lines that match the grep pattern during the reading loop. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf annotate: Use libsubcmd's run-command.h to fork objdumpIan Rogers1-35/+37
Reduce duplicated logic by using the subcmd library. Ensure when errors occur they are reported to the caller. Before this patch, if no lines are read the error status is 0. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Link: http://lore.kernel.org/lkml/[email protected] [ merged follow up fix for NULL termination as in the 2nd link above ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf annotate: Avoid reallocation in objdump parsingIan Rogers1-12/+14
Objdump output is parsed using getline which allocates memory for the read. Getline will realloc if the memory is too small, but currently the line is always freed after the call. Simplify parse_objdump_line by performing the reading in symbol__disassemble. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf report: Add warning when libunwind not compiled inJin Yao1-0/+7
We received a user report that call-graph DWARF mode was enabled in 'perf record' but 'perf report' didn't unwind the callstack correctly. The reason was, libunwind was not compiled in. We can use 'perf -vv' to check the compiled libraries but it would be valuable to report a warning to user directly (especially valuable for a perf newbie). The warning is: Warning: Please install libunwind development packages during the perf build. Both TUI and stdio are supported. Signed-off-by: Jin Yao <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf test: Avoid infinite loop for task exit caseLeo Yan1-0/+8
When executing the task exit testing case, perf gets stuck in an endless loop this case and doesn't return back on Arm64 Juno board. After digging into this issue, since Juno board has Arm's big.LITTLE CPUs, thus the PMUs are not compatible between the big CPUs and little CPUs. This leads to a PMU event that cannot be enabled properly when the traced task is migrated from one variant's CPU to another variant. Finally, the test case runs into infinite loop for cannot read out any event data after return from polling. Eventually, we need to work out formal solution to allow PMU events can be freely migrated from one CPU variant to another, but this is a difficult task and a different topic. This patch tries to fix the Perf test case to avoid infinite loop, when the testing detects 1000 times retrying for reading empty events, it will directly bail out and return failure. This allows the Perf tool can continue its other test cases. Signed-off-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf test: Report failure for mmap eventsLeo Yan1-0/+1
When fail to mmap events in task exit case, it misses to set 'err' to -1; thus the testing will not report failure for it. This patch sets 'err' to -1 when fails to mmap events, thus Perf tool can report correct result. Fixes: d723a55096b8 ("perf test: Add test case for checking number of EXIT events") Signed-off-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf evlist: Fix fix for freed id arraysAndi Kleen1-1/+1
In the earlier fix for the memory overrun of id arrays I managed to typo the wrong event in the fix. Of course we need to close the current event in the loop, not the original failing event. The same test case as in the original patch still passes. Fixes: 7834fa948beb ("perf evlist: Fix access of freed id arrays") Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf script: Fix --reltime with --timeAndi Kleen3-5/+32
My earlier patch to just enable --reltime with --time was a little too optimistic. The --time parsing would accept absolute time, which is very confusing to the user. Support relative time in --time parsing too. This only works with recent perf record that records the first sample time. Otherwise we error out. Fixes: 3714437d3fcc ("perf script: Allow --time with --reltime") Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-15perf tools: Allow to build with -ltcmallocJiri Olsa2-0/+7
By using "make TCMALLOC=1" you can enable perf to be build for usage with libtcmalloc.so (gperftools). Get heap profile (tools/perf directory): $ <install gperftools> $ make TCMALLOC=1 DEBUG=1 $ HEAPPROFILE=/tmp/heapprof ./perf ... $ pprof ./perf /tmp/heapprof.000* (pprof) top Total: 2335.5 MB 1735.1 74.3% 74.3% 1735.1 74.3% memdup 402.0 17.2% 91.5% 402.0 17.2% zalloc 140.2 6.0% 97.5% 145.8 6.2% map__new 33.6 1.4% 98.9% 33.6 1.4% symbol__new 12.4 0.5% 99.5% 12.4 0.5% alloc_event 6.2 0.3% 99.7% 6.2 0.3% nsinfo__new 5.5 0.2% 100.0% 5.5 0.2% nsinfo__copy 0.3 0.0% 100.0% 0.3 0.0% dso__new 0.1 0.0% 100.0% 0.1 0.0% do_read_string 0.0 0.0% 100.0% 0.0 0.0% __GI__IO_file_doallocate See callstack: $ pprof --pdf ./perf /tmp/heapprof.00* > callstack.pdf $ pprof --web ./perf /tmp/heapprof.00* Committer testing: Install gperftools, on fedora: # dnf install gperftools-devel Then build: $ make TCMALLOC=1 DEBUG=1 -C tools/perf O=/tmp/build/perf install-bin Verify that it linked against the right library: $ ldd ~/bin/perf | grep tcma libtcmalloc.so.4 => /lib64/libtcmalloc.so.4 (0x00007fb2953a7000) $ Run 'perf trace' system wide for 1 minute: # HEAPPROFILE=/tmp/heapprof perf trace -a sleep 1m <SNIP> 59985.524 ( 0.006 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafb0) = -1 EAGAIN (Resource temporarily unavailable) 59985.536 ( 0.005 ms): Web Content/20354 recvmsg(fd: 9<socket:[1762817]>, msg: 0x7ffee5fdafc0) = -1 EAGAIN (Resource temporarily unavailable) 59981.956 (10.143 ms): SCTP timer/21716 ... [continued]: select()) = 0 (Timeout) 59985.549 ( ): Web Content/20354 poll(ufds: 0x7f1df38af180, nfds: 3, timeout_msecs: 4294967295) ... 0.926 (59999.481 ms): sleep/29764 ... [continued]: nanosleep()) = 0 59992.133 ( ): SCTP timer/21716 select(tvp: 0x7ff5bf7fee80) ... 60000.477 ( 0.009 ms): sleep/29764 close(fd: 1) = 0 60000.493 ( 0.005 ms): sleep/29764 close(fd: 2) = 0 60000.514 ( ): sleep/29764 exit_group() = ? Dumping heap profile to /tmp/heapprof.0001.heap (Exiting, 3 MB in use) [root@quaco ~]# Install pprof: # dnf install pprof And run it: # pprof ~/bin/perf /tmp/heapprof.0001.heap Using local file /root/bin/perf. Using local file /tmp/heapprof.0001.heap. Welcome to pprof! For help, type 'help'. (pprof) top Total: 4.0 MB 1.7 42.0% 42.0% 2.2 54.1% map__new 0.9 23.3% 65.3% 0.9 23.3% zalloc 0.5 11.4% 76.7% 0.5 11.4% dso__new 0.2 5.6% 82.3% 0.3 8.5% trace__sys_enter 0.2 4.9% 87.2% 0.2 4.9% __GI___strdup 0.2 3.8% 91.0% 0.2 3.8% new_term 0.1 2.2% 93.2% 0.4 10.1% __perf_pmu__new_alias 0.0 1.0% 94.3% 0.0 1.2% event_read_fields 0.0 0.8% 95.1% 0.0 0.8% nsinfo__new 0.0 0.7% 95.8% 0.1 3.2% trace__read_syscall_info (pprof) Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-11perf diff: Report noisy for cycles diffJin Yao9-0/+203
This patch prints the stddev and hist for the cycles diff of program block. It can help us to understand if the cycles is noisy or not. This patch is inspired by Andi Kleen's patch: https://lwn.net/Articles/600471/ We create new option '--cycles-hist'. Example: perf record -b ./div perf record -b ./div perf diff -c cycles # Baseline [Program Block Range] Cycles Diff Shared Object Symbol # ........ .......................................................... .... ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 div [.] main 46.72% [div.c:42 -> div.c:44] 0 div [.] main 46.72% [div.c:42 -> div.c:39] 0 div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 [kernel.kallsyms] [k] native_write_msr When we enable the option '--cycles-hist', the output is perf diff -c cycles --cycles-hist # Baseline [Program Block Range] Cycles Diff stddev/Hist Shared Object Symbol # ........ .......................................................... .... ................. ................. ............................ # 46.72% [div.c:40 -> div.c:40] 0 ± 37.8% ▁█▁▁██▁█ div [.] main 46.72% [div.c:42 -> div.c:44] 0 ± 49.4% ▁▁▂█▂▂▂▂ div [.] main 46.72% [div.c:42 -> div.c:39] 0 ± 24.1% ▃█▂▄▁▃▂▁ div [.] main 20.54% [random_r.c:357 -> random_r.c:394] 1 ± 33.5% ▅▂▁█▃▁▂▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:357 -> random_r.c:380] 0 ± 39.4% ▁▁█▁██▅▁ libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:388] 0 libc-2.27.so [.] __random_r 20.54% [random_r.c:388 -> random_r.c:391] 0 ± 41.2% ▁▃▁▂█▄▃▁ libc-2.27.so [.] __random_r 17.04% [random.c:288 -> random.c:291] 0 ± 48.8% ▁▁▁▁███▁ libc-2.27.so [.] __random 17.04% [random.c:291 -> random.c:291] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:293 -> random.c:293] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 ±100.0% ▁█▁▁▁▁▁▁ libc-2.27.so [.] __random 17.04% [random.c:295 -> random.c:295] 0 libc-2.27.so [.] __random 17.04% [random.c:298 -> random.c:298] 0 ± 75.6% ▃█▁▁▁▁▁▁ libc-2.27.so [.] __random 8.40% [div.c:22 -> div.c:25] 0 ± 42.1% ▁▃▁▁███▁ div [.] compute_flag 8.40% [div.c:27 -> div.c:28] 0 ± 41.8% ██▁▁▄▁▁▄ div [.] compute_flag 5.14% [rand.c:26 -> rand.c:27] 0 ± 37.8% ▁▁▁████▁ libc-2.27.so [.] rand 5.14% [rand.c:28 -> rand.c:28] 0 libc-2.27.so [.] rand 2.15% [rand@plt+0 -> rand@plt+0] 0 div [.] rand@plt 0.00% [kernel.kallsyms] [k] __x86_indirect_thunk_rax 0.00% [do_mmap+714 -> do_mmap+732] -10 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+737 -> do_mmap+765] 1 [kernel.kallsyms] [k] do_mmap 0.00% [do_mmap+262 -> do_mmap+299] 0 [kernel.kallsyms] [k] do_mmap 0.00% [__x86_indirect_thunk_r15+0 -> __x86_indirect_thunk_r15+0] 7 [kernel.kallsyms] [k] __x86_indirect_thunk_r15 0.00% [native_sched_clock+0 -> native_sched_clock+119] -1 ± 38.5% ▄█▁ [kernel.kallsyms] [k] native_sched_clock 0.00% [native_write_msr+0 -> native_write_msr+16] -13 ± 47.1% ▁█▇▃▁▁ [kernel.kallsyms] [k] native_write_msr v8: --- Rebase to perf/core branch v7: --- 1. v6 got Jiri's ACK. 2. Rebase to latest perf/core branch. v6: --- 1. Jiri provides better code for using data__hpp_register() in ui_init(). Use this code in v6. v5: --- 1. Refine the use of data__hpp_register() in ui_init() according to Jiri's suggestion. v4: --- 1. Rename the new option from '--noisy' to '--cycles-hist' 2. Remove the option '-n'. 3. Only update the spark value and stats when '--cycles-hist' is enabled. 4. Remove the code of printing '..'. v3: --- 1. Move the histogram to a separate column 2. Move the svals[] out of struct stats v2: --- Jiri got a compile error, CC builtin-diff.o builtin-diff.c: In function ‘compute_cycles_diff’: builtin-diff.c:712:10: error: taking the absolute value of unsigned type ‘u64’ {aka ‘long unsigned int’} has no effect [-Werror=absolute-value] 712 | labs(pair->block_info->cycles_spark[i] - | ^~~~ Because the result of u64 - u64 is still u64. Now we change the type of cycles_spark[] to s64. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-11perf tools: Propagate CFLAGS to libperfJiri Olsa3-15/+18
Andi reported that 'make DEBUG=1' does not propagate to the libbperf code. It's true also for the other flags. Changing the code to propagate the global build flags to libperf compilation. Reported-by: Andi Kleen <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_evlist__filter_pollfd() from tools/perfJiri Olsa4-11/+19
Introduce the perf_evlist__filter_pollfd function and export it in the perf/evlist.h header, so that libperf users can check if the descriptor is still alive. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Introduce perf_evlist__purge()Jiri Olsa2-0/+31
Add a static perf_evlist__purge() function to purge evsels from a evlist. Add also perf_evlist__for_each_entry_safe() which is used by perf_evlist__purge(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Introduce perf_evlist__exit()Jiri Olsa3-6/+14
Add the perf_evlist__exit() function, so far it's not exported and added only for internal use for perf and libperf. USe it to release cpus/threads and pollfd array. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Move the pollfd allocation from tools/perf to libperfJiri Olsa2-4/+5
It's needed in libperf only, so move it to the perf_evlist__mmap_ops() function. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Centralize map refcnt settingJiri Olsa2-30/+15
Currently when a new map is mmapped we set its refcnt to 2 in the perf_evlist_mmap_ops::mmap callback. Every mmap gets its refcnt set to 2 when it's first mmaped: - 1 for the current user, which will be taken out by a call to perf_evlist__munmap_filtered(), where we find out there's no more data comming from kernel to this mmap. - 1 for the drain code where in perf_mmap__consume() the mmap is released if it is empty. Move this common setup into libperf's generic code before the mmap callback is called. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf evlist: Switch to libperf's mmap interfaceJiri Olsa1-175/+4
Switch to the libperf mmap interface by calling directly perf_evlist__mmap_ops() and removing perf's evlist__mmap_per_* functions. By switching to libperf perf_evlist__mmap() we need to operate over 'struct perf_mmap' in evlist__add_pollfd, so make the related changes there. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf evlist: Introduce perf_evlist__mmap_cb_mmap()Jiri Olsa1-2/+13
Add the perf_evlist__mmap_cb_mmap() function to call perf specific mmap__mmap() function during perf_evlist__mmap_ops() call. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf evlist: Introduce perf_evlist__mmap_cb_get()Jiri Olsa1-0/+24
Add the perf_evlist__mmap_cb_get() function to return 'struct perf_mmap' object during perf_evlist__mmap_ops() call. The array of 'struct mmap' is allocated via evlist__alloc_mmap(), in this callback we simply returns pointer to the base object. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf tools: Introduce perf_evlist__mmap_cb_idx()Jiri Olsa1-0/+14
Add perf_evlist__mmap_cb_idx function to call auxtrace_mmap_params__set_idx() on each new index during perf_evlist__mmap_ops call. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Introduce perf_evlist_mmap_ops::mmap callbackJiri Olsa2-3/+29
Add the perf_evlist_mmap_ops::mmap callback to be called in mmap_per_evsel() to actually mmap the map. Add libperf's perf_evlist__mmap_cb_mmap() function as libperf's mmap callback. New mmaped map gets refcount set to 2 in mmap__mmap(), we follow that in mmap callback. We will move this to common place after we switch to perf_evlist__mmap(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Add perf_evlist_mmap_ops::get callbackJiri Olsa2-8/+13
Add the perf_evlist_mmap_ops::get callback to be called in mmap_per_evsel() to get/allocate the 'struct perf_mmap' object. Add the libperf's perf_evlist__mmap_cb_get() function as libperf's get callback. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Introduce perf_evlist_mmap_ops::idx callbackJiri Olsa2-5/+17
Add the perf_evlist_mmap_ops::idx callback to be called in mmap_per_cpu() and mmap_per_thread() with current cpu and thread indexes. It's used by current aux code, so perf will use this callback to set the aux index. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Introduce perf_evlist__mmap_ops()Jiri Olsa2-6/+26
To be able to pass specific callbacks to evlist's mmap. There will be a specific call to this function from perf's evlist__mmap() and libperf's perf_evlist__mmap() functions in following changes. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Jiri Olsa <[email protected]>
2019-10-10libperf: Adopt perf_evlist__mmap()/munmap() from tools/perfJiri Olsa4-0/+243
Add libperf's version of perf_evlist__mmap()/munmap() functions and exporting them in the perf/evlist.h header. It's the backbone of what we have in perf code. The following changes will add needed callbacks and then we'll finally switch the perf code to use libperf's version. Add mmap/mmap_ovw 'struct perf_mmap' object arrays to hold maps for libperf's evlist. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__read_event() from tools/perfJiri Olsa21-95/+98
Move perf_mmap__read_event() from tools/perf to libperf and export it in the perf/mmap.h header. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__read_done() from tools/perfJiri Olsa20-33/+34
Move perf_mmap__read_init() from tools/perf to libperf and export it in the perf/mmap.h header. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__read_init() from tools/perfJiri Olsa23-98/+107
Move perf_mmap__read_init() from tools/perf to libperf and export it in perf/mmap.h header. And add pr_debug2()/pr_debug3() macros support, because the code is using them. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__consume() function from tools/perfJiri Olsa23-54/+87
Move perf_mmap__consume() vrom tools/perf to libperf and export it in the perf/mmap.h header. Move also the needed helpers perf_mmap__write_tail(), perf_mmap__read_head() and perf_mmap__empty(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf tools: Use perf_mmap way to detect aux mmapJiri Olsa1-1/+3
We will move this code to libperf shortly, so we need to free it of 'struct auxtrace_mmap' usage, because it won't be available in libperf (for now). The perf_event_mmap_page::aux_size is set when the aux mmap is mapped, so the check is equivalent. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__put() function from tools/perfJiri Olsa6-32/+48
Move perf_mmap__put() from tools/perf to libperf. Once perf_mmap__put() is moved, we need a way to call application related unmap code (AIO and aux related code for eprf), when the map goes away. Add the perf_mmap::unmap callback to do that. The unmap path from perf is: perf_mmap__put (libperf) perf_mmap__munmap (libperf) map->unmap_cb -> perf_mmap__unmap_cb (perf) mmap__munmap (perf) Committer notes: Add missing linux/kernel.h to tools/perf/lib/mmap.c to get the BUG_ON definition. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__unmap() function from tools/perfJiri Olsa5-11/+17
Move perf_mmap__unmap() from tools/perf to libperf, to internal header internal/mmap.h. It will be used in the following patches. And rename the existing perf's function to mmap__munmap(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__get() function from tools/perfJiri Olsa6-8/+8
Move perf_mmap__get() from tools/perf to libperf in the internal header internal/mmap.h. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__mmap() function from tools/perfJiri Olsa5-11/+25
Move perf_mmap__mmap() from tools/perf to libperf, it will be used in the following patches. And rename the existing perf's function to mmap__mmap(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Adopt perf_mmap__mmap_len() function from tools/perfJiri Olsa5-13/+21
Move perf_mmap__mmap_len() from tools/perf wto libperf, it will be used in the following patches. And rename the existing perf's function to mmap__mmap_len(). Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Add 'struct perf_mmap_param'Jiri Olsa4-8/+18
Add libperf's version of mmap params 'struct perf_mmap_param' object with the basics: 'prot' and 'mask'. Encapsulate it in the current 'struct mmap_params' object. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10libperf: Add perf_mmap__init() functionJiri Olsa4-3/+14
Add perf_mmap__init() function to initialize 'struct perf_mmap' objects. Add it to a new mmap.c source file, that will carry all the mmap related functions. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-10perf tools: Avoid 'sample_reg_masks' being const + weakIan Rogers13-8/+46
Being const + weak breaks with some compilers that constant-propagate from the weak symbol. This behavior is outside of the specification, but in LLVM is chosen to match GCC's behavior. LLVM's implementation was set in this patch: https://github.com/llvm/llvm-project/commit/f49573d1eedcf1e44893d5a062ac1b72c8419646 A const + weak symbol is set to be weak_odr: https://llvm.org/docs/LangRef.html ODR is one definition rule, and given there is one constant definition constant-propagation is possible. It is possible to get this code to miscompile with LLVM when applying link time optimization. As compilers become more aggressive, this is likely to break in more instances. Move the definition of sample_reg_masks to the conditional part of perf_regs.h and guard usage with HAVE_PERF_REGS_SUPPORT. This avoids the weak symbol. Fix an issue when HAVE_PERF_REGS_SUPPORT isn't defined from patch v1. In v3, add perf_regs.c for architectures that HAVE_PERF_REGS_SUPPORT but don't declare sample_regs_masks. Further notes: Jiri asked: "Is this just a precaution or you actualy saw some breakage?" Ian answered: "We saw a breakage with clang with thinlto enabled for linking. Our compiler team had recently seen, and were surprised by, a similar issue and were able to dig out the weak ODR issue." Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Nick Desaulniers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: [email protected] Cc: Guo Ren <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Cc: Mao Han <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-09perf beauty: Introduce strtoul() for x86 MSRsArnaldo Carvalho de Melo3-1/+9
Continuing from the previous cset comment, now that filter expression works: # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 0.000 Timer/5033 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 0.009 Timer/5033 msr:write_msr(msr: LSTAR, val: -1398800368) 0.010 Timer/5033 msr:write_msr(msr: TSC_AUX, val: 4) 0.050 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 45.661 gnome-terminal/12595 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 45.672 gnome-terminal/12595 msr:write_msr(msr: LSTAR, val: -1398800368) 45.675 gnome-terminal/12595 msr:write_msr(msr: TSC_AUX, val: 3) 54.852 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 130.508 Timer/4050 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 130.527 Timer/4050 msr:write_msr(msr: LSTAR, val: -1398800368) 130.531 Timer/4050 msr:write_msr(msr: TSC_AUX, val: 3) 140.924 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 164.738 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 603.578 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 620.809 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 690.115 JS Watchdog/4259 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 690.136 JS Watchdog/4259 msr:write_msr(msr: LSTAR, val: -1398800368) 690.141 JS Watchdog/4259 msr:write_msr(msr: TSC_AUX, val: 3) 690.186 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 759.016 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) ^C[root@quaco ~]# Or look at the first 3 write_msr events for that IA32_TSC_DEADLINE to learn why it happens so often: # perf trace --max-events=3 --max-stack=8 -e msr:* --filter="msr==IA32_TSC_DEADLINE" --filter-pids 3750 0.000 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296732550862) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_interrupt ([kernel.kallsyms]) smp_apic_timer_interrupt ([kernel.kallsyms]) apic_timer_interrupt ([kernel.kallsyms]) cpuidle_enter_state ([kernel.kallsyms]) 32.646 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19296800134158) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_start_range_ns ([kernel.kallsyms]) tick_nohz_restart_sched_tick ([kernel.kallsyms]) tick_nohz_idle_exit ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) 32.802 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 19297507436922) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) lapic_next_deadline ([kernel.kallsyms]) clockevents_program_event ([kernel.kallsyms]) hrtimer_try_to_cancel ([kernel.kallsyms]) hrtimer_cancel ([kernel.kallsyms]) tick_nohz_restart_sched_tick ([kernel.kallsyms]) tick_nohz_idle_exit ([kernel.kallsyms]) # And if some of the strings can't be found: # trace -e msr:* --filter="msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 "SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION" not found for "msr" in "msr:read_msr", can't set filter "(msr!=SPECULATIVE_EXECUTION_PROBLEMS_SOLUTION && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 28131 && common_pid != 3750)" # Next step is to automatically wire up the pre-existing strarrays, which there are quite a few. The strtoul() methods will be further enhanced to allow for looking at other arguments in a syscall/tracepoint, just like going from integer to string (scnprintf methods), so that those "val" lines for the msr tracepoints can be properly formatted or even resolved into some string. Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-09perf trace: Expand strings in filters to integersArnaldo Carvalho de Melo1-0/+130
So that one can try things like: # perf trace -e msr:* --filter="msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL" --filter-pids 3750 That, at this point in the patchset, without any strtoul in place for tracepoint arguments, will result in: No resolver (strtoul) for "msr" in "msr:read_msr", can't set filter "(msr!=FS_BASE && msr != IA32_TSC_DEADLINE && msr != 0x830 && msr != 0x83f && msr !=IA32_SPEC_CTRL) && (common_pid != 25407 && common_pid != 3750)" # See you in the next cset! Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-09perf trace: Introduce a strtoul() method for 'struct strarrays'Arnaldo Carvalho de Melo2-0/+33
And also for 'struct strarray', since its needed to implement strarrays__strtoul(). This just traverses the entries and when finding a match, returns (offset + index), i.e. the value associated with the searched string. E.g. "EFER" (MSR_EFER) returns: # grep -w EFER -B2 /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c #define x86_64_specific_MSRs_offset 0xc0000080 static const char *x86_64_specific_MSRs[] = { [0xc0000080 - x86_64_specific_MSRs_offset] = "EFER", # 0xc0000080 This will be auto-attached to 'struct syscall_arg_fmt' entries associated with strarrays as soon as we add a ->strarray and ->strarrays to 'struct syscall_arg_fmt'. Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-09perf trace: Add a strtoul() method to 'struct syscall_arg_fmt'Arnaldo Carvalho de Melo1-1/+7
This will go from a string to a number, so that filter expressions can be constructed with strings and then, before applying the tracepoint filters (or eBPF, in the future) we can map those strings to numbers. The first one will be for 'msr' tracepoint arguments, but real quickly we will be able to reuse all strarrays for that. Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-10-09perf trace: Introduce --filter for tracepoint eventsArnaldo Carvalho de Melo2-3/+10
Similar to what is in 'perf record', works just like there: # perf trace -e msr:* 328.297 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.302 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.306 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.317 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.322 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.327 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.331 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.336 :0/0 msr:write_msr(msr: FS_BASE, val: 140240388381888) 328.340 :0/0 ^Cmsr:write_msr(msr: FS_BASE, val: 140240388381888) # So, for a system wide trace session looking at the write_msr tracepoint we see a flood of MSR_FS_BASE, we need to get the number for that: # grep FS_BASE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0xc0000100 - x86_64_specific_MSRs_offset] = "FS_BASE", # And then use it in a filter: # perf trace -e msr:* --filter="msr!=0xc0000100" <SNIP> 942.177 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068232) 942.199 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3057135655252) 942.203 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068222) 942.231 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056998373022) 942.241 :0/0 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 3056931068236) <SNIP> # Ok, lets filter that too, too noisy: # grep TSC_DEADLINE /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0x000006E0] = "IA32_TSC_DEADLINE", # # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" -a sleep 0.1 0.000 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 0.066 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 0.070 CPU 0/KVM/4895 msr:write_msr(msr: 0x830, val: 34359740667) 0.099 CPU 0/KVM/4895 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199021993472) 0.100 CPU 0/KVM/4895 msr:read_msr(msr: IA32_APICBASE, val: 4276096000) 0.101 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR) 0.109 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 1.000 :0/0 msr:write_msr(msr: 0x830, val: 17179871485) 18.893 :0/0 msr:write_msr(msr: 0x83f, val: 246) 28.810 :0/0 msr:write_msr(msr: 0x830, val: 68719479037) 40.117 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 40.127 CPU 0/KVM/4895 msr:read_msr(msr: IA32_DEBUGCTLMSR) 40.139 CPU 0/KVM/4895 msr:write_msr(msr: LSTAR, val: -2130661312) 40.141 CPU 0/KVM/4895 msr:write_msr(msr: SYSCALL_MASK, val: 14080) 40.142 CPU 0/KVM/4895 msr:write_msr(msr: TSC_AUX) 40.144 CPU 0/KVM/4895 msr:write_msr(msr: KERNEL_GS_BASE) 40.147 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL) 40.148 CPU 0/KVM/4895 msr:write_msr(msr: IA32_FLUSH_CMD, val: 1) 40.151 CPU 0/KVM/4895 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) ^C # One can combine that with filtering pids as well: # perf trace -e msr:* --filter="msr!=0xc0000100 && msr!=0x6e0" --filter-pids 4895 -a sleep 0.09 0.000 :0/0 msr:write_msr(msr: 0x830, val: 4294969597) 0.291 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) 0.294 gnome-terminal/2790 msr:write_msr(msr: LSTAR, val: -1935671280) 0.295 gnome-terminal/2790 msr:write_msr(msr: TSC_AUX, val: 6) 10.940 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 15.943 gnome-shell/2096 msr:write_msr(msr: 0x830, val: 4294969597) 16.975 :0/0 msr:write_msr(msr: 0x830, val: 4294969597) 19.560 :0/0 msr:write_msr(msr: 0x83f, val: 246) 25.162 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST) 25.807 JS Watchdog/3635 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 25.820 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 25.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 26.941 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 29.942 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 45.313 :0/0 msr:write_msr(msr: 0x83f, val: 246) 56.945 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 60.946 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 4294969597) 74.096 JS Watchdog/8971 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) 74.130 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL) 79.673 :0/0 msr:write_msr(msr: 0x83f, val: 246) 79.947 gnome-terminal/2790 msr:write_msr(msr: 0x830, val: 17179871485) # Or for just a pid, with callchains: # grep SYSCALL_MAS /tmp/build/perf/trace/beauty/generated/x86_arch_MSRs_array.c [0xc0000084 - x86_64_specific_MSRs_offset] = "SYSCALL_MASK", # perf trace -e msr:* --filter="msr==0xc0000084" --pid 2790 --call-graph=dwarf 0.000 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) 9299.073 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) 9348.374 gnome-terminal/2790 msr:write_msr(msr: SYSCALL_MASK, val: 292608) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) kvm_on_user_return ([kvm]) fire_user_return_notifiers ([kernel.kallsyms]) exit_to_usermode_loop ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64 ([kernel.kallsyms]) __GI___poll (inlined) <SNIP> # Ok, just another form of KVM to emit MSRs :-) Next step: elliminate those greps by getting the filter expression, looking for arg names, then for the arrays associated with it to do a reverse lookup. Also allow those filters to be associated with strace-like syscall names. After that: augment the 'val' arg for 'msr:write_msr' based on the first arg, 'msr'. Then, do that with eBPF too, not just with tracepoint filters. Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Luis Cláudio Gonçalves <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>