path: root/tools/perf
Age  Commit message  Author  Files  Lines
2013-10-14perf symbols: Add map_groups__find_ams()Arnaldo Carvalho de Melo2-0/+21
Add a function to find a symbol using an ip that might be on a different map. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf symbols: Workaround objdump difficulties with kcoreAdrian Hunter4-0/+273
The objdump tool fails to annotate module symbols when looking at kcore. Workaround this by extracting object code from kcore and putting it in a temporary file for objdump to use instead. The temporary file is created to look like kcore but contains only the function being disassembled. Signed-off-by: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Renamed 'index' to 'idx' to avoid shadowing string.h's 'index' in Fedora 12, Replace local with variable length with malloc/free to fix build in Fedora 12 ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf symbols: Validate kcore module addressesAdrian Hunter1-21/+175
Before using kcore we need to check that modules are in memory at the same addresses that they were when data was recorded. This is done because, while we could remap symbols to different addresses, the object code linkages would still be different which would provide an erroneous view of the object code. Signed-off-by: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Rename basename to base_name to avoid shadowing libgen's basename in fedora 12 ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf tools: Separate lbfd check out of NO_DEMANGLE conditionJiri Olsa1-7/+9
The build fails with NO_DEMANGLE because of missing -lbfd externals. The reason is that we now use bfd code in the srcline object, introduced by: perf tools: Implement addr2line directly using libbfd. So we now always need to check for and add -lbfd. Signed-off-by: Jiri Olsa <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf tests: Fix memory leak in dso-data.cFelipe Pena1-0/+1
Fix for a memory leak on test_file() function in dso-data.c. Signed-off-by: Felipe Pena <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf tools: Fix old GCC build error in 'get_srcline'David Ahern1-2/+2
trace-event-parse.c:parse_proc_kallsyms() Old GCC (4.4.2) does not see through the code flow of get_srcline() and gets confused about the status of 'file' and 'line': CC /tmp/build/perf/util/srcline.o cc1: warnings being treated as errors util/srcline.c: In function 'get_srcline': util/srcline.c:226: error: 'file' may be used uninitialized in this function util/srcline.c:227: error: 'line' may be used uninitialized in this function make[1]: *** [/tmp/build/perf/util/srcline.o] Error 1 make: *** [install] Error 2 make: Leaving directory `/home/acme/git/linux/tools/perf' [acme@fedora12 linux]$ Help out GCC by initializing 'file' and 'line'. Signed-off-by: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf trace: Add summary option to dump syscall statisticsDavid Ahern2-12/+102
When enabled dumps a summary of all syscalls by task with the usual statistics -- min, max, average and relative stddev. For example, make - 26341 : 3344 [ 17.4% ] 0.000 ms read : 52 0.000 4.802 0.644 30.08 write : 20 0.004 0.036 0.010 21.72 open : 24 0.003 0.046 0.014 23.68 close : 64 0.002 0.055 0.008 22.53 stat : 2714 0.002 0.222 0.004 4.47 fstat : 18 0.001 0.041 0.006 46.26 mmap : 30 0.003 0.009 0.006 5.71 mprotect : 8 0.006 0.039 0.016 32.16 munmap : 12 0.007 0.077 0.020 38.25 brk : 48 0.002 0.014 0.004 10.18 rt_sigaction : 18 0.002 0.002 0.002 2.11 rt_sigprocmask : 60 0.002 0.128 0.010 32.88 access : 2 0.006 0.006 0.006 0.00 pipe : 12 0.004 0.048 0.013 35.98 vfork : 34 0.448 0.980 0.692 3.04 execve : 20 0.000 0.387 0.046 56.66 wait4 : 34 0.017 9923.287 593.221 68.45 fcntl : 8 0.001 0.041 0.013 48.79 getdents : 48 0.002 0.079 0.013 19.62 getcwd : 2 0.005 0.005 0.005 0.00 chdir : 2 0.070 0.070 0.070 0.00 getrlimit : 2 0.045 0.045 0.045 0.00 arch_prctl : 2 0.002 0.002 0.002 0.00 setrlimit : 2 0.002 0.002 0.002 0.00 openat : 94 0.003 0.005 0.003 2.11 Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-14perf util: Add findnew method to intlistDavid Ahern4-7/+44
Similar to other findnew based methods if the requested object is not found, add it to the list. v2: followed format of other findnew methods per acme's request Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
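The find-or-add behaviour described above can be sketched in C as follows. This is a minimal illustration only: `int_list`, `int_node`, `int_list__find()` and `int_list__findnew()` are hypothetical names, and a plain singly linked list stands in for whatever container the real intlist uses.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an intlist entry; the real structure may differ. */
struct int_node {
    int i;
    struct int_node *next;
};

struct int_list {
    struct int_node *head;
    unsigned int nr_entries;
};

/* Return the node holding 'i', or NULL if it is not in the list. */
static struct int_node *int_list__find(struct int_list *list, int i)
{
    struct int_node *pos;

    for (pos = list->head; pos != NULL; pos = pos->next)
        if (pos->i == i)
            return pos;
    return NULL;
}

/*
 * Return the node holding 'i', adding a new node first if none exists.
 * Returns NULL only when the allocation fails.
 */
static struct int_node *int_list__findnew(struct int_list *list, int i)
{
    struct int_node *node = int_list__find(list, i);

    if (node != NULL)
        return node;

    node = calloc(1, sizeof(*node));
    if (node == NULL)
        return NULL;

    node->i = i;
    node->next = list->head;
    list->head = node;
    list->nr_entries++;
    return node;
}
```

Calling `int_list__findnew()` twice with the same value returns the same node, so callers do not need a separate "does it exist yet" check.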
2013-10-14perf trace: Improve the error messagesRamkumar Ramachandra1-9/+28
Currently, execution of 'perf trace' reports the following cryptic message to the user: $ perf trace Couldn't read the raw_syscalls tracepoints information! Typically this happens because the user does not have permissions to read the debugfs filesystem. Also handle the case when the kernel was not compiled with debugfs support or when it isn't mounted. Now, the tool prints detailed error messages: $ perf trace Error: Unable to find debugfs Hint: Was your kernel was compiled with debugfs support? Hint: Is the debugfs filesystem mounted? Hint: Try 'sudo mount -t debugfs nodev /sys/kernel/debug' $ perf trace Error: No permissions to read /sys/kernel/debug//tracing/events/raw_syscalls Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/' Signed-off-by: Ramkumar Ramachandra <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Added ready to use commands to fix the issues as extra hints, use the current debugfs mount point when reporting permission error, use strerror_r instead of the deprecated sys_errlist, as reported by David Ahern ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf timechart: Add example in the documentationRamkumar Ramachandra1-1/+14
While at it, update the synopsis to show both forms. Signed-off-by: Ramkumar Ramachandra <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf tools: Implement summary output for 'make install'Ingo Molnar3-36/+48
'make install' used to show all the install lines, which is way too verbose to be really informative to the user. Implement summary output instead: comet:~/tip/tools/perf> make install BUILD: Doing 'make -j12' parallel build SUBDIR Documentation INSTALL Documentation-man INSTALL binaries INSTALL libexec INSTALL perf-archive INSTALL perl-scripts INSTALL python-scripts INSTALL bash_completion-script INSTALL tests 'make install V=1' will still show the old, detailed output. Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Fixed conflict with libperf-gtk patches in acme/perf/core, cope with 'trace' alias ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf tools: Align perf version output to other build messagesIngo Molnar1-1/+1
Before: CC util/pmu.o CC util/parse-events.o PERF_VERSION = 3.12.rc4.g1b30c CC util/parse-events-flex.o GEN perf-archive After: CC util/pmu.o CC util/parse-events.o PERF_VERSION = 3.12.rc4.g1b30c CC util/parse-events-flex.o GEN perf-archive Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11tools: Harmonize the various build messages in perf, lib-traceevent, lib-lkIngo Molnar3-10/+11
The various build lines from libtraceevent and perf mix up during a parallel build and produce unaligned output like: CC builtin-buildid-list.o CC builtin-buildid-cache.o CC builtin-list.o CC FPIC trace-seq.o CC builtin-record.o CC FPIC parse-filter.o CC builtin-report.o CC builtin-stat.o CC FPIC parse-utils.o CC FPIC kbuffer-parse.o CC builtin-timechart.o CC builtin-top.o CC builtin-script.o BUILD STATIC LIB libtraceevent.a CC builtin-probe.o CC builtin-kmem.o CC builtin-lock.o To solve this, harmonize all the build message alignments to be similar to the kernel's kbuild output: prefixed by two spaces and 11-char wide. After the patch the output looks pretty tidy, even if output lines get mixed up: CC builtin-annotate.o FLAGS: * new build flags or cross compiler CC builtin-bench.o AR liblk.a CC bench/sched-messaging.o CC FPIC event-parse.o CC bench/sched-pipe.o CC FPIC trace-seq.o CC bench/mem-memcpy.o CC bench/mem-memset.o CC FPIC parse-filter.o CC builtin-diff.o CC builtin-evlist.o CC builtin-help.o Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf tools: Implement summary output for 'make clean'Ingo Molnar3-20/+28
'make clean' used to show all the rm lines, which isn't really informative in any way and spams the console. Implement summary output: comet:~/tip/tools/perf> make clean CLEAN libtraceevent CLEAN liblk CLEAN config CLEAN core-objs CLEAN core-progs CLEAN core-gen CLEAN Documentation CLEAN python 'make clean V=1' will still show the old, detailed output. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf tools: Fix redirection printoutsIngo Molnar1-3/+3
Fix the duplicate util/util printout Arnaldo reported: $ make V=1 O=/tmp/build/perf -C tools/perf/ util/srcline.o ... # Redirected target util/srcline.o => /tmp/build/perf/util/util/srcline.o Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Initial beautifier for ioctl's 'cmd' argArnaldo Carvalho de Melo1-1/+33
[root@zoo linux]# trace -e ioctl | grep -v "cmd: 0x" | head -10 0.386 ( 0.001 ms): trace/1602 ioctl(fd: 1<pipe:[127057]>, cmd: TCGETS, arg: 0x7fff59fcb4d0 ) = -1 ENOTTY Inappropriate ioctl for device 1459.368 ( 0.002 ms): inotify_reader/10352 ioctl(fd: 18<anon_inode:inotify>, cmd: FIONREAD, arg: 0x7fb835228bcc ) = 0 1463.586 ( 0.002 ms): inotify_reader/10352 ioctl(fd: 18<anon_inode:inotify>, cmd: FIONREAD, arg: 0x7fb835228bcc ) = 0 1463.611 ( 0.002 ms): inotify_reader/10352 ioctl(fd: 18<anon_inode:inotify>, cmd: FIONREAD, arg: 0x7fb835228bcc ) = 0 3740.526 ( 0.002 ms): awk/1612 ioctl(fd: 1<pipe:[128265]>, cmd: TCGETS, arg: 0x7fff4d166b90 ) = -1 ENOTTY Inappropriate ioctl for device 3740.704 ( 0.001 ms): awk/1612 ioctl(fd: 3</proc/meminfo>, cmd: TCGETS, arg: 0x7fff4d1669a0 ) = -1 ENOTTY Inappropriate ioctl for device 3742.550 ( 0.002 ms): ps/1614 ioctl(fd: 1<pipe:[128266]>, cmd: TIOCGWINSZ, arg: 0x7fff591762b0 ) = -1 ENOTTY Inappropriate ioctl for device 3742.555 ( 0.003 ms): ps/1614 ioctl(fd: 2<socket:[19550]>, cmd: TIOCGWINSZ, arg: 0x7fff591762b0 ) = -1 ENOTTY Inappropriate ioctl for device 3742.558 ( 0.002 ms): ps/1614 ioctl(cmd: TIOCGWINSZ, arg: 0x7fff591762b0 ) = -1 ENOTTY Inappropriate ioctl for device 3742.572 ( 0.002 ms): ps/1614 ioctl(fd: 1<pipe:[128266]>, cmd: TCGETS, arg: 0x7fff59176220 ) = -1 ENOTTY Inappropriate ioctl for device [root@zoo linux]# Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Prepare the strarray scnprintf method for reuseArnaldo Carvalho de Melo1-3/+10
Right now, when an index passed to that method has no associated string, it prints the index as a decimal number. Prepare it so that we can use it to print the index in hex as well, for ioctls, for instance. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Allow specifying index offset in strarraysArnaldo Carvalho de Melo1-4/+11
So that the index passed doesn't have to start at zero: an offset specified when declaring the strarray is subtracted from it before it is used as the real array index. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
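A rough sketch of a string table with an index offset and a numeric fallback, in the spirit of the two strarray changes above. The type `str_table` and the function `str_table__format()` are made-up names for illustration and are not the actual perf trace code.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table type; the real strarray in perf trace may differ. */
struct str_table {
    int offset;                  /* value that maps to entries[0] */
    size_t nr_entries;
    const char *const *entries;
};

/*
 * Format 'val' into 'buf'. If the table has a string for it, print the
 * string; otherwise fall back to the number, in hex when 'hex' is non-zero
 * and in decimal otherwise. Returns what snprintf() returns.
 */
static int str_table__format(const struct str_table *t, char *buf, size_t size,
                             int val, int hex)
{
    long long idx = (long long)val - t->offset;

    if (idx >= 0 && (size_t)idx < t->nr_entries && t->entries[idx] != NULL)
        return snprintf(buf, size, "%s", t->entries[idx]);

    if (hex)
        return snprintf(buf, size, "%#x", (unsigned int)val);
    return snprintf(buf, size, "%d", val);
}
```

With an offset of 1, the value 1 selects the first entry, and any value outside the table is printed as a plain number instead of indexing out of bounds.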
2013-10-11perf symbols: Make a separate function to parse /proc/modulesAdrian Hunter3-49/+79
Make a separate function to parse /proc/modules so that it can be reused. Signed-off-by: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf intlist: Add priv memberDavid Ahern2-0/+2
Allows commands to leverage intlist infrastructure for opaque structures. For example an upcoming perf-trace change will use this as a means of tracking syscalls statistics by task. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Use new machine method to loop over threadsDavid Ahern1-28/+48
Use the new machine method that loops over threads to dump summary data. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf machine: Add method to loop over threads and invoke handlerDavid Ahern2-0/+27
Loop over all threads within a machine, including threads moved to the dead threads list, and invoke a function. This allows commands to run some specific function on each thread (e.g., dump statistics) yet hides how the threads are maintained within the machine. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
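The iterate-and-invoke pattern described above could look roughly like this. Everything here is an assumption made for illustration: `thread_stub`, `machine_stub`, `machine_stub__for_each_thread()` and `count_thread()` are invented names, and simple linked lists stand in for however the machine really stores live and dead threads.

```c
#include <stddef.h>

/* Hypothetical, simplified thread and machine types. */
struct thread_stub {
    int tid;
    struct thread_stub *next;
};

struct machine_stub {
    struct thread_stub *threads;       /* live threads */
    struct thread_stub *dead_threads;  /* threads that already exited */
};

typedef int (*thread_fn)(struct thread_stub *thread, void *priv);

/*
 * Call 'fn' on every live and dead thread. Stops early and returns the
 * handler's value as soon as a handler returns non-zero; returns 0 otherwise.
 */
static int machine_stub__for_each_thread(struct machine_stub *machine,
                                         thread_fn fn, void *priv)
{
    struct thread_stub *lists[2];
    struct thread_stub *pos;
    int i, rc;

    lists[0] = machine->threads;
    lists[1] = machine->dead_threads;

    for (i = 0; i < 2; i++) {
        for (pos = lists[i]; pos != NULL; pos = pos->next) {
            rc = fn(pos, priv);
            if (rc != 0)
                return rc;
        }
    }
    return 0;
}

/* Example handler: count threads; 'priv' points to an int counter. */
static int count_thread(struct thread_stub *thread, void *priv)
{
    (void)thread;
    (*(int *)priv)++;
    return 0;
}
```

The caller only supplies a handler and a private pointer, so the way threads are stored can change without touching the commands that walk them.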
2013-10-11perf trace: Add record optionDavid Ahern2-3/+41
The record option is a convenience alias to include the -e raw_syscalls:* argument to perf-record. All other options are passed to perf-record's handler. The resulting data file can be analyzed by perf-trace -i. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Fix comm resolution when reading events from fileDavid Ahern3-12/+31
Task comm's are getting lost when processing events from a file. The problem is that the trace struct used by the live processing has its host machine and the perf-session used for file based processing has its host machine. Fix by having both references point to the same machine. Before: 0.030 ( 0.001 ms): :27743/27743 brk( ... 0.057 ( 0.004 ms): :27743/27743 mmap(len: 4096, prot: READ|WRITE, flags: ... 0.075 ( 0.006 ms): :27743/27743 access(filename: 0x7f3809fbce00, mode: R ... 0.091 ( 0.005 ms): :27743/27743 open(filename: 0x7f3809fba14c, flags: CLOEXEC ... ... After: 0.030 ( 0.001 ms): make/27743 brk( ... 0.057 ( 0.004 ms): make/27743 mmap(len: 4096, prot: READ|WRITE, flags: ... 0.075 ( 0.006 ms): make/27743 access(filename: 0x7f3809fbce00, mode: R ... 0.091 ( 0.005 ms): make/27743 open(filename: 0x7f3809fba14c, flags: CLOEXEC ... ... Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Moved creation of new host machine to a separate constructor: machine__new_host() ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf stat: Add units to nanosec-based countersDavid Ahern1-1/+4
Ingo pointed out that the task-clock counter should have the units explicitly stated since it is not a counter. Before: perf stat -a -- sleep 1 Performance counter stats for 'sleep 1': 16186.874834 task-clock # 16.154 CPUs utilized ... After: perf stat -a -- sleep 1 Performance counter stats for 'system wide': 16146.402138 task-clock (msec) # 16.125 CPUs utilized ... Reported-by: Ingo Molnar <[email protected]> Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf stat: Don't require a workload when using system wide or CPU optionsDavid Ahern1-1/+2
The "perf stat" command can collect counters system wide or on one or more CPUs. Do not require a workload to be specified for these options. v2: use perf_target__none per Namhyung's comment. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf stat: Fix misleading message when specifying cpu list or system wideDavid Ahern1-1/+5
The "perf stat" tool displays the command run in its summary output which is misleading when using a cpu list or system wide collection. Before: perf stat -a -- sleep 1 Performance counter stats for 'sleep 1': 16152.670249 task-clock # 16.132 CPUs utilized 417 context-switches # 0.002 M/sec 7 cpu-migrations # 0.030 K/sec ... After: perf stat -a -- sleep 1 Performance counter stats for 'system wide': 16206.931120 task-clock # 16.144 CPUs utilized 395 context-switches # 0.002 M/sec 5 cpu-migrations # 0.030 K/sec ... or perf stat -C1 -- sleep 1 Performance counter stats for 'CPU(s) 1': 1001.669257 task-clock # 1.000 CPUs utilized 4,264 context-switches # 0.004 M/sec 3 cpu-migrations # 0.003 K/sec ... Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf evlist: Fix perf_evlist__mmap_read event overflowJiri Olsa4-3/+9
perf_evlist__mmap_read() used 'union perf_event' as a placeholder for an event crossing the mmap boundary. This is fine for samples shorter than ~PATH_MAX, but a sample can grow up to the maximum sample size, which is bounded by a 16-bit size. I hit this overflow when using 'perf top -G dwarf', which produces samples of around 8192 bytes. Any valid sample size can be configured here using '-G dwarf,size'. Use an array of the maximum sample size as the event placeholder instead. Also add another safety check for the dynamic size of the user stack. TODO: The 'struct perf_mmap' is quite big now, maybe we could use some lazy allocation for event_copy size. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
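To illustrate why the placeholder must be large enough for the biggest possible event, here is a small sketch of reading a record that may wrap around the end of a ring buffer. The function `ring_read()` and its parameters are hypothetical and much simpler than the real mmap reading code; the point is only the bounds check on the copy buffer before the two-part copy.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to 'size' contiguous bytes of the record that starts at
 * 'offset' in a ring buffer of 'ring_size' bytes. If the record wraps past
 * the end of the ring, it is stitched together in 'copy' (capacity
 * 'copy_size'). Returns NULL if the record cannot fit.
 */
static const unsigned char *ring_read(const unsigned char *ring, size_t ring_size,
                                      size_t offset, size_t size,
                                      unsigned char *copy, size_t copy_size)
{
    size_t start, first;

    if (ring_size == 0 || size > ring_size)
        return NULL;

    start = offset % ring_size;
    if (start + size <= ring_size)
        return ring + start;       /* contiguous: no copy needed */

    if (size > copy_size)
        return NULL;               /* would overflow the placeholder */

    first = ring_size - start;     /* bytes available before the wrap */
    memcpy(copy, ring + start, first);
    memcpy(copy + first, ring, size - first);
    return copy;
}
```

Without the `size > copy_size` check, a wrapped record larger than the placeholder would write past its end, which is the class of overflow the commit message describes.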
2013-10-11perf bench: Fix failing assertions in numa benchPetr Holasek1-13/+21
Handle the -C and -N parameters more gracefully in the parse_{cpu,node}_setup_list() functions when there aren't enough NUMA nodes or CPUs present. Instead of hitting an assertion and terminating the benchmark, the partial test is skipped with an error message and perf continues to the next one. The problem can be easily reproduced on a machine with only one NUMA node: # Running numa/mem benchmark... # Running main, "perf bench numa mem -a" ... # Running RAM-bw-remote, "perf bench numa mem -p 1 -t 1 -P 1024 -C 0 -M 1 -s perf: bench/numa.c:622: parse_setup_node_list: Assertion `!(bind_node_0 < 0 || bind_node_0 >= g->p.nr_nodes)' failed. Aborted Signed-off-by: Petr Holasek <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Petr Benas <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Petr Benas <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
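The change from asserting to skipping can be pictured with a check like the one below. The function name `check_bind_node()` and the message text are invented for this sketch; only the range condition is taken from the assertion quoted above.

```c
#include <stdio.h>

/*
 * Validate a requested NUMA node against the number of nodes present.
 * Returns 0 if the node is usable and -1 if the test should be skipped.
 */
static int check_bind_node(int bind_node, int nr_nodes)
{
    if (bind_node < 0 || bind_node >= nr_nodes) {
        fprintf(stderr,
                "node %d is out of range (%d node(s) present), skipping test\n",
                bind_node, nr_nodes);
        return -1;
    }
    return 0;
}
```

A caller can then move on to the next benchmark when the function returns -1 instead of aborting the whole run.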
2013-10-11perf tools: Ignore 'perf timechart' output fileRamkumar Ramachandra1-0/+1
The default output file produced by the 'perf timechart' tool is called output.svg, add it to .gitignore. Signed-off-by: Ramkumar Ramachandra <[email protected]> Cc: Ingo Molnar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf stat: Don't print bogus data on -e instructionsRamkumar Ramachandra1-4/+3
When only the instructions event is requested: $ perf stat -e instructions git s M builtin-stat.c Performance counter stats for 'git s': 917,453,420 instructions # 0.00 insns per cycle 0.213002926 seconds time elapsed The 0.00 insns per cycle comment in the output is totally bogus and misleading. It happens because update_shadow_stats() doesn't touch runtime_cycles_stats when only the instructions event is requested. So, omit printing the bogus data altogether. Signed-off-by: Ramkumar Ramachandra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf stat: Don't print bogus data on -e cyclesRamkumar Ramachandra1-4/+4
When only the cycles event is requested: $ perf stat -e cycles dd if=/dev/zero of=/dev/null count=1000000 1000000+0 records in 1000000+0 records out 512000000 bytes (512 MB) copied, 0.26123 s, 2.0 GB/s Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000': 911,626,453 cycles # 0.000 GHz 0.262113350 seconds time elapsed The 0.000 GHz comment in the output is totally bogus and misleading. It happens because update_shadow_stats() doesn't touch runtime_nsecs_stats; it is only written when a requested counter matches a SW_TASK_CLOCK. In our case, since we have only requested HW_CPU_CYCLES, runtime_nsecs_stats is unavailable. So, omit printing the comment altogether. Signed-off-by: Ramkumar Ramachandra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
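The idea of omitting a derived value when its denominator was never measured can be sketched as below. `format_ghz()` is a hypothetical helper, not the perf stat code; it relies only on the fact that cycles divided by nanoseconds gives GHz.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write a "# x.xxx GHz" comment into 'buf' only when a runtime in
 * nanoseconds is available. With no runtime, 'buf' is left empty and 0 is
 * returned, so nothing misleading gets printed.
 */
static int format_ghz(char *buf, size_t size, double cycles, double runtime_nsecs)
{
    if (runtime_nsecs <= 0.0) {
        if (size > 0)
            buf[0] = '\0';
        return 0;
    }
    /* cycles per nanosecond is the same as GHz */
    return snprintf(buf, size, "# %8.3f GHz", cycles / runtime_nsecs);
}
```

The same guard applies to the instructions-only case in the previous commit: without a cycles count there is no meaningful "insns per cycle" to show.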
2013-10-11perf tools: Move start conditions to start of the flex fileJiri Olsa1-31/+32
Move the start conditions to the start of the flex file so it's clear what the INITIAL condition rules are, and add a default rule for the INITIAL condition. This prevents the default space from being printed for events like: $ ./perf stat -e "cycles " kill 2>/dev/null $ ^^^^^^^^ Signed-off-by: Jiri Olsa <[email protected]> Cc: Corey Ashford <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf tools: Add missing -ldl for gtk buildJiri Olsa1-0/+1
If we build perf with NO_LIBPYTHON=1 NO_LIBPERL=1, '-ldl' is not added to the libs and the build fails if the gtk2 code is included, because it depends on it. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Corey Ashford <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf machine: Use snprintf instead of sprintfAdrian Hunter1-3/+3
To avoid buffer overruns. Signed-off-by: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Split from aa7fe3b ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
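As a generic illustration of the sprintf-to-snprintf change (not the actual code touched by this commit), the helper below builds a path into a fixed-size buffer and reports truncation instead of overrunning it. `build_path()` is a made-up name.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Join 'dir' and 'name' into 'buf'. snprintf() never writes more than
 * 'size' bytes, and its return value tells us whether the result fit.
 * Returns 0 on success and -1 on error or truncation.
 */
static int build_path(char *buf, size_t size, const char *dir, const char *name)
{
    int n = snprintf(buf, size, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

With sprintf() the same call would silently write past the end of a too-small buffer; here the output is cut off at `size - 1` characters plus the terminating NUL, and the caller is told about it.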
2013-10-11perf bench sched: Add --threaded optionIngo Molnar1-29/+86
Allow the measurement of thread versus process context switch performance. The default stays at 'process' based measurement, like lmbench's lat_ctx benchmark. Sample output: comet:~/tip/tools/perf> taskset 1 ./perf bench sched pipe # Running sched/pipe benchmark... # Executed 1000000 pipe operations between two processes Total time: 4.138 [sec] 4.138729 usecs/op 241620 ops/sec comet:~/tip/tools/perf> taskset 1 ./perf bench sched pipe --threaded # Running sched/pipe benchmark... # Executed 1000000 pipe operations between two threads Total time: 3.667 [sec] 3.667667 usecs/op 272652 ops/sec Signed-off-by: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-11perf trace: Add 'trace' alias to 'perf trace'Arnaldo Carvalho de Melo2-1/+10
Make 'perf trace' more accessible by aliasing it to just 'trace': [root@zoo linux]# trace --duration 15 -a -e futex sleep 1 110.092 (16.188 ms): libvirtd/1166 futex(uaddr: 0x185b344, op: WAIT|PRIV, val: 174293 ) = 0 110.101 (15.903 ms): libvirtd/1171 futex(uaddr: 0x185b3dc, op: WAIT|PRIV, val: 139265 ) = 0 111.594 (15.776 ms): libvirtd/1165 futex(uaddr: 0x185b344, op: WAIT|PRIV, val: 174295 ) = 0 111.610 (15.969 ms): libvirtd/1169 futex(uaddr: 0x185b3dc, op: WAIT|PRIV, val: 139267 ) = 0 113.556 (16.216 ms): libvirtd/1168 futex(uaddr: 0x185b3dc, op: WAIT|PRIV, val: 139269 ) = 0 291.265 (199.508 ms): chromium-brows/15830 futex(uaddr: 0x7fff2986bcb4, op: WAIT_BITSET|PRIV|CLKRT, val: 1, utime: 0x7fff2986bab0, val3: 4294967295) = -1 ETIMEDOUT Connection timed out 360.354 (69.053 ms): chromium-brows/15830 futex(uaddr: 0x7fff2986bcb4, op: WAIT_BITSET|PRIV|CLKRT, val: 1, utime: 0x7fff2986bab0, val3: 4294967295) = -1 ETIMEDOUT Connection timed out [root@zoo linux]# I.e. looking for futex calls that take at least 15ms, system wide, during a one second window. Now to get callchains into 'trace' to figure out what are those locks :-) Requested-by: Ingo Molnar <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Show path associated with fd in live sessionsArnaldo Carvalho de Melo1-16/+237
For live sessions we can just access /proc to map an fd to its path on a best-effort basis, i.e. sometimes the fd will have gone away by the time we try to do the mapping. The mapping is done lazily: the path is looked up in /proc only when a reference to such an fd is made. This is disabled when processing perf.data files, where we will need a way to get getname events, be it via an on-the-fly 'perf probe' event or after a vfs_getname tracepoint is added to the kernel. A first step will be to synthesize such events for the use cases where the threads in the monitored workload already exist. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
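The lazy, best-effort lookup described in this commit could look roughly like the sketch below. It is written in Python rather than perf's C, and the `FdPathCache` class is a hypothetical name. It assumes a Linux-style /proc where `/proc/<pid>/fd/<fd>` is a symlink to the open file.

```python
import os


class FdPathCache:
    """Resolve fd -> path through /proc, on first reference only."""

    def __init__(self, pid):
        self.pid = pid
        self._paths = {}  # fd -> path, filled in lazily

    def path(self, fd):
        """Return the path for fd, or None if it cannot be resolved."""
        if fd not in self._paths:
            try:
                # /proc/<pid>/fd/<fd> is a symlink to the open file on Linux.
                self._paths[fd] = os.readlink(f"/proc/{self.pid}/fd/{fd}")
            except OSError:
                # Best effort: the fd may already be gone, or /proc may be
                # unavailable, so report "unknown" instead of failing.
                return None
        return self._paths[fd]


cache = FdPathCache(os.getpid())
print(cache.path(0))  # typically a tty or pipe path on Linux, else None
```

A failed lookup is not cached in this sketch, so a later reference to the same fd number gets another chance to resolve.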
2013-10-09perf trace: Beautify mlock & friends 'addr' argArnaldo Carvalho de Melo1-0/+6
Print it as a hex number. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Handle MSG_WAITFORONE not definedDavid Ahern1-0/+3
Needed for compiles on Fedora 12 for example. Signed-off-by: David Ahern <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Add beautifier for clock_gettime's clk_id argumentDavid Ahern1-0/+7
Before:

  0.030 ( 0.002 ms): 2571 clock_gettime(which_clock: 1, tp: 0x7f3b45729cd0 ) = 0

After:

  0.030 ( 0.002 ms): 2571 clock_gettime(which_clock: MONOTONIC, tp: 0x7f3b45729cd0 ) = 0

v2: Update to use the STRARRAY option

Signed-off-by: David Ahern <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
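The idea behind a string-array beautifier is a table indexed by the raw argument value. The sketch below is in Python, not perf's C; `strarray_format` is a hypothetical name. The table lists the first Linux clock ids with the CLOCK_ prefix dropped.

```python
# Linux clock ids 0..6, with the CLOCK_ prefix dropped.
CLOCKIDS = [
    "REALTIME",            # 0
    "MONOTONIC",           # 1
    "PROCESS_CPUTIME_ID",  # 2
    "THREAD_CPUTIME_ID",   # 3
    "MONOTONIC_RAW",       # 4
    "REALTIME_COARSE",     # 5
    "MONOTONIC_COARSE",    # 6
]


def strarray_format(value, names):
    """Return the symbolic name for value, or the number if it is unknown."""
    if 0 <= value < len(names):
        return names[value]
    return str(value)


print(f"clock_gettime(which_clock: {strarray_format(1, CLOCKIDS)})")
```

Falling back to the number for values outside the table keeps the output useful when a newer kernel defines ids the table does not know about.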
2013-10-09perf trace: Beautify pipe2 'flags' argArnaldo Carvalho de Melo1-0/+25
4.234 (0.005 ms): fetchmail/3224 pipe2(fildes: 0x7fffc72bcee0, flags: CLOEXEC) = 0 Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Use socket's beautifiers in socketpairArnaldo Carvalho de Melo1-0/+4
For the address family and socket type. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Don't supress zeroed args when there is an strarray entry for itArnaldo Carvalho de Melo1-2/+9
Case in point:

  9.682 ( 0.001 ms): Xorg/13079 setitimer(which: REAL, value: 0x7fffede42470) = 0

ITIMER_REAL is zero.

Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
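The rule this commit introduces can be sketched as follows: zero-valued arguments are normally left out of the output, but an argument that has a string table is always printed, because zero may be a meaningful name. This is a Python illustration, not perf's C code. `format_args` is a hypothetical helper, and the third argument name (`ovalue`) is assumed for the example.

```python
ITIMERS = ["REAL", "VIRTUAL", "PROF"]  # ITIMER_REAL is 0


def format_args(args, strarrays):
    """Format (name, value) pairs, hiding zeros that have no string table."""
    parts = []
    for name, value in args:
        names = strarrays.get(name)
        if value == 0 and names is None:
            continue  # plain zeroed argument: suppress it
        if names is not None and 0 <= value < len(names):
            parts.append(f"{name}: {names[value]}")
        else:
            parts.append(f"{name}: {value:#x}")
    return ", ".join(parts)


args = [("which", 0), ("value", 0x7FFFEDE42470), ("ovalue", 0)]
print(f"setitimer({format_args(args, {'which': ITIMERS})})")
```

With the table registered for `which`, the zero is rendered as REAL while the zeroed `ovalue` is still dropped.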
2013-10-09perf trace: Add helper for syscalls with a single strarray argArnaldo Carvalho de Melo1-27/+13
In such cases just stating the (arg, name, array) is enough, reducing the size of the syscall formatters table. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf trace: Beautify flock 'cmd' argArnaldo Carvalho de Melo1-0/+33
  4735.638 ( 0.003 ms): man/19881 flock(fd: 3, cmd: SH|NB) = 0
  4735.832 ( 0.002 ms): man/19881 flock(fd: 3, cmd: UN ) = 0

Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
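Output such as `SH|NB` comes from decoding a bit mask one flag at a time. A minimal Python sketch of that decoding, using the Linux values of the LOCK_* constants, is shown below. `format_flags` is a hypothetical name, and this is not perf's C implementation.

```python
# Linux flock() operation bits, with the LOCK_ prefix dropped.
LOCK_FLAGS = [(1, "SH"), (2, "EX"), (4, "NB"), (8, "UN")]


def format_flags(value, flags):
    """Join the names of all set bits with '|'; leftover bits print as hex."""
    parts = []
    for bit, name in flags:
        if value & bit:
            parts.append(name)
            value &= ~bit  # clear the bit so that leftovers can be detected
    if value:
        parts.append(f"{value:#x}")
    return "|".join(parts)


print(f"flock(fd: 3, cmd: {format_flags(1 | 4, LOCK_FLAGS)})")
print(f"flock(fd: 3, cmd: {format_flags(8, LOCK_FLAGS)})")
```

Printing unknown leftover bits in hex avoids silently hiding flags that the table does not cover.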
2013-10-09perf trace: Beautify epoll_ctl 'op' argArnaldo Carvalho de Melo1-0/+6
  [root@sandy ~]# perf trace -e epoll_ctl
  2.490 (0.003 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: ADD, fd: 24, event: 0x7fff22314ef0) = 0
  2.621 (0.003 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: DEL, fd: 24 ) = 0
  2.833 (0.010 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: ADD, fd: 24, event: 0x7fff22314cd0) = 0
  2.953 (0.002 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: DEL, fd: 24 ) = 0
  3.118 (0.002 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: ADD, fd: 24, event: 0x7fff22314d20) = 0
  4.762 (0.002 ms): systemd-logind/586 epoll_ctl(epfd: 10, op: DEL, fd: 24 ) = 0
  ^C[root@sandy ~]#

Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf tools: Fix srcline sort key behaviorNamhyung Kim1-21/+20
Currently the srcline sort key compares ip rather than srcline info. I guess this was done for performance reasons, to avoid running the external addr2line utility. Now that we have implemented the functionality internally, use the srcline info when comparing hist entries. Also consistently print the "??:0" string for an unknown srcline rather than printing the ip. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
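The behavioral difference can be illustrated with a small sketch: when entries are keyed by srcline instead of ip, two addresses on the same source line collapse into one group, and entries without srcline info share the "??:0" key. This is Python, not perf's C; the entry layout, the addresses, and the `group_by_srcline` helper are invented for the example.

```python
UNKNOWN_SRCLINE = "??:0"


def srcline_key(entry):
    """Sort key: the srcline string, or "??:0" when it is unknown."""
    return entry.get("srcline") or UNKNOWN_SRCLINE


def group_by_srcline(entries):
    """Group entry addresses by srcline, ordered by the srcline string."""
    groups = {}
    for entry in entries:
        groups.setdefault(srcline_key(entry), []).append(entry["ip"])
    return dict(sorted(groups.items()))


# Invented sample entries: two addresses share main.c:10, one has no info.
entries = [
    {"ip": 0x400100, "srcline": "main.c:10"},
    {"ip": 0x400104, "srcline": "main.c:10"},
    {"ip": 0x400200, "srcline": None},
    {"ip": 0x400050, "srcline": "util.c:3"},
]

for srcline, ips in group_by_srcline(entries).items():
    print(srcline, [hex(ip) for ip in ips])
```

Comparing by ip would have kept 0x400100 and 0x400104 as separate entries even though both map to main.c:10.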
2013-10-09perf tools: Implement addr2line directly using libbfdRoberto Vitillo2-0/+171
When the srcline sort key is used, the external addr2line utility needs to be run for each hist entry to get the srcline info. This can take quite a long time if one has a huge perf.data file. So rather than executing the external utility, implement it internally and just call it. We can do it since we've linked with libbfd already. Signed-off-by: Roberto Agostino Vitillo <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Use a2l_data struct instead of static globals ] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-10-09perf tools: Save failed result of get_srcline()Namhyung Kim3-2/+10
Some dso's lack srcline info, so there's no point in retrying on them. Just save the failure status and skip them. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
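Remembering a failed lookup is a form of negative caching. The sketch below shows the idea in Python; it is not perf's C code. The `SrclineResolver` class, its `lookup` callback, and the sample path are hypothetical, and the policy of giving up on a dso after its first failure is an assumption of this sketch.

```python
class SrclineResolver:
    """Wrap an expensive lookup and skip dsos that are known to fail."""

    def __init__(self, lookup):
        self.lookup = lookup       # callable(dso, addr) -> str or None
        self.failed_dsos = set()   # dsos for which a lookup has failed
        self.lookups_done = 0      # how often the expensive path ran

    def get_srcline(self, dso, addr):
        if dso in self.failed_dsos:
            return None            # known to lack srcline info: skip
        self.lookups_done += 1
        srcline = self.lookup(dso, addr)
        if srcline is None:
            self.failed_dsos.add(dso)  # save the failure for next time
        return srcline


# A lookup that never finds anything, standing in for a stripped dso.
resolver = SrclineResolver(lambda dso, addr: None)
resolver.get_srcline("/hypothetical/libfoo.so", 0x1000)
resolver.get_srcline("/hypothetical/libfoo.so", 0x2000)
print(resolver.lookups_done)
```

The second call returns immediately without invoking the lookup, which is where the saving on large perf.data files would come from.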