aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2016-03-10perf stat: Add --metric-only support for -AAndi Kleen2-9/+38
Add metric only support for -A too. This requires a new print function that prints the metrics in the right order. v2: Fix manpage v3: Simplify nrcpus computation Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf stat: Implement --metric-only modeAndi Kleen2-10/+205
Add a new mode to only print metrics. Sometimes we don't care about the raw values, just want the computed metrics. This allows more compact printing, so with -I each sample is only a single line. This also allows easier plotting and processing with other tools. The main target is with using --topdown, but it also works with -T and standard perf stat. A few metrics are not supported. To avoiding having to hardcode all the metrics in the code it uses a two pass approach: first compute dummy metrics and only print the headers in the print_metric callback. Then use the callback to print the actual values. There are some additional changes in the stat printout code to handle all metrics being on a single line. One issue is that the column code doesn't know in advance what events are not supported by the CPU, and it would be hard to find out as this could change based on dynamic conditions. That causes empty columns in some cases. The output can be fairly wide, often you may need more than 80 columns. Example: % perf stat -a -I 1000 --metric-only 1.001452803 frontend cycles idle insn per cycle stalled cycles per insn branch-misses of all branches 1.001452803 158.91% 0.66 2.39 2.92% 2.002192321 180.63% 0.76 2.08 2.96% 3.003088282 150.59% 0.62 2.57 2.84% 4.004369835 196.20% 0.98 1.62 3.79% 5.005227314 231.98% 0.84 1.90 4.71% v2: Lots of updates. v3: Use slightly narrower columns v4: Add comment Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf stat: Document CSV format in manpageAndi Kleen1-0/+23
With all the recently added fields in the perf stat CSV output we should finally document them in the man page. Do this here. v2: Fix fields in documentation (Jiri) v3: fix order of fields again (Jiri) v4: Change order again. v5: Document more fields (Jiri) v6: Move time stamp first v7: More fixes (Jiri) Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf hists browser: Check sort keys before hot key actionsNamhyung Kim1-0/+9
The context menu in TUI hists browser checks corresponding sort keys when creating the menu item. But hotkey actions lacks these checks so it can filter using incorrect info. For example, default sort key of 'perf top' doesn't contain 'comm' or 'pid' sort key so each hist entry's thread info is not reliable. Thus it should prohibit using thread filter on 't' key. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf hists browser: Allow thread filtering for comm sort keyNamhyung Kim1-10/+32
The commit 2eafd410e669 ("perf hists browser: Only 'Zoom into thread' only when sort order has 'pid'") disabled thread filtering in hist browser for the default sort key. However the he->thread is still valid even if 'pid' sort key is not given. Only thing it should not use is the pid (or tid) of the thread. So allow to filter by thread when 'comm' sort key is given and show pid only if 'pid' sort key is given. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Add sort__has_comm variableNamhyung Kim2-0/+4
The sort__has_comm variable is to check whether the comm sort key is given. This is necessary to support thread filtering in the TUI hists browser later. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Recalc total periods using top-level entries in hierarchyNamhyung Kim1-10/+34
When hierarchy mode is enabled, each entry in a hierarchy level shares the period. IOW an upper level entry's period is the sum of lower level entries. Thus perf uses only one of them to calculate the total period of hists. It was lowest-level (leaf) entries but it has a problem when it comes to filters. If a filter is applied, entries in the same level will be filtered or not. But upper level entries still have period of their sum including filtered one. So total sum of upper level entries will not be same as sum of lower level entries. This resulted in entries having more than 100% of overhead and it can be produced using perf top with filter(s). Reported-and-Tested-by: Jiri Olsa <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Remove nr_sort_keys fieldNamhyung Kim3-31/+0
The nr_sort_keys field is to carry the number of sort entries in a hpp_list or hists to determine the depth of indentation of a hist entry. As it's only used in hierarchy mode and now we have used nr_hpp_node for this reason, there's no need to keep it anymore. Let's get rid of it. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf hists browser: Cleanup hist_browser__fprintf_hierarchy_entry()Namhyung Kim1-14/+8
The hist_browser__fprintf_hierarchy_entry() if to dump current output into a file so it needs to be sync-ed with the corresponding function hist_browser__show_hierarchy_entry(). So use hists->nr_hpp_node to indent width and use first fmt_node to print overhead columns instead of checking whether it's a sort entry (or dynamic entry). Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Remove hist_entry->fmt fieldNamhyung Kim1-1/+0
It's not used anymore and the output format is accessed by the hpp_list pointer instead when hierarchy is enabled. Let's get rid of it. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Fix command line filters in hierarchy modeNamhyung Kim1-3/+97
When a command-line filter is applied in hierarchy mode, output is broken especially when filtering on lower level. The higher level entries doesn't show up so it's hard to see the results. Also it needs to handle multi sort keys in a single hierarchy level. Before: $ perf report --hierarchy -s 'cpu,{dso,comm}' --comms swapper --stdio ... # Overhead CPU / Shared Object+Command # ........... ........................... # 13.79% [kernel.vmlinux] swapper 31.71% 000 13.80% [kernel.vmlinux] swapper 0.43% [e1000e] swapper 11.89% [kernel.vmlinux] swapper 9.18% [kernel.vmlinux] swapper After: # Overhead CPU / Shared Object+Command # ........... ............................... # 33.09% 003 13.79% [kernel.vmlinux] swapper 31.71% 000 13.80% [kernel.vmlinux] swapper 0.43% [e1000e] swapper 21.90% 002 11.89% [kernel.vmlinux] swapper 13.30% 001 9.18% [kernel.vmlinux] swapper Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Add more sort entry check functionsNamhyung Kim2-31/+23
Those functions are for checkinf if a given perf_hpp_fmt is a filter-related sort entry. With hierarchy mode, it needs to check filters on the hist entries with its own hpp format list. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf tools: Fix hist_entry__filter() for hierarchyNamhyung Kim1-7/+21
When hierarchy mode is enabled each output format is in a separate hpp list. So when applying a filter it should check all formats in the list. Currently it only checks a single ->fmt field which was not set properly. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10perf jitdump: Build only on supported archsJiri Olsa7-6/+19
Build jitdump only on architectures defined in util/genelf.h file, to avoid breaking the build on such arches. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Colin Ian King <[email protected]> Cc: David Ahern <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: He Kuang <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-10tools lib traceevent: Add '~' operation within arg_num_eval()Steven Rostedt1-0/+6
When evaluating values for print flags, if the value included a '~' operator, the parsing would fail. This broke kmalloc's parsing of: __print_flags(REC->gfp_flags, "|", {(unsigned long)((((((( gfp_t)(0x400000u|0x2000000u)) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( gfp_t)0x20000u)) | (( gfp_t)0x02u)) | (( gfp_t)0x08u)) | (( gfp_t)0x4000u) | (( gfp_t)0x10000u) | (( gfp_t)0x1000u) | (( gfp_t)0x200u)) & ~(( gfp_t)0x2000000u)) ^ | here Signed-off-by: Steven Rostedt <[email protected]> Reported-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: David Ahern <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-09perf tools: Omit unnecessary cast in perf_pmu__parse_scaleJiri Olsa1-2/+2
There's no need to use a const char pointer, we can used char pointer from the beginning and omit the unnecessary cast. Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-09perf tools: Pass perf_hpp_list all the way through setup_sort_listJiri Olsa1-18/+26
Pass perf_hpp_list all the way through setup_sort_list so that the sort entry can be added on the arbitrary list. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-09perf tools: Fix perf script python database export crashChris Phlipot1-4/+2
Remove the union in evsel so that the database id and priv pointer can be used simultainously without conflicting and crashing. Detailed Description for the fixed bug follows: perf script crashes with a segmentation fault on user space tool version 4.5.rc7.ge2857b when using the python database export API. It works properly in 4.4 and prior versions. the crash fist appeared in: cfc8874a4859 ("perf script: Process cpu/threads maps") How to reproduce the bug: Remove any temporary files left over from a previous crash (if you have already attemped to reproduce the bug): $ rm -r test_db-perf-data $ dropdb test_db $ perf record timeout 1 yes >/dev/null $ perf script -s scripts/python/export-to-postgresql.py test_db Stack Trace: Program received signal SIGSEGV, Segmentation fault. __GI___libc_free (mem=0x1) at malloc.c:2929 2929 malloc.c: No such file or directory. (gdb) bt at util/stat.c:122 argv=<optimized out>, prefix=<optimized out>) at builtin-script.c:2231 argc=argc@entry=4, argv=argv@entry=0x7fffffffdf70) at perf.c:390 at perf.c:451 Signed-off-by: Chris Phlipot <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Peter Zijlstra <[email protected]> Fixes: cfc8874a4859 ("perf script: Process cpu/threads maps") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-09perf jitdump: DWARF is also neededArnaldo Carvalho de Melo2-5/+8
While building on a Docker container for ubuntu and installing package by package one ends up with: MKDIR /tmp/build/util/ CC /tmp/build/util/genelf.o util/genelf.c:22:19: fatal error: dwarf.h: No such file or directory #include <dwarf.h> ^ compilation terminated. mv: cannot stat '/tmp/build/util/.genelf.o.tmp': No such file or directory Because the jitdump code needs the DWARF related development packages to be installed. So make it dependent on that so that the build can succeed without jitdump support. Cc: Adrian Hunter <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-09perf bench mem: Prepare the x86-64 build for upstream memcpy_mcsafe() changesIngo Molnar1-0/+5
The following upcoming upstream commit: 92b0729c34ca ("x86/mm, x86/mce: Add memcpy_mcsafe()") Adds _ASM_EXTABLE_FAULT(), which is not available in user-space and breaks the build. We don't really need _ASM_EXTABLE_FAULT() in user-space, so simply wrap it to nothing. Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf report: Use hierarchy hpp list on gtkNamhyung Kim1-22/+33
Now hpp formats are linked using perf_hpp_list_node when hierarchy is enabled. Like in stdio, use this info to print entries with multiple sort keys in a single hierarchy properly. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists browser: Use hierarchy hpp listNamhyung Kim1-36/+45
Now hpp formats are linked using perf_hpp_list_node when hierarchy is enabled. Like in stdio, use this info to print entries with multiple sort keys in a single hierarchy properly. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf report: Use hierarchy hpp list on stdioNamhyung Kim1-46/+57
Now hpp formats are linked using perf_hpp_list_node when hierarchy is enabled. Use this info to print entries with multiple sort keys in a single hierarchy properly. For example, the below example shows using 4 sort keys with 2 levels. $ perf report --hierarchy -s '{prev_pid,prev_comm},{next_pid,next_comm}' \ --percent-limit 1 -i perf.data.sched ... # Overhead prev_pid+prev_comm / next_pid+next_comm # ........... ....................................... # 22.36% 0 swapper/0 9.48% 17773 transmission-gt 5.25% 109 kworker/0:1H 1.53% 6524 Xephyr 21.39% 17773 transmission-gt 9.52% 0 swapper/0 9.04% 0 swapper/2 1.78% 0 swapper/3 Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists: Fix indent for multiple hierarchy sort keyNamhyung Kim4-28/+23
When multiple sort keys are used in a single hierarchy, it should indent using number of hierarchy levels instead of number of sort keys. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists: Support multiple sort keys in a hierarchy levelNamhyung Kim1-10/+32
This implements having multiple sort keys in a single hierarchy level. Originally only single sort key is supported for each level, but now using the group syntax with '{ }', it can set more than one sort key in one level. Note that now it needs to quote in order to prevent shell interpretation. For example: $ perf report --hierarchy -s '{comm,dso},sym' ... # Overhead Command / Shared Object / Symbol # .............. .......................................... # 48.67% swapper [kernel.vmlinux] 34.42% [k] intel_idle 1.30% [k] __tick_nohz_idle_enter 1.03% [k] cpuidle_reflect 8.87% firefox libpthread-2.22.so 6.60% [.] __GI___libc_recvmsg 1.18% [.] pthread_cond_signal@@GLIBC_2.3.2 1.09% [.] 0x000000000000ff4b 6.11% Xorg libc-2.22.so 5.27% [.] __memcpy_sse2_unaligned In the above example, the command name and the shared object name are shown on the same line but the symbol name is on the different line. Since the first two are grouped by '{}', they are in the same level. Suggested-and-Tested=by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists: Use own hpp_list for hierarchy modeNamhyung Kim7-73/+103
Now each hists has its own hpp lists in hierarchy. So instead of having a pointer to a single perf_hpp_fmt in a hist entry, make it point the hpp_list for its level. This will be used to support multiple sort keys in a single hierarchy level. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists: Introduce perf_hpp__setup_hists_formats()Namhyung Kim4-0/+118
The perf_hpp__setup_hists_formats() is to build hists-specific output formats (and sort keys). Currently it's only used in order to build the output format in a hierarchy with same sort keys, but it could be used with different sort keys in non-hierarchy mode later. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf stat: Document --detailed optionBorislav Petkov1-0/+8
I'm surprised this remained undocumented since at least 2011. And it is actually a very useful switch, as Steve and I came to realize recently. Add the text from 2cba3ffb9a9d ("perf stat: Add -d -d and -d -d -d options to show more CPU events") which added the incrementing aspect to -d. Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Thomas Gleixner <[email protected]> Fixes: 2cba3ffb9a9d ("perf stat: Add -d -d and -d -d -d options to show more CPU events") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf hists: Add level field to struct perf_hpp_fmtNamhyung Kim2-33/+42
The level field is to distinguish levels in the hierarchy mode. Currently each column (perf_hpp_fmt) has a different level. Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf tools: Use 64-bit shifts with (TSC) time conversionAdrian Hunter2-2/+2
Commit b9511cd761fa ("perf/x86: Fix time_shift in perf_event_mmap_page") altered the time conversion algorithms documented in the perf_event.h header file, to use 64-bit shifts. That was done to make the code more future-proof (i.e. some time in the future a 32-bit shift could be allowed). Reflect those changes in perf tools. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf jit: Move clockid validationAdrian Hunter2-24/+23
Move clockid validation into jit_process() so it can later be made conditional. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf jit: Let jit_process() return errorsAdrian Hunter2-6/+16
In preparation for moving clockid validation into jit_process(). Previously a return value of zero meant the processing had been done and non-zero meant either the processing was not done (i.e. not the jitdump file mmap event) or an error occurred. Change it so that zero means the processing was not done, one means the processing was done and successful, and negative values are an error. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf session: Simplify tool stubsAdrian Hunter1-33/+7
Some of the stubs are identical so just have one function for them. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf inject: Hit all DSOs for AUX data in JIT and other casesAdrian Hunter1-4/+8
Currently, when injecting build ids, if there is AUX data then 'perf inject' hits all DSOs because it is not known which DSOs the trace data would hit. That needs to be done for JIT injection also, and in fact there is no reason to distinguish what kind of injection is being done. That is, any time there is AUX data and the HEADER_BUID_ID feature flag is set, and the AUX data is not being processed, then hit all DSOs. This patch does that. Signed-off-by: Adrian Hunter <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-08perf tools: Explicitly declare inc_group_count as a void functionColin Ian King1-1/+1
The return type is not defined, so it defaults to int, however, the function is not returning anything, so this is clearly not correct. Make it a void function. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-03-03perf stat: Check for frontend stalled for metricsAndi Kleen3-1/+10
Add an extra check for frontend stalled in the metrics. This avoids an extra column for the --metric-only case when the CPU does not support frontend stalled. v2: Add separate init function Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03tools/power turbostat: fix various build warningsColin Ian King1-4/+4
When building with gcc 6 we're getting various build warnings that just require some trivial function declaration and call fixes: turbostat.c: In function ‘dump_cstate_pstate_config_info’: turbostat.c:1973:1: warning: type of ‘family’ defaults to ‘int’ dump_cstate_pstate_config_info(family, model) turbostat.c:1973:1: warning: type of ‘model’ defaults to ‘int’ turbostat.c: In function ‘get_tdp’: turbostat.c:2145:8: warning: type of ‘model’ defaults to ‘int’ double get_tdp(model) turbostat.c: In function ‘perf_limit_reasons_probe’: turbostat.c:2259:6: warning: type of ‘family’ defaults to ‘int’ void perf_limit_reasons_probe(family, model) turbostat.c:2259:6: warning: type of ‘model’ defaults to ‘int’ Signed-off-by: Colin Ian King <[email protected]> Cc: Matt Fleming <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf tests: Initialize sa.sa_flagsColin Ian King1-0/+1
The sa_flags field is not being initialized, so a garbage value is being passed to sigaction. Initialize it to zero. Signed-off-by: Colin Ian King <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf test: Fix hists related entriesArnaldo Carvalho de Melo1-15/+22
That got broken by d3a72fd8187b ("perf report: Fix indentation of dynamic entries in hierarchy"), by using the evlist in setup_sorting() without checking if it is NULL, as done in some 'perf test' entries: $ find tools/ -name "*.c" | xargs grep 'setup_sorting(NULL);' tools/perf/tests/hists_output.c: setup_sorting(NULL); tools/perf/tests/hists_output.c: setup_sorting(NULL); tools/perf/tests/hists_output.c: setup_sorting(NULL); tools/perf/tests/hists_output.c: setup_sorting(NULL); tools/perf/tests/hists_output.c: setup_sorting(NULL); tools/perf/tests/hists_cumulate.c: setup_sorting(NULL); tools/perf/tests/hists_cumulate.c: setup_sorting(NULL); tools/perf/tests/hists_cumulate.c: setup_sorting(NULL); tools/perf/tests/hists_cumulate.c: setup_sorting(NULL); $ Fix it. Before: [root@jouet ~]# perf test <SNIP> 15: Test matching and linking multiple hists : FAILED! 16: Try 'import perf' in python, checking link problems : Ok 17: Test breakpoint overflow signal handler : Ok 18: Test breakpoint overflow sampling : Ok 19: Test number of exit event of a simple workload : Ok 20: Test software clock events have valid period values : Ok 21: Test object code reading : Ok 22: Test sample parsing : Ok 23: Test using a dummy software event to keep tracking : Ok 24: Test parsing with no sample_id_all bit set : Ok 25: Test filtering hist entries : FAILED! 26: Test mmap thread lookup : Ok 27: Test thread mg sharing : Ok 28: Test output sorting of hist entries : FAILED! 29: Test cumulation of child hist entries : FAILED! <SNIP> After the patch the above failed tests complete successfully. Acked-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Wang Nan <[email protected]> Fixes: d3a72fd8187b ("perf report: Fix indentation of dynamic entries in hierarchy") Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03tools lib traceevent: Fix output of %llu for 64 bit values read on 32 bit ↵Steven Rostedt (Red Hat)1-1/+1
machines When a long value is read on 32 bit machines for 64 bit output, the parsing needs to change "%lu" into "%llu", as the value is read natively. Unfortunately, if "%llu" is already there, the code will add another "l" to it and fail to parse it properly. Signed-off-by: Steven Rostedt <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03tools lib traceevent: Set int_array fields to NULL if freeing from errorSteven Rostedt (Red Hat)1-0/+3
Had a bug where on error of parsing __print_array() where the fields are freed after they were allocated, but since they were not set to NULL, the freeing of the arg also tried to free the already freed fields causing a double free. Fix process_hex() while at it. Signed-off-by: Steven Rostedt <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03tools lib traceevent: Fix time stamp rounding issueChaos.Chen1-0/+5
When rounding to microseconds, if the timestamp subsecond is between .999999500 and .999999999, it is rounded to .1000000, when it should instead increment the second counter due to the overflow. For example, if the timestamp is 1234.999999501 instead of seeing: 1235.000000 we see: 1234.1000000 Signed-off-by: Chaos.Chen <[email protected]> Cc: Andrew Morton <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ fixed incrementing "secs" instead of decrementing it ] Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf script: Fix double free on command_lineColin Ian King1-2/+2
The 'command_line' variable is free'd twice if db_export__branch_types() fails. To avoid this, defer the free'ing of 'command_line' to after this call so that the error return path will just free 'command_line' once. Signed-off-by: Colin Ian King <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: He Kuang <[email protected]> Cc: Javi Merino <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03tools build: Use .s extension for preprocessed assembler codeMasahiro Yamada1-1/+1
The "man gcc" says .i extension represents the file is C source code that should not be preprocessed. Here, .s should be used. For clarification, .c ---(preprocess)---> .i .S ---(preprocess)---> .s Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Aaro Koskinen <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Lukas Wunner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf stat: Support metrics in --per-core/socket modeAndi Kleen2-8/+63
Enable metrics printing in --per-core / --per-socket mode. We need to save the shadow metrics in a unique place. Always use the first CPU in the aggregation. Then use the same CPU to retrieve the shadow value later. Example output: % perf stat --per-core -a ./BC1s Performance counter stats for 'system wide': S0-C0 2 2966.020381 task-clock (msec) # 2.004 CPUs utilized (100.00%) S0-C0 2 49 context-switches # 0.017 K/sec (100.00%) S0-C0 2 4 cpu-migrations # 0.001 K/sec (100.00%) S0-C0 2 467 page-faults # 0.157 K/sec S0-C0 2 4,599,061,773 cycles # 1.551 GHz (100.00%) S0-C0 2 9,755,886,883 instructions # 2.12 insn per cycle (100.00%) S0-C0 2 1,906,272,125 branches # 642.704 M/sec (100.00%) S0-C0 2 81,180,867 branch-misses # 4.26% of all branches S0-C1 2 2965.995373 task-clock (msec) # 2.003 CPUs utilized (100.00%) S0-C1 2 62 context-switches # 0.021 K/sec (100.00%) S0-C1 2 8 cpu-migrations # 0.003 K/sec (100.00%) S0-C1 2 281 page-faults # 0.095 K/sec S0-C1 2 6,347,290 cycles # 0.002 GHz (100.00%) S0-C1 2 4,654,156 instructions # 0.73 insn per cycle (100.00%) S0-C1 2 947,121 branches # 0.319 M/sec (100.00%) S0-C1 2 37,322 branch-misses # 3.94% of all branches 1.480409747 seconds time elapsed v2: Rebase to older patches v3: Document shadow cpus. Fix aggr_get_id argument. Fix -A shadows (Jiri) Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf stat: Implement CSV metrics outputAndi Kleen2-5/+70
Now support CSV output for metrics. With the new output callbacks this is relatively straight forward by creating new callbacks. This allows to easily plot metrics from CSV files. The new line callback needs to know the number of fields to skip them correctly Example output before: % perf stat -x, true 0.200687,,task-clock,200687,100.00 0,,context-switches,200687,100.00 0,,cpu-migrations,200687,100.00 40,,page-faults,200687,100.00 730871,,cycles,203601,100.00 551056,,stalled-cycles-frontend,203601,100.00 <not supported>,,stalled-cycles-backend,0,100.00 385523,,instructions,203601,100.00 78028,,branches,203601,100.00 3946,,branch-misses,203601,100.00 After: % perf stat -x, true .502457,,task-clock,502457,100.00,0.485,CPUs utilized 0,,context-switches,502457,100.00,0.000,K/sec 0,,cpu-migrations,502457,100.00,0.000,K/sec 45,,page-faults,502457,100.00,0.090,M/sec 644692,,cycles,509102,100.00,1.283,GHz 423470,,stalled-cycles-frontend,509102,100.00,65.69,frontend cycles idle <not supported>,,stalled-cycles-backend,0,100.00,,,, 492701,,instructions,509102,100.00,0.76,insn per cycle ,,,,,0.86,stalled cycles per insn 97767,,branches,509102,100.00,194.578,M/sec 4788,,branch-misses,509102,100.00,4.90,of all branches or easier readable $ perf stat -x, -o x.csv true $ column -s, -t x.csv 0.490635 task-clock 490635 100.00 0.489 CPUs utilized 0 context-switches 490635 100.00 0.000 K/sec 0 cpu-migrations 490635 100.00 0.000 K/sec 45 page-faults 490635 100.00 0.092 M/sec 629080 cycles 497698 100.00 1.282 GHz 409498 stalled-cycles-frontend 497698 100.00 65.09 frontend cycles idle <not supported> stalled-cycles-backend 0 100.00 491424 instructions 497698 100.00 0.78 insn per cycle 0.83 stalled cycles per insn 97278 branches 497698 100.00 198.270 M/sec 4569 branch-misses 497698 100.00 4.70 of all branches Two new fields are added: metric value and metric name. v2: Split out function argument changes v3: Reenable metrics for real. v4: Fix wrong hunk from refactoring. v5: Remove extra "noise" printing (Jiri), but add it to the not counted case. Print empty metrics for not counted. v6: Avoid outputting metric on empty format. v7: Print metric at the end v8: Remove extra run, ena fields v9: Avoid extra new line for unsupported counters Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf record: Ensure return non-zero rc when mmap failWang Nan1-1/+4
perf_evlist__mmap_ex() can fail without setting errno (for example, fail in condition checking. In this case all syscall is success). If this happen, record__open() incorrectly returns 0. Force setting rc is a quick way to avoid this problem, or we have to follow all possible code path in perf_evlist__mmap_ex() to make sure there's at least one system call before returning an error. Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf record: Introduce record__finish_output() to finish a perf.dataWang Nan1-12/+25
Move code for finalizing 'perf.data' to record__finish_output(). It will be used by following commits to split output to multiple files. Signed-off-by: He Kuang <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Wang Nan <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf record: Extract synthesize code to record__synthesize()Wang Nan1-55/+70
Create record__synthesize(). It can be used to create tracking events for each perf.data after perf supporting splitting into multiple outputs. Signed-off-by: He Kuang <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Wang Nan <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-03perf record: Use WARN_ONCE to replace 'if' conditionWang Nan1-8/+7
Commits in a BPF patchkit will extract kernel and module synthesizing code into a separated function and call it multiple times. This patch replace 'if (err < 0)' using WARN_ONCE, makes sure the error message show one time. Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>