aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2015-10-26perf tools: Setup pager when printing usage and helpNamhyung Kim1-2/+13
It's annoying to see error or help message when command has many options like in perf record, report or top. So setup pager when print parser error or help message - it should be OK since no UI is enabled at the parsing time. The usage_with_options() already disables it by calling exit_browser() anyway. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ingo Molnar <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-26perf report: Rename to --show-cpu-utilizationNamhyung Kim3-2/+5
So that it can be more consistent with other --show-* options. The old name (--showcpuutilization) is provided only for compatibility. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-26perf tools: Improve ambiguous option help messageNamhyung Kim1-9/+8
Currently if an option name is ambiguous it only prints first two matched option names but no help. It'd be better it could show all possible names and help messages too. Before: $ perf report --show Error: Ambiguous option: show (could be --show-total-period or --show-ref-call-graph) Usage: perf report [<options>] After: $ perf report --show Error: Ambiguous option: show (could be --show-total-period or --show-ref-call-graph) Usage: perf report [<options>] -n, --show-nr-samples Show a column with the number of samples --showcpuutilization Show sample percentage for different cpu modes -I, --show-info Display extended information about perf.data file --show-total-period Show a column with the sum of periods --show-ref-call-graph Show callgraph from reference event Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ingo Molnar <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-26Merge 4.3-rc7 into usb-nextGreg Kroah-Hartman1-0/+9
We want the USB and other fixes in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2015-10-23perf tools: Provide help for subset of optionsArnaldo Carvalho de Melo1-9/+33
Some tools have a lot of options, so, providing a way to show help just for some of them may come handy: $ perf report -h --tui Usage: perf report [<options>] --tui Use the TUI interface $ perf report -h --tui --showcpuutilization -b -c Usage: perf report [<options>] -b, --branch-stack use branch records for per branch histogram filling -c, --comms <comm[,comm...]> only consider symbols in these comms --showcpuutilization Show sample percentage for different cpu modes --tui Use the TUI interface $ Using it with perf bash completion is also handy, just make sure you source the needed file: $ . ~/git/linux/tools/perf/perf-completion.sh Then press tab/tab after -- to see a list of options, put them after -h and only the options chosen will have its help presented: $ perf report -h -- --asm-raw --demangle-kernel --group --kallsyms --pretty --stdio --branch-history --disassembler-style --gtk --max-stack --showcpuutilization --symbol-filter --branch-stack --dsos --header --mem-mode --show-info --symbols --call-graph --dump-raw-trace --header-only --modules --show-nr-samples --symfs --children --exclude-other --hide-unresolved --objdump --show-ref-call-graph --threads --column-widths --fields --ignore-callees --parent --show-total-period --tid --comms --field-separator --input --percentage --socket-filter --tui --cpu --force --inverted --percent-limit --sort --verbose --demangle --full-source-path --itrace --pid --source --vmlinux $ perf report -h --socket-filter Usage: perf report [<options>] --socket-filter <n> only show processor socket that match with this filter Suggested-by: Ingo Molnar <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-23perf tools: Show tool command line options orderedArnaldo Carvalho de Melo1-0/+48
When asking for a listing of the options, be it using -h or when an unknown option is passed, order it by one-letter options, then the ones having just long names. Suggested-by: Ingo Molnar <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22Merge tag 'usb-for-v4.4' of ↵Greg Kroah-Hartman1-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next Felipe writes: usb: patches for v4.4 merge window This pull request is large with a total of 136 non-merge commits. Because of its size, we will only describe the big things in broad terms. Many will be happy to know that dwc3 is now almost twice as fast after some profiling and speed improvements. Also in dwc3, John Youn from Synopsys added support for their new DWC USB3.1 IP Core and the HAPS platform which can be used to validate it. A series of patches from Robert Baldyga cleaned up uses of ep->driver_data as a flag for "claimed endpoint" in favor of the new ep->claimed flag. Sudip Mukherjee fixed a ton of really old problems on the amd5536udc driver. That should make a few people happy. Heikki Krogerus worked on converting dwc3 to the unified device property interface. Together with these, there's a ton of non-critical fixes, typos and stuff like that. Signed-off-by: Felipe Balbi <[email protected]>
2015-10-22perf annotate: Don't die() when finding an invalid config optionArnaldo Carvalho de Melo1-3/+3
The perf_config() infrastructure we inherited from git calls die() when the provided config callback returns -1, meaning some key in a config section is unexpected, that seems ok for a stdio based tool, but in --tui we end up messing up the output, so just tell the user about the error, wait for a keystroke and return 0, being more resilient and proceeding with what we managed to parse. That die() needs to die, tho :-) Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf ui tui: Register the error callbacks before initializing the widgetsArnaldo Carvalho de Melo1-4/+4
I.e. we want to tell the user about errors found during, for instance, the ui_browser initialization, so that a call to ui__warning() appears as a window waiting for a key to be pressed. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf annotate: Fix 'annotate.use_offset' config variable usageNamhyung Kim1-1/+1
The annotate__configs should be sorted so that it can use bsearch(3). However commit 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") added a new config item at the end. This resulted in the 'annotate.use_offset' config variable cannot be found and perf terminated like below: $ perf report bad config file line 6 in ~/.perfconfig Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Martin Liška <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Taeung Song <[email protected]> Fixes: 0c4a5bcea460 ("perf annotate: Display total number of samples with --show-total-period") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf tools: Improve call graph documents and help messagesNamhyung Kim6-30/+62
The --call-graph option is complex so we should provide better guide for users. Also change help message to be consistent with config option names. Now perf top will show help like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <record_mode[,record_size],print_type,threshold[,print_limit],order,sort_key[,branch]> setup and enables call-graph (stack chain/backtrace): record_mode: call graph recording mode (fp|dwarf|lbr) record_size: if record_mode is 'dwarf', max size of stack recording (<bytes>) default: 8192 (bytes) print_type: call graph printing style (graph|flat|fractal|none) threshold: minimum call graph inclusion threshold (<percent>) print_limit: maximum number of call graph entry (<number>) order: call graph order (caller|callee) sort_key: call graph sort key (function|address) branch: include last branch info to call graph (branch) Default: fp,graph,0.5,caller,function Requested-by: Ingo Molnar <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf tools: Defaults to 'caller' callchain order only if --children is enabledNamhyung Kim5-1/+9
The caller callchain order is useful with --children option since it can show 'overview' style output, but other commands which don't use --children feature like 'perf script' or even 'perf report/top' without --children are better to keep callee order. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Brendan Gregg <[email protected]> Acked-by: Frederic Weisbecker <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf top: Support call-graph display options alsoNamhyung Kim4-10/+62
Currently 'perf top --call-graph' option is same as 'perf record'. But 'perf top' also need to receive display options in 'perf report'. To do that, change parse_callchain_report_opt() to allow record options too. Now perf top can receive display options like below: $ perf top --call-graph Error: option `call-graph' requires a value Usage: perf top [<options>] --call-graph <mode[,dump_size],output_type,min_percent[,print_limit],call_order[,branch]> setup and enables call-graph (stack chain/backtrace) recording: fp dwarf lbr, output_type (graph, flat, fractal, or none), min percent threshold, optional print limit, callchain order, key (function or address), add branches $ perf top --call-graph callee,graph,fp Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22perf tools: Move callchain help messages to callchain.hNamhyung Kim3-10/+20
These messages will be used by 'perf top' in the next patch. Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Chandler Carruth <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22tools lib traceevent: Support %ps/%pSScott Wood1-2/+2
Commits such as 65dd297ac25565 ("xfs: %pF is only for function pointers") caused a regression because pretty_print() didn't support %ps/%pS. The current %pf/%pF implementation in pretty_print() is what %ps/%pS is supposed to do, so use the same code for %ps/%pS. Addressing the incorrect %pf/%pF implementation is beyond the scope of this patch. Signed-off-by: Scott Wood <[email protected]> Acked-by: Steven Rostedt <[email protected]> Cc: Dave Chinner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-22ACPICA: iASL: General cleanup of the file suffix #definesBob Moore1-1/+1
ACPICA commit bed456ed2976bdaafdef406b982fdf6c539befc0 Removed some extraneous defines, reordered others. Link: https://github.com/acpica/acpica/commit/bed456ed Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Lv Zheng <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2015-10-21perf annotate: Add debug message for out of bounds sampleArnaldo Carvalho de Melo1-1/+4
Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-21perf evsel: Print branch filter state with -vvAndi Kleen1-0/+1
Add a missing field to the perf_event_attr debug output. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Print it between config2 and sample_regs_user (peterz)] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-20perf cpu_map: Fix core dump caused by per-socket/core system-wide statKan Liang1-1/+1
Perf will core dump if --per-socket/core -a are applied for perf stat. The root cause is that cpu_map__build_map set refcnt of evlist's cpu_map to 1. It should set refcnt for the newly created cpu_map, not evlist's cpu_map. Here is the example: # perf stat -e cycles --per-socket -a sleep 1 Performance counter stats for 'system wide': S0 36 30,196,257 cycles S1 28 15,823,536 cycles 1.001126828 seconds time elapsed *** Error in `./perf': corrupted double-linked list: 0x00000000021f9090 *** ======= Backtrace: ========= /lib64/libc.so.6[0x3002e7bbe7] /lib64/libc.so.6[0x3002e7d2b5] ./perf(perf_evsel__delete+0x28)[0x485bdd] ./perf[0x4800e8] ./perf(perf_evlist__delete+0x5e)[0x482cd5] ./perf(cmd_stat+0xf25)[0x432328] ./perf[0x4768e0] ./perf[0x476ad6] ./perf[0x476b41] ./perf(main+0x1d0)[0x476db2] /lib64/libc.so.6(__libc_start_main+0xf5)[0x3002e21b45] ./perf[0x4202c5] Signed-off-by: Kan Liang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-20tools lib traceevent: update KVM pluginPaolo Bonzini1-8/+17
The format of the role word has changed through the years and the plugin was never updated; some VMX exit reasons were missing too. Signed-off-by: Paolo Bonzini <[email protected]> Acked-by: Steven Rostedt <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-20perf build: Add fixdep to .gitignoreYunlong Song1-0/+1
Commit 7c422f5572667fef0db38d2046ecce69dcf0afc8 ("tools build: Build fixdep helper from perf and basic libs") dynamically creates fixdep during the perf building. Add it to .gitignore. Signed-off-by: Yunlong Song <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 7c422f557266 ("tools build: Build fixdep helper from perf and basic libs") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller4-1/+13
Conflicts: drivers/net/usb/asix_common.c net/ipv4/inet_connection_sock.c net/switchdev/switchdev.c In the inet_connection_sock.c case the request socket hashing scheme is completely different in net-next. The other two conflicts were overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2015-10-20perf record: Add ability to sample call branchesStephane Eranian2-0/+2
This patch add a new branch type sampling filter to perf record. It is named 'call' and maps to PERF_SAMPLE_BRANCH_CALL. It samples direct call branches only, unlike 'any_call' which includes indirect calls as well. $ perf record -j call -e cycles ..... The man page is updated accordingly. Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-10-19perf bench: Use named initializers in the trailer tooArnaldo Carvalho de Melo1-2/+2
To avoid this splat with gcc 4.4.7: cc1: warnings being treated as errors bench/mem-functions.c:273: error: missing initializer bench/mem-functions.c:273: error: (near initialization for ‘memcpy_functions[4].desc’) bench/mem-functions.c:366: error: missing initializer bench/mem-functions.c:366: error: (near initialization for ‘memset_functions[4].desc’) Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf script: Check output fields only for samplesJiri Olsa1-1/+4
There's no need to check sampling output fields for events without perf_event_attr::sample_type field set. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf cpu_map: Add data arg to cpu_map__build_map callbackJiri Olsa5-15/+27
Adding data arg to cpu_map__build_map callback, so we could pass data along to the callback. It'll be needed in following patches to retrieve topology info from perf.data. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf cpu_map: Make cpu_map__build_map globalJiri Olsa2-2/+4
We'll need to call it from perf stat in the stat_script patchkit Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf stat: Add AGGR_UNSET modeJiri Olsa3-0/+7
Adding AGGR_UNSET mode, so we could distinguish unset aggr_mode in following patches. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf stat: Rename perf_stat struct into perf_stat_evselJiri Olsa3-8/+8
It's used as the perf_evsel::priv data, so the name suits better. Also we'll need the perf_stat name free for more generic struct. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf help: Change 'usage' to 'Usage' for consistencyYunlong Song2-3/+3
Capitalize 'usage' to make it consistent with all the other 'Usage' in the codes, e.g., usage_builtin. Signed-off-by: Yunlong Song <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ramkumar Ramachandra <[email protected]> Cc: Sriram Raghunathan <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Run benchmarks, don't test themIngo Molnar1-4/+4
So right now we output this text: memcpy: Benchmark for memcpy() functions memset: Benchmark for memset() functions all: Test all memory access benchmarks But the right verb to use with benchmarks is to 'run' them, not 'test' them. So change this (and all similar texts) to: memcpy: Benchmark for memcpy() functions memset: Benchmark for memset() functions all: Run all memory access benchmarks Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Rename 'routine' to 'function'Ingo Molnar2-38/+38
So right now there's a somewhat inconsistent mess of the benchmarking code and options sometimes calling benchmarked functions 'functions', sometimes calling them 'routines'. Name them 'functions' consistently. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Updated perf-bench man page, pointed out by David Ahern ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Harmonize all the -l/--nr_loops optionsIngo Molnar4-23/+23
We have three benchmarking subsystems that specify some sort of 'number of loops' parameter - but all of them do it inconsistently: numa: -l/--nr_loops sched messaging: -l/--loops mem memset/memcpy: -i/--iterations Harmonize them to -l/--nr_loops by picking the numa variant - which is also the most likely one to have existing scripting which we don't want to break. Plus improve the parameter help texts to indicate the default value for the nr_loops variable to keep users from guessing ... Also propagate the naming to internal variables. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Let the harmonisation reach the perf-bench man page as well ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Reorganize the code a bitIngo Molnar1-19/+19
Reorder functions a bit, so that we synchronize the layout of the memcpy() and memset() portions of the code. This improves the code, especially after we'll add an strlcpy() variant as well. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Improve user visible stringsIngo Molnar2-15/+20
- fix various typos in user visible output strings - make the output consistent (wrt. capitalization and spelling) - offer the list of routines to benchmark on '-r help'. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Fix 'length' vs. 'size' naming confusionIngo Molnar2-50/+50
So 'perf bench mem memcpy/memset' consistently uses 'len' and 'length' for buffer sizes - while it's really a memory buffer size. (strings have length.) Rename all affected variables. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Update perf-bench man page ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Rename 'routine' to 'routine_str'Ingo Molnar1-6/+6
So bench/mem-functions.c has a 'routine' name for the routines parameter string, but a 'length_str' name for the length parameter string. We also have another entity named 'routine': 'struct routine'. This is inconsistent and confusing: rename 'routine' to 'routine_str'. Also fix typos in the --routine help text. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench mem: Change 'cycle' to 'cycles'Ingo Molnar2-30/+30
So 'perf bench mem memset/memcpy' has a CPU cycles measurement method, but calls it 'cycle' (singular) throughout the code, which makes it harder to read. Rename all related functions, variables and options to a plural 'cycles' nomenclature. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ s/--cycle/--cycles/g in perf-bench man page ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: List output formatting options on 'perf bench -h'Ingo Molnar1-1/+1
So 'perf bench -h' is not very helpful when printing the help line about the output formatting options: -f, --format <default> Specify format style There are two output format styles, 'default' and 'simple', so improve the help text to: -f, --format <default|simple> Specify the output formatting style Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Removed leftovers from the mem-functions.c rename ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Remove the prefaulting complication from 'perf bench mem mem*'Ingo Molnar2-112/+50
So 'perf bench mem memcpy/memset' has elaborate code to measure memcpy()/memset() performance both with freshly allocated buffers (which includes initial page fault overhead) and with preallocated buffers. But the thing is, the resulting bandwidth results are mostly meaningless, because page faults dominate so much of the cost. It might make sense to measure cache cold vs. cache hot performance, but the code does not do this. So remove this complication, and always prefault the ranges before using them. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Remove --no-prefault, --only-prefault from docs, noticed by David Ahern ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Rename 'mem-memcpy.c' => 'mem-functions.c'Ingo Molnar2-1/+1
So mem-memcpy.c started out as a simple memcpy() benchmark, then it grew memset() functionality and now I plan to add string copy benchmarks as well. This makes the file name a misnomer: rename it to the more generic mem-functions.c name. Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ The "rename" was introducing __unused, wasn't removing the old file, and didn't update tools/perf/bench/Build, fix it ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Eliminate unused argument from bench_mem_common()Ingo Molnar1-7/+4
Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Default to all routines in 'perf bench mem'Ingo Molnar1-2/+2
So few people know that the --routine option to 'perf bench memcpy/memset' exists, and would not know that it's capable of testing the kernel's memcpy/memset implementations. Furthermore, 'perf bench mem all' will not run all routines: vega:~> perf bench mem all # Running mem/memcpy benchmark... Routine default (Default memcpy() provided by glibc) # Copying 1MB Bytes ... 894.454383 MB/Sec 3.844734 GB/Sec (with prefault) # Running mem/memset benchmark... Routine default (Default memset() provided by glibc) # Copying 1MB Bytes ... 1.220703 GB/Sec 9.042245 GB/Sec (with prefault) Because misleadingly the 'all' refers to 'all sub-benchmarks', not 'all sub-benchmarks and routines'. Fix all this by making the memcpy/memset routine to default to 'all', which results in all the benchmarks being run: triton:~> perf bench mem all # Running mem/memcpy benchmark... Routine default (Default memcpy() provided by glibc) # Copying 1MB Bytes ... 1.448906 GB/Sec 4.957170 GB/Sec (with prefault) Routine x86-64-unrolled (unrolled memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.614153 GB/Sec 4.379204 GB/Sec (with prefault) Routine x86-64-movsq (movsq-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.570036 GB/Sec 4.264465 GB/Sec (with prefault) Routine x86-64-movsb (movsb-based memcpy() in arch/x86/lib/memcpy_64.S) # Copying 1MB Bytes ... 1.788576 GB/Sec 6.554111 GB/Sec (with prefault) # Running mem/memset benchmark... Routine default (Default memset() provided by glibc) # Copying 1MB Bytes ... 2.082223 GB/Sec 9.126752 GB/Sec (with prefault) Routine x86-64-unrolled (unrolled memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 5.710892 GB/Sec 8.346688 GB/Sec (with prefault) Routine x86-64-stosq (movsq-based memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 9.765625 GB/Sec 12.520032 GB/Sec (with prefault) Routine x86-64-stosb (movsb-based memset() in arch/x86/lib/memset_64.S) # Copying 1MB Bytes ... 9.668936 GB/Sec 12.682630 GB/Sec (with prefault) Signed-off-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf bench: Improve the 'perf bench mem memcpy' code readabilityIngo Molnar1-56/+45
- improve the readability of initializations - fix unnecessary double negations - fix ugly line breaks - fix other small details Signed-off-by: Ingo Molnar <[email protected]> Cc: Andrew Morton <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf test: Suppress libtraceevent warningsNamhyung Kim3-0/+20
Currently libtraceevent emits warning on unsupported event formats. However it'd be better to see them only -v option is given. To do that, it needs to override the warning() function which is used in the libtracevent. Thus add set_warning_routine() same as set_die_routine() and check the verbose flag in our warning routine. Before: # perf test 5 5: parse events tests : Warning: [kvmmmu:kvm_mmu_get_page] bad op token { Warning: [kvmmmu:kvm_mmu_sync_page] bad op token { Warning: [kvmmmu:kvm_mmu_unsync_page] bad op token { Warning: [kvmmmu:kvm_mmu_prepare_zap_page] bad op token { Warning: [kvmmmu:fast_page_fault] function is_writable_pte not defined ... Ok After: # perf test 5 5: parse events tests : Ok Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19perf test: Silence tracepoint event failuresNamhyung Kim3-3/+3
Currently, when 'perf test' is run by a normal user, it'll fail to access tracepoint events. The output becomes somewhat messy because it tries to be nice with long error messages and hints. IMHO this is not needed for 'perf test' by default and AFAIK 'perf test' uses pr_debug() rather than pr_err() for such messages so that one can use -v option to see further details on failed testcases if needed. Before: $ perf test 1: vmlinux symtab matches kallsyms : FAILED! 2: detect openat syscall event :Error: No permissions to read /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing' FAILED! 3: detect openat syscall event on all cpus :Error: No permissions to read /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing' FAILED! ... After: $ perf test 1: vmlinux symtab matches kallsyms : FAILED! 2: detect openat syscall event : FAILED! 3: detect openat syscall event on all cpus : FAILED! ... $ perf test -v 2 2: detect openat syscall event : --- start --- test child forked, pid 30575 Error: No permissions to read /sys/kernel/debug/tracing/events/syscalls/sys_enter_openat Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing' test child finished with -1 ---- end ---- detect openat syscall event: FAILED! Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-19selftests/powerpc: Run EBB tests only on POWER8Denis Kirjanov24-0/+56
EBB (Event Based Branches) are currently only available on POWER8, so we should skip them on other CPUs. I've found that at least one test loops forever on 970MP (cycles_with_freeze_test). Signed-off-by: Denis Kirjanov <[email protected]> [mpe: Minor change log editing, add skip to cpu_event_vs_ebb_test] Signed-off-by: Michael Ellerman <[email protected]>
2015-10-19Merge branch 'for-mingo' of ↵Ingo Molnar6-4/+20
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU updates from Paul E. McKenney: - Miscellaneous fixes. (Paul E. McKenney, Boqun Feng, Oleg Nesterov, Patrick Marlier) - Improvements to expedited grace periods. (Paul E. McKenney) - Performance improvements to and locktorture tests for percpu-rwsem. (Oleg Nesterov, Paul E. McKenney) - Torture-test changes. (Paul E. McKenney, Davidlohr Bueso) - Documentation updates. (Paul E. McKenney) Signed-off-by: Ingo Molnar <[email protected]>
2015-10-16Merge tag 'powerpc-4.3-4' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Re-enable CONFIG_SCSI_DH in our defconfigs - Remove unused os_area_db_id_video_mode - cxl: fix leak of IRQ names in cxl_free_afu_irqs() from Andrew - cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API from Andrew - cxl: fix leak of ctx->mapping when releasing kernel API contexts from Andrew - cxl: Workaround malformed pcie packets on some cards from Philippe - cxl: Fix number of allocated pages in SPA from Christophe Lombard - Fix checkstop in native_hpte_clear() with lockdep from Cyril - Panic on unhandled Machine Check on powernv from Daniel - selftests/powerpc: Fix build failure of load_unaligned_zeropad test * tag 'powerpc-4.3-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: selftests/powerpc: Fix build failure of load_unaligned_zeropad test powerpc/powernv: Panic on unhandled Machine Check powerpc: Fix checkstop in native_hpte_clear() with lockdep cxl: Fix number of allocated pages in SPA cxl: Workaround malformed pcie packets on some cards cxl: fix leak of ctx->mapping when releasing kernel API contexts cxl: fix leak of ctx->irq_bitmap when releasing context via kernel API cxl: fix leak of IRQ names in cxl_free_afu_irqs() powerpc/ps3: Remove unused os_area_db_id_video_mode powerpc/configs: Re-enable CONFIG_SCSI_DH
2015-10-15selftests/powerpc: Allow the tm-syscall test to build with old headersMichael Ellerman1-2/+12
When building against older kernel headers, currently the tm-syscall test fails to build because PPC_FEATURE2_HTM_NOSC is not defined. Tweak the test so that if PPC_FEATURE2_HTM_NOSC is not defined it still builds, but prints a warning at run time and marks the test as skipped. Signed-off-by: Michael Ellerman <[email protected]>