path: root/tools/perf
Age | Commit message | Author | Files | Lines
2017-02-08 | perf vendor events intel: Add uncore events for Haswell Server processor | Andi Kleen | 4 | -0/+512
This is not a full uncore event list, but a short list of useful and understandable metrics. Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf tools: Fix include of linux/mman.h | Arnaldo Carvalho de Melo | 1 | -1/+1
It was using uapi/linux/mman.h, which broke the build for at least one reporter, who hasn't specified in what environment the problem manifests itself. The original error is:

    In file included from util/event.c:2:0:
    ...tools/include/uapi/linux/mman.h:4:27: fatal error: uapi/asm/mman.h: No such file or directory
     #include <uapi/asm/mman.h>
                               ^
    compilation terminated.

Test built it on these containers:

    # dm
     1 alpine:3.4: Ok
     2 android-ndk:r12b-arm: Ok
     3 archlinux:latest: Ok
     4 centos:5: Ok
     5 centos:6: Ok
     6 centos:7: Ok
     7 debian:7: Ok
     8 debian:8: Ok
     9 debian:experimental: Ok
    10 debian:experimental-x-arm64: Ok
    11 debian:experimental-x-mips: Ok
    12 debian:experimental-x-mips64: Ok
    13 debian:experimental-x-mipsel: Ok
    14 fedora:20: Ok
    15 fedora:21: Ok
    16 fedora:22: Ok
    17 fedora:23: Ok
    18 fedora:24: Ok
    19 fedora:24-x-ARC-uClibc: Ok
    20 fedora:25: Ok
    21 fedora:rawhide: Ok
    22 mageia:5: Ok
    23 opensuse:13.2: Ok
    24 opensuse:42.1: Ok
    25 opensuse:tumbleweed: Ok
    26 ubuntu:12.04.5: Ok
    27 ubuntu:14.04.4-x-linaro-arm64: Ok
    28 ubuntu:15.10: Ok
    29 ubuntu:16.04: Ok
    30 ubuntu:16.04-x-arm: Ok
    31 ubuntu:16.04-x-arm64: Ok
    32 ubuntu:16.04-x-powerpc: Ok
    33 ubuntu:16.04-x-powerpc64: Ok
    34 ubuntu:16.04-x-powerpc64el: Ok
    35 ubuntu:16.04-x-s390: Ok
    36 ubuntu:16.10: Ok

Reported-by: David Carrillo-Cisneros <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michal Marek <[email protected]> Cc: Paul Turner <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Uwe Kleine-König <[email protected]> Cc: Wang Nan <[email protected]> Fixes: fbef103fad50 ("perf tools: Do hugetlb handling in more systems") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf tools: Use zfree() to avoid keeping dangling pointers | Taeung Song | 1 | -6/+6
The cases changed in this patch are for when we free but keep the pointer to the freed area, which is not always a good idea. Be more defensive and zero the pointer, so that possible use-after-free bugs don't take more time to be detected. Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ rewrote commit log ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf tools: Use zfree() instead of ad hoc equivalent | Taeung Song | 1 | -4/+2
We have zfree(&ptr) for this very common pattern: free(ptr); ptr = NULL; So use it in a few more places. Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ rewrote commit log ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
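The free-then-NULL pattern described above can be sketched as a small macro. This is a minimal illustration only: ZFREE_SKETCH, struct session_sketch and session_sketch__exit() are hypothetical names made up for this example, not perf's actual definitions.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch of a zfree()-style helper: free the object and clear the
 * caller's pointer in one step. ZFREE_SKETCH is a hypothetical name.
 */
#define ZFREE_SKETCH(pp)        \
	do {                    \
		free(*(pp));    \
		*(pp) = NULL;   \
	} while (0)

/* Hypothetical owner of a heap string. */
struct session_sketch {
	char *name; /* owned, may be NULL */
};

/* Release the members; a stale access afterwards sees NULL, not freed memory. */
static void session_sketch__exit(struct session_sketch *s)
{
	assert(s);
	ZFREE_SKETCH(&s->name);
}
```

Because free(NULL) is a no-op, calling the cleanup twice is harmless, which is part of what makes the pattern defensive.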
2017-02-08 | perf tools: Add missing check for failure in a zalloc() call | Taeung Song | 1 | -0/+2
Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf tools: Only increase index if perf_evsel__new_idx() succeeds | Taeung Song | 1 | -1/+2
Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
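The idea in the subject line, bumping the index only after the constructor succeeds, might look roughly like the sketch below. All names here (struct evsel_sketch, evsel_sketch__new_idx(), add_event_sketch(), sketch_fail_alloc) are hypothetical stand-ins, and the failure switch exists only so the behaviour can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical event object; only the index matters for this sketch. */
struct evsel_sketch {
	int idx;
};

static int sketch_fail_alloc; /* set to 1 to simulate allocation failure */

/* Hypothetical constructor; returns NULL when allocation fails. */
static struct evsel_sketch *evsel_sketch__new_idx(int idx)
{
	struct evsel_sketch *evsel;

	if (sketch_fail_alloc)
		return NULL;
	evsel = calloc(1, sizeof(*evsel));
	if (evsel)
		evsel->idx = idx;
	return evsel;
}

/* Create an event and advance *idx only if the event really exists. */
static struct evsel_sketch *add_event_sketch(int *idx)
{
	struct evsel_sketch *evsel;

	assert(idx);
	evsel = evsel_sketch__new_idx(*idx);
	if (!evsel)
		return NULL; /* index untouched, so no gap appears in the numbering */
	(*idx)++;
	return evsel;
}
```

Incrementing before checking the result would leave a hole in the index sequence whenever the constructor fails.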
2017-02-08 | perf probe: Add option --symfs | Uwe Kleine-König | 1 | -0/+2
perf probe makes use of debug symbols, so add --symfs as the other commands have. Signed-off-by: Uwe Kleine-König <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf symbols: Take into account symfs setting when reading file build ID | Victor Kamensky | 1 | -2/+4
After commit 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID"), when the --symfs option is used, perf fails to pick up symbols for a file that has the same name on the host and in the sysroot specified by --symfs. One can see a message like this:

    bin/bash with build id 26f0062cb6950d4d1ab0fd9c43eae8b10ca42062 not found, continuing without symbols

This happens because the code added by 5baecbcd9c9a opens files directly by dso->long_name without considering symbol_conf.symfs, and as a result picks the one from the host. It reads that file's build ID, and even if the code later finds the proper file in the directory pointed to by --symfs, perf ignores it because the build IDs mismatch.

The fix is to use __symbol__join_symfs to adjust the file name according to the --symfs setting. If no --symfs is passed, the operation is a no-op and picks the same host file as before.

Also note that in later trees, after commit 5baecbcd9c9a, an additional check for '!dso->has_build_id' was added. So to observe the error condition, 'perf record' should be run with --no-buildid, so that perf.data itself does not have a build ID for the target binary in its buildid section and 'perf report' passes the '!dso->has_build_id' condition. Alternatively, the target binary should not have a build ID while the same binary on the host does; again the '!dso->has_build_id' condition will pass and an incorrect build ID could be read if --symfs is used.
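The path adjustment described above can be illustrated with a small sketch. join_symfs_sketch() is a hypothetical stand-in written for this example; it is not the real __symbol__join_symfs, whose exact definition is not shown in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the path to open for a DSO: "<symfs>/<path>" when a symfs root
 * is set, or just "<path>" when symfs is empty (the no-op case).
 * Returns the snprintf() result, so a value >= size means truncation.
 */
static int join_symfs_sketch(char *buf, size_t size,
			     const char *symfs, const char *path)
{
	assert(buf && symfs && path);
	return snprintf(buf, size, "%s%s%s",
			symfs, symfs[0] ? "/" : "", path);
}
```

With an absolute path the result contains a doubled slash, for example "/tmp/sysroot//bin/bash", which POSIX path resolution treats the same as a single one. The point of the sketch is that the build ID is then read from the file under the symfs root rather than from the host file of the same name.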
Signed-off-by: Victor Kamensky <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Chris Phlipot <[email protected]> Cc: Dima Kogan <[email protected]> Cc: He Kuang <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Fixes: 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf list: Add debug support for outputing alias string | Andi Kleen | 4 | -0/+15
For debugging and testing it is useful to see the converted alias string. Add support to perf stat/record and perf list to print the alias conversion. The text string is saved in the alias structure. For perf stat/record it is folded into the normal -v. For perf list -v was taken, so we use --debug.

Before:

    % perf list
    ...
    cache:
      l1d.replacement
           [L1D data line replacements]
      l1d_pend_miss.fb_full
           [Cycles a demand request was blocked due to Fill Buffers inavailability]

After:

    % perf list --debug
    ...
    cache:
      l1d.replacement
           [L1D data line replacements]
     cpu/umask=0x1,period=2000003,event=0x51/
      l1d_pend_miss.fb_full
           [Cycles a demand request was blocked due to Fill Buffers inavailability]
     cpu/umask=0x2,period=2000003,cmask=1,event=0x48/

Signed-off-by: Andi Kleen <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf pmu: Support event aliases for non cpu// pmus | Andi Kleen | 2 | -27/+51
The code for handling PMU aliases without specifying the PMU was hardcoded to support only the cpu PMU. This patch extends it to work for all PMUs. We always duplicate the event for all PMUs that have a matching alias. This allows automatically expanding an alias for all instances of a PMU (so for example you can monitor all cache boxes with a single event). Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf pmu: Support per pmu json aliases | Andi Kleen | 1 | -4/+9
Add support for registering json aliases per PMU. Any alias with a unit matching the prefix is registered to the PMU. Uncore has multiple instances of most units, so all these aliases get registered for each individual PMU (this is important later to run the event on every instance of the PMU). To avoid printing the events multiple times in perf list, filter out duplicated events during printing.
v2: Rely on uncore_ prefix already in unit
v3: Document why calls were reordered
Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf jevents: Add support for parsing uncore json files | Andi Kleen | 4 | -13/+98
Handle the "Unit" field, which is needed to find the right PMU for an event. We call it "pmu" and convert it to the perf pmu name with an uncore prefix.
Handle the "ExtSel" field, which just extends the event mask with an additional bit.
Handle the "Filter" field, which adds parameters to the main event to configure filtering.
Handle the "Unit" field, which declares the unit the values should be scaled to (similar to what the kernel exports).
Set up the "perpkg" field for uncore events so that perf knows they are per package (similar to what the kernel exports).
Then output the fields into the pmu-events data structures which are compiled into perf.
Filter out zero fields, except for the event itself.
v2: Fix compilation. Add uncore_ prefix at pre-processing time. Move eventcode change to separate patch.
v3: Remove extra __maybe_unused
v4: don't duplicate aliases for cpu pmu events
Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf jevents: Parse eventcode as number | Andi Kleen | 1 | -1/+9
The next patch needs to modify event code. Previously eventcode was just passed through as a string. Now parse it as a number. v2: Don't special case 0 Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
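Parsing the event code into a number instead of passing the string through could be sketched as follows. parse_eventcode_sketch() is a hypothetical helper written for this example; the actual jevents code is not reproduced in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an event code such as "0x51" or "81" into a number.
 * Base 0 lets strtoul() accept both hex ("0x" prefix) and decimal.
 * Returns 0 on success, -1 if the string is empty, has trailing
 * characters, or overflows.
 */
static int parse_eventcode_sketch(const char *str, unsigned long *code)
{
	char *end;

	assert(str && code);
	errno = 0;
	*code = strtoul(str, &end, 0);
	if (end == str || *end != '\0' || errno == ERANGE)
		return -1;
	return 0;
}
```

Once the code is numeric, a later step can modify it (for instance by OR-ing extra bits into it), which is awkward to do on a string.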
2017-02-08 | perf bpf: Add missing newline in debug messages | He Kuang | 1 | -2/+2
These two debug messages are missing the trailing newline. Signed-off-by: He Kuang <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bintian Wang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-08 | perf tools arm64: Add support for generating bpf prologue | He Kuang | 2 | -1/+15
Since HAVE_KPROBES can be enabled in arm64, this patch introduces regs_query_register_offset() to convert register name to offset for arm64, so the BPF prologue feature is ready to use. Signed-off-by: He Kuang <[email protected]> Reviewed-by: Will Deacon <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bintian Wang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-06 | Merge remote-tracking branch 'tip/perf/urgent' into perf/core | Arnaldo Carvalho de Melo | 3 | -1/+18
To pick fixes that are affecting tests of new 'perf diff' features in perf/core. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-02-02 | cgroup, perf_event: make perf_event controller work on cgroup2 hierarchy | Tejun Heo | 1 | -7/+19
perf_event is a utility controller whose primary role is identifying cgroup membership to filter perf events; however, because it also tracks some per-css state, it can't be replaced by pure cgroup membership test. Mark the controller as implicitly enabled on the default hierarchy so that perf events can always be filtered based on cgroup v2 path as long as the controller is not mounted on a legacy hierarchy. "perf record" is updated accordingly so that it searches for both v1 and v2 hierarchies. A v1 hierarchy is used if perf_event is mounted on it; otherwise, it uses the v2 hierarchy. v2: Doc updated to reflect more flexible rebinding behavior. Signed-off-by: Tejun Heo <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]>
2017-02-02 | perf callchain: Reference count maps | Krister Johansen | 3 | -2/+22
If dso__load_kcore frees all of the existing maps, but one has already been attached to a callchain cursor node, then we can get a SIGSEGV in any function that happens to try to use this invalid cursor. Use the existing map refcount mechanism to forestall cleanup of a map until the cursor iterates past the node. Signed-off-by: Krister Johansen <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Fixes: 84c2cafa2889 ("perf tools: Reference count struct map") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
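The refcount idea above, holding a reference for as long as a cursor node points at the map, can be sketched like this. struct map_sketch and its get/put helpers are hypothetical names for illustration; the sketch uses a plain int counter and is not thread-safe, unlike what a real implementation would likely need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical refcounted object standing in for a map. */
struct map_sketch {
	int refcnt;
};

static int maps_freed; /* counts destructions, so the sketch can be checked */

static struct map_sketch *map_sketch__new(void)
{
	struct map_sketch *map = calloc(1, sizeof(*map));

	if (map)
		map->refcnt = 1; /* the creator holds the first reference */
	return map;
}

/* Take an extra reference, e.g. when a cursor node starts pointing at the map. */
static struct map_sketch *map_sketch__get(struct map_sketch *map)
{
	if (map)
		map->refcnt++;
	return map;
}

/* Drop a reference; the map is freed only when the last holder lets go. */
static void map_sketch__put(struct map_sketch *map)
{
	if (!map)
		return;
	assert(map->refcnt > 0);
	if (--map->refcnt == 0) {
		free(map);
		maps_freed++;
	}
}
```

In this model, the code that tears down all maps only drops its own reference, so a map still referenced by a cursor node stays valid until the cursor moves past that node and puts it.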
2017-02-02 | perf diff: Fix -o/--order option behavior (again) | Namhyung Kim | 3 | -1/+14
Commit 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") changed list_add() to perf_hpp__register_sort_field(). This resulted in a behavior change since the field was added to the tail instead of the head. So the -o option is mostly ignored due to its order in the list. This patch fixes it by adding perf_hpp__prepend_sort_field(). Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Fixes: 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
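Why head versus tail insertion matters for the sort order can be shown with a tiny list sketch. The types and functions below (struct field_sketch, field_list__append(), field_list__prepend()) are hypothetical and only model the ordering effect; they are not the perf_hpp__* interfaces themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sort field in a singly linked list. */
struct field_sketch {
	const char *name;
	struct field_sketch *next;
};

struct field_list_sketch {
	struct field_sketch *head;
	struct field_sketch *tail;
};

/* Append: the new field sorts after everything already registered. */
static void field_list__append(struct field_list_sketch *l, struct field_sketch *f)
{
	assert(l && f);
	f->next = NULL;
	if (l->tail)
		l->tail->next = f;
	else
		l->head = f;
	l->tail = f;
}

/* Prepend: the new field becomes the first (primary) sort key. */
static void field_list__prepend(struct field_list_sketch *l, struct field_sketch *f)
{
	assert(l && f);
	f->next = l->head;
	l->head = f;
	if (!l->tail)
		l->tail = f;
}
```

A comparison that walks the list from the head and stops at the first non-equal field effectively ignores a key placed at the tail, which matches the "mostly ignored" behaviour described above.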
2017-02-02 | perf diff: Fix segfault on 'perf diff -o N' option | Namhyung Kim | 1 | -0/+4
The -o/--order option selects the column number used to sort a diff result. It does the job by adding a hpp field at the beginning of the sort list. But it should not be added to the output field list, as it has no callbacks required by an output field. During setup_sorting(), perf_hpp__setup_output_field() appends the given sort keys to the output field list if they are not there already. Originally this was checked by fmt->list being non-empty. But commit 3f931f2c4274 ("perf hists: Make hpp setup function generic") changed it to check the ->equal callback. Anyways, we don't need to add the pseudo hpp field to the output field list since it won't be used for output. So just skip fields if they have no ->color or ->entry callbacks. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Fixes: 3f931f2c4274 ("perf hists: Make hpp setup function generic") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
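The skip rule at the end of this message can be sketched as a simple filter. This assumes the rule means "skip a field when both callbacks are missing"; struct fmt_sketch and the helper names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format entry with optional output callbacks. */
struct fmt_sketch {
	int (*color)(char *buf, size_t size);
	int (*entry)(char *buf, size_t size);
};

/* Example callback so that a printable field can be constructed. */
static int sketch_entry(char *buf, size_t size)
{
	if (size)
		buf[0] = '\0';
	return 0;
}

/* A field can be printed only if it has at least one output callback. */
static int fmt_sketch__is_output(const struct fmt_sketch *fmt)
{
	assert(fmt);
	return fmt->color != NULL || fmt->entry != NULL;
}

/* Count the sort fields that would also be added to the output list. */
static size_t count_output_fields(const struct fmt_sketch *fmts, size_t nr)
{
	size_t i, n = 0;

	for (i = 0; i < nr; i++) {
		if (!fmt_sketch__is_output(&fmts[i]))
			continue; /* sort-only pseudo field: never printed */
		n++;
	}
	return n;
}
```

Keeping the callback-less pseudo field out of the output list means the printing code never calls through a NULL function pointer, which is the kind of crash the subject line refers to.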
2017-01-31 | perf ftrace: Add ftrace.tracer config option | Taeung Song | 1 | -0/+25
Currently 'perf ftrace' command allows selecting 'function_graph' or 'function', defaulting to 'function_graph'. Add ftrace.tracer config option to select the default tracer:

    # cat ~/.perfconfig
    [ftrace]
        tracer = function

    # perf ftrace usleep 123456 | head -10
    <...>-14450 [002] d... 10089.284231: finish_task_switch <-__schedule
    <...>-14450 [002] .... 10089.284232: finish_wait <-pipe_wait
    <...>-14450 [002] .... 10089.284232: mutex_lock <-pipe_wait
    <...>-14450 [002] .... 10089.284232: _cond_resched <-mutex_lock

Committer notes:

Retesting it with invalid variables, invalid values for ftrace.tracer, and a valid one:

    # cat ~/.perfconfig
    [ftrace]
    trace = function
    # perf ftrace usleep 1
    Error: wrong config key-value pair ftrace.trace=function
    # cat ~/.perfconfig
    [ftrace]
    tracer = functin
    # perf ftrace usleep 1
    Please select "function_graph" (default) or "function"
    Error: wrong config key-value pair ftrace.tracer=functin
    # cat ~/.perfconfig
    [ftrace]
    tracer = function
    # perf ftrace usleep 1 | head -5
    <idle>-0 [000] d... 3855.820847: switch_mm_irqs_off <-__schedule
    <...>-18550 [000] d... 3855.820849: finish_task_switch <-__schedule
    <...>-18550 [000] d... 3855.820851: smp_irq_work_interrupt <-irq_work_interrupt
    <...>-18550 [000] d... 3855.820851: irq_enter <-smp_irq_work_interrupt
    <...>-18550 [000] d... 3855.820851: rcu_irq_enter <-irq_enter
    #

Signed-off-by: Taeung Song <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Added missing space in error message, changed the logic to make it more compact and less error prone ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
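The accept/reject behaviour shown in the committer notes could be modelled by a callback along these lines. ftrace_config_sketch() is a hypothetical function covering only keys in the [ftrace] section; it is not the actual perf config callback, and it prints no messages.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical config callback for the [ftrace] section: accept only
 * the "ftrace.tracer" key with one of the two tracer names mentioned
 * above. Returns 0 and stores the value on success, -1 for an unknown
 * key or value (leaving *tracer untouched).
 */
static int ftrace_config_sketch(const char *var, const char *value,
				const char **tracer)
{
	assert(var && value && tracer);
	if (strcmp(var, "ftrace.tracer") != 0)
		return -1;
	if (strcmp(value, "function_graph") != 0 &&
	    strcmp(value, "function") != 0)
		return -1;
	*tracer = value;
	return 0;
}
```

Both a misspelled key ("ftrace.trace") and a misspelled value ("functin") end up on the error path, mirroring the two failing runs in the notes.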
2017-01-31 | perf tools: Create for_each_event macro for tracepoints iteration | Taeung Song | 1 | -20/+18
Similar to for_each_subsystem and for_each_event in util/parse-events.c, add new macro 'for_each_event' for easy iteration over the tracepoints in order to be more compact and readable. Signed-off-by: Taeung Song <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Slight change to keep existing style for checking strcmp() return ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-31 | perf test: Add libbpf pinning test | Joe Stringer | 1 | -1/+41
Add a test for the newly added BPF object pinning functionality. For example:

    # tools/perf/perf test 37
    37: BPF filter :
    37.1: Basic BPF filtering : Ok
    37.2: BPF pinning : Ok
    37.3: BPF prologue generation : Ok
    37.4: BPF relocation checker : Ok

    # tools/perf/perf test 37 -v 2>&1 | grep pinned
    libbpf: pinned map '/sys/fs/bpf/perf_test/flip_table'
    libbpf: pinned program '/sys/fs/bpf/perf_test/func=SyS_epoll_wait/0'

Signed-off-by: Joe Stringer <[email protected]> Requested-and-Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-31 | tools perf util: Make rm_rf(path) argument const | Joe Stringer | 2 | -2/+2
rm_rf() doesn't modify its path argument, and a future caller will pass a string constant into it to delete. Signed-off-by: Joe Stringer <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-31 | perf callchain: Reference count maps | Krister Johansen | 3 | -2/+22
If dso__load_kcore frees all of the existing maps, but one has already been attached to a callchain cursor node, then we can get a SIGSEGV in any function that happens to try to use this invalid cursor. Use the existing map refcount mechanism to forestall cleanup of a map until the cursor iterates past the node. Signed-off-by: Krister Johansen <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Fixes: 84c2cafa2889 ("perf tools: Reference count struct map") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-28 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller | 3 | -41/+72
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <[email protected]>
2017-01-27 | perf tools: Propagate perf_config() errors | Arnaldo Carvalho de Melo | 12 | -23/+62
Previously these were being ignored, sometimes silently. Stop doing that, emitting debug messages and handling the errors.

Testing it:

    $ cat ~/.perfconfig
    cat: /home/acme/.perfconfig: No such file or directory
    $ perf stat -e cycles usleep 1
     Performance counter stats for 'usleep 1':
               938,996 cycles:u
           0.003813731 seconds time elapsed
    $ perf top --stdio
    Error:
    You may not have permission to collect system-wide stats.
    Consider tweaking /proc/sys/kernel/perf_event_paranoid,
    <SNIP>
    [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ]
    [acme@jouet linux]$ perf report --stdio
    # To display the perf.data header info, please use --header/--header-only options.
    # Overhead  Command  Shared Object      Symbol
    # ........  .......  .................  .........................
        71.77%  usleep   libc-2.24.so       [.] _dl_addr
        27.07%  usleep   ld-2.24.so         [.] _dl_next_ld_env_entry
         1.13%  usleep   [kernel.kallsyms]  [k] page_fault
    $
    $ touch ~/.perfconfig
    $ ls -la ~/.perfconfig
    -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig
    $
    $ perf stat -e instructions usleep 1
     Performance counter stats for 'usleep 1':
               244,610 instructions:u
           0.000805383 seconds time elapsed
    $
    [root@jouet ~]# chown acme.acme ~/.perfconfig
    [root@jouet ~]# perf stat -e cycles usleep 1
    Warning: File /root/.perfconfig not owned by current user or root, ignoring it.
     Performance counter stats for 'usleep 1':
               937,615 cycles
           0.000836931 seconds time elapsed
    #

Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-27 | perf config: Do not consider an error not to have any perfconfig file | Arnaldo Carvalho de Melo | 1 | -6/+8
While propagating the errors from perf_config(), which were being completely ignored, everything stopped working for people without a ~/.perfconfig file, because perf_config_set__init() was treating the absence of a .perfconfig file as an error, duh. Fix it by checking the errno after the failed stat() call. It should also not return an error when it says it is ignoring the file, and an empty file should not return an error either. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Taeung Song <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 8beeb00f2c84 ("perf config: Use new perf_config_set__init() to initialize config set") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
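The errno check after a failed stat() can be sketched as below. config_file_check_sketch() and its return convention are assumptions made for this illustration; the ownership check mentioned in the message is left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <sys/stat.h>

/*
 * Decide whether a config file should be parsed.
 * Returns 1 to parse it, 0 to skip it quietly (missing or empty file),
 * and -1 only for a real stat() failure.
 */
static int config_file_check_sketch(const char *path)
{
	struct stat st;

	assert(path);
	if (stat(path, &st) < 0)
		return errno == ENOENT ? 0 : -1; /* no file is not an error */
	if (st.st_size == 0)
		return 0; /* empty file: nothing to parse, still not an error */
	return 1;
}
```

The distinction matters once errors are propagated: a user who simply has no config file must take the "skip" path rather than the "fail" path.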
2017-01-26 | perf ftrace: Remove needless code setting default tracer | Taeung Song | 1 | -4/+1
As a result of commit a3497642c261 ("perf ftrace: Make 'function_graph' be the default tracer") the ftrace.tracer variable can't be NULL but the other code setting default tracer remained. Signed-off-by: Taeung Song <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | Merge tag 'perf-core-for-mingo-4.11-20170126' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core | Ingo Molnar | 20 | -67/+546
Pull the latest perf/core updates from Arnaldo Carvalho de Melo:

New features:

- Introduce 'perf ftrace' a perf front end to the kernel's ftrace function and function_graph tracer, defaulting to the "function_graph" tracer, more work will be done in reviving this effort, forward porting it from its initial patch submission (Namhyung Kim)
- Add 'e' and 'c' hotkeys to expand/collapse call chains for a single hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa)

Fixes:

- Fix wrong register name for arm64, used in 'perf probe' (He Kuang)
- Fix map offsets in relocation in libbpf (Joe Stringer)
- Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic)

Infrastructure changes:

- libbpf prog functions sync with what is exported via uapi (Joe Stringer)

Trivial changes:

- Remove unnecessary checks and assignments in 'perf probe's try_to_find_absolute_address() (Markus Elfring)

Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2017-01-26 | perf ftrace: Make 'function_graph' be the default tracer | Arnaldo Carvalho de Melo | 1 | -1/+2
So that we can suppress the '-t function_graph' and get a more compact command line:

    # perf ftrace usleep 123456 | grep raw_spin_lock | sort -k2 -nr | head -5
     2) 0.555 us | _raw_spin_lock();
     2) 0.516 us | _raw_spin_lock();
     2) 0.410 us | _raw_spin_lock_irq();
     2) 0.374 us | _raw_spin_lock_irqsave();
    #

Tested-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jeremy Eder <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | perf ftrace: Introduce new 'ftrace' tool | Namhyung Kim | 6 | -0/+282
The 'perf ftrace' command is a simple wrapper of kernel's ftrace functionality. It only supports single thread tracing currently and just reads trace_pipe in text and then writes it to stdout.

Committer notes:

Testing it:

    # perf ftrace -f function_graph usleep 123456
    <SNIP>
     2) | SyS_nanosleep() {
     2) | _copy_from_user() {
    <SNIP>
     2) 0.900 us | }
     2) 1.354 us | }
     2) | hrtimer_nanosleep() {
     2) 0.062 us | __hrtimer_init();
     2) | do_nanosleep() {
     2) | hrtimer_start_range_ns() {
    <SNIP>
     2) 5.025 us | }
     2) | schedule() {
     2) 0.125 us | rcu_note_context_switch();
     2) 0.057 us | _raw_spin_lock();
     2) | deactivate_task() {
     2) 0.369 us | update_rq_clock.part.77();
     2) | dequeue_task_fair() {
    <SNIP>
     2) + 22.453 us | }
     2) + 23.736 us | }
     2) | pick_next_task_fair() {
    <SNIP>
     2) + 47.167 us | }
     2) | pick_next_task_idle() {
    <SNIP>
     2) 4.462 us | }
     ------------------------------------------
     2) usleep-20387 => <idle>-0
     ------------------------------------------
     2) 0.806 us | switch_mm_irqs_off();
     ------------------------------------------
     2) <idle>-0 => usleep-20387
     ------------------------------------------
     2) 0.151 us | finish_task_switch();
     2) @ 123597.2 us | }
     2) 0.037 us | _cond_resched();
     2) | hrtimer_try_to_cancel() {
     2) 0.064 us | hrtimer_active();
     2) 0.353 us | }
     2) @ 123605.3 us | }
     2) @ 123606.2 us | }
     2) @ 123608.3 us | } /* SyS_nanosleep */
     2) | __do_page_fault() {
    <SNIP>

Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Masami Hiramatsu <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jeremy Eder <[email protected]> Cc: Jiri Olsa <[email protected]>, Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/n/[email protected] [ Various forward port fixes, add man page ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | perf util: Add more debug message on failure path | Namhyung Kim | 2 | -18/+39
It's helpful for debugging on tracing features. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Masami Hiramatsu <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jeremy Eder <[email protected]> Cc: Jiri Olsa <[email protected]>, Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | perf util: Save pid-cmdline mapping into tracing header | Namhyung Kim | 4 | -3/+84
Current trace info data lacks the saved cmdline mapping which is needed for pevent to find out the comm of a task. Add this and bump up the version number so that perf can determine its presence when reading. This is mostly corresponding to trace.dat file version 6, but still lacks 4 byte of number of cpus, and 10 bytes of type string - and I think we don't need those anyway. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Masami Hiramatsu <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jeremy Eder <[email protected]> Cc: Jiri Olsa <[email protected]>, Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> [ Change version test from == to >= ] Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | perf scripting perl: Do not die() when not founding event for a type | Arnaldo Carvalho de Melo | 1 | -2/+4
Do just like handling other cases i.e. print some debug message and ignore the sample. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26 | tools lib bpf: Add libbpf_get_error() | Joe Stringer | 1 | -1/+1
This function will turn a libbpf pointer into a standard error code (or 0 if the pointer is valid). This also allows removal of the dependency on linux/err.h in the public header file, which causes problems in userspace programs built against libbpf. Signed-off-by: Joe Stringer <[email protected]> Acked-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
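Turning a pointer into an error code relies on the convention that failures are encoded as small negative values stored in the pointer itself. The sketch below illustrates that convention with hypothetical helpers (sketch_err_ptr(), sketch_get_error(), SKETCH_MAX_ERRNO); it is not the libbpf implementation, and it assumes a platform where pointers and intptr_t convert in the usual way.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed upper bound on errno values that can be encoded. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer (hypothetical helper). */
static void *sketch_err_ptr(long err)
{
	assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
	return (void *)(intptr_t)err;
}

/*
 * Return 0 for a valid pointer, or the negative error code for an
 * error pointer. Addresses in the top SKETCH_MAX_ERRNO bytes of the
 * address space are treated as encoded errors.
 */
static long sketch_get_error(const void *ptr)
{
	uintptr_t v = (uintptr_t)ptr;

	if (v >= (uintptr_t)-SKETCH_MAX_ERRNO)
		return (long)(intptr_t)v;
	return 0;
}
```

A caller then only needs a plain integer check on the result, without including any kernel-style error-pointer header in its own code.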
2017-01-26 | perf probe: Delete an unnecessary assignment in try_to_find_absolute_address() | Markus Elfring | 1 | -3/+2
Remove an error code assignment which is redundant in an if branch for the handling of a memory allocation failure because the same value was set for the local variable "err" before. Signed-off-by: Markus Elfring <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26perf probe: Delete an unnecessary check in try_to_find_absolute_address()Markus Elfring1-4/+2
Remove a condition check which is unnecessary at the end because this source code place should usually only be reached with a non-zero pointer. Signed-off-by: Markus Elfring <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-26perf probe: Fix wrong register name for arm64He Kuang1-6/+6
The register names of the arm64 architecture are x0-x31, not r0-r31; this patch fixes the typo.

Before this patch:

  # perf probe --definition 'sys_write count'
  p:probe/sys_write _text+1502872 count=%r2:s64

  # echo 'p:probe/sys_write _text+1502872 count=%r2:s64' > \
    /sys/kernel/debug/tracing/kprobe_events
  Parse error at argument[0]. (-22)

After this patch:

  # perf probe --definition 'sys_write count'
  p:probe/sys_write _text+1502872 count=%x2:s64

  # echo 'p:probe/sys_write _text+1502872 count=%x2:s64' > \
    /sys/kernel/debug/tracing/kprobe_events
  # echo 1 >/sys/kernel/debug/tracing/events/probe/enable
  # cat /sys/kernel/debug/tracing/trace
  ...
  sh-422 [000] d... 650.495930: sys_write: (SyS_write+0x0/0xc8) count=22
  sh-422 [000] d... 651.102389: sys_write: (SyS_write+0x0/0xc8) count=26
  sh-422 [000] d... 651.358653: sys_write: (SyS_write+0x0/0xc8) count=86

Signed-off-by: He Kuang <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bintian Wang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-25lib, traceevent: add PRINT_HEX_STR variantDaniel Borkmann2-0/+2
Add support for the __print_hex_str() macro that was added for tracing, so that user space tools such as perf can understand it as well. Signed-off-by: Daniel Borkmann <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-25Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar4-42/+72
Signed-off-by: Ingo Molnar <[email protected]>
2017-01-20perf c2c report: Coalesce by default only by pid,iaddrJiri Olsa2-2/+2
It seems to be the most used argument for the -c option so far, especially in the beginning, when you want to have the overall process report, so it makes sense to make it the default one. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Joe Mario <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-20perf c2c report: Display Total records column in offset viewJiri Olsa1-0/+1
Adding "Total records" column into cacheline pareto table, between cycles and cpu info.

  $ perf c2c report
  ...
  ---  ---------- cycles ----------    Total       cpu
       rmt hitm  lcl hitm      load  records       cnt
  ...  ........  ........  ........  .......  ........
              0       112        71       34         4
              0         0         0       18         1
              0         0         0        2         1
              0       132         0        3         3
  ...

It's useful to see how many recorded samples represent each offset. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Joe Mario <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-20perf hists browser: Add e/c hotkeys to expand/collapse callchain for current entryJiri Olsa1-0/+17
Currently we only allow expanding or collapsing all entries in the browser, with the 'E' or 'C' keys. Allow the user to expand or collapse only the current entry in the browser with the 'e' or 'c' key. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Don Zickus <[email protected]> Cc: Joe Mario <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-20perf hists browser: Put hist_entry folding logic into single functionJiri Olsa1-18/+25
It will be used in the following patch to expand or collapse only the current browser entry. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-18perf unwind: Fix looking up dwarf unwind stack infoMatija Glavinic Pecotic2-16/+86
Using perf with the dwarf call graph method fails to provide a backtrace for a stripped binary, even though .gnu_debuglink points to a *.dbg file with properly populated debug symbols. The problem is reproduced on ARM (v7, v8) with kernels 3.14.y, 4.4.y and 4.10-rc3. Perf is configured with libunwind and dwarf unwind support [1]. Test code (stress_bt.c) can be found at [2].

Running the following (explicitly disabling other unwinding methods):

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
    -fno-asynchronous-unwind-tables stress_bt.c
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

results in a properly generated call graph. Stripping the binary and running it results in a missing call graph, whereas the expected result is to have a call graph:

  $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \
    -fno-asynchronous-unwind-tables stress_bt.c
  $ objcopy --only-keep-debug stress_bt stress_bt.dbg
  $ objcopy --strip-debug stress_bt
  $ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt
  $ perf record -N --call-graph dwarf ./stress_bt
  $ perf report

The problem is that perf doesn't try to read the symbols pointed to by the gnu debuglink. This patch adds checking for, and reading of, the symbols from debuglink and symsrc. The order is to first check within the dso, then check whether symsrc is defined and try to read from it. Finally, the debuglink is checked. Default locations of debug files are discussed in [3] and [4]. Comments on the RFC are at [5].
[1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding [2] [1]#Backtrace_stress_application [3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html [4] https://sourceware.org/binutils/docs/binutils/objcopy.html [5] https://lkml.org/lkml/2016/8/22/473 Signed-off-by: Matija Glavinic Pecotic <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Sverdlin <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
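As an illustration of the default separate-debug-file locations documented in [3], the sketch below builds the candidate paths for a .gnu_debuglink file name: next to the binary, in a .debug subdirectory, and under a global debug directory. It is not the code added by this patch, and whether perf searches exactly these locations is not stated here; the function name sketch_debuglink_candidates() and the /usr/lib/debug default are assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PATH_LEN 256
/* Assumed global debug directory; distributions may configure another one. */
#define SKETCH_GLOBAL_DEBUG_DIR "/usr/lib/debug"

/* True if snprintf() wrote the whole string without truncation. */
static int sketch_fmt_ok(int n)
{
	return n >= 0 && n < SKETCH_PATH_LEN;
}

/*
 * Fill out[0..2] with candidate paths for a separate debug file.
 *   dso_dir: absolute directory that contains the stripped binary
 *   link:    file name stored in .gnu_debuglink (no directory components)
 * Returns 3 on success, -1 on a malformed link name or truncation.
 */
static int sketch_debuglink_candidates(const char *dso_dir, const char *link,
				       char out[3][SKETCH_PATH_LEN])
{
	assert(dso_dir && link && out);

	/* The debuglink section holds a bare file name, not a path. */
	if (link[0] == '\0' || strchr(link, '/'))
		return -1;

	/* 1. Same directory as the binary. */
	if (!sketch_fmt_ok(snprintf(out[0], SKETCH_PATH_LEN, "%s/%s",
				    dso_dir, link)))
		return -1;
	/* 2. A .debug subdirectory next to the binary. */
	if (!sketch_fmt_ok(snprintf(out[1], SKETCH_PATH_LEN, "%s/.debug/%s",
				    dso_dir, link)))
		return -1;
	/* 3. The binary's directory mirrored under the global debug directory. */
	if (!sketch_fmt_ok(snprintf(out[2], SKETCH_PATH_LEN, "%s%s/%s",
				    SKETCH_GLOBAL_DEBUG_DIR, dso_dir, link)))
		return -1;
	return 3;
}
```

A caller would try each candidate in order and open the first one that exists, after having already checked the dso itself and any symsrc as described above.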
2017-01-17Merge tag 'perf-urgent-for-mingo-4.10-20170117' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgentIngo Molnar3-41/+72
Pull 'perf probe' fixes from Arnaldo Carvalho de Melo <[email protected]>
- Show correct locations for 'perf probe' on modules (Masami Hiramatsu)
- Correctly handle 'perf probe's on GCC generated functions in modules (Masami Hiramatsu)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2017-01-17perf evlist: Fix typo in deliver_sample()Soramichi AKIYAMA1-1/+1
This patch fixes a typo: s/delievery/delivery/ Signed-off-by: Soramichi Akiyama <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-17perf tools: Move two variables used in libperf from perf.cSoramichi AKIYAMA5-5/+5
The use_browser and perf_version_string variables are both declared in perf.c, but they are also referenced by other functions of libperf.a. Therefore a user linking their own main() with libperf.a must declare those two variables in their files even if the files never use the browser or the version information. This patch fixes the issue by moving use_browser and perf_version_string out of perf.c into other files. Signed-off-by: Soramichi Akiyama <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-01-17perf sched timehist: Show total wait times for summaryNamhyung Kim1-3/+41
When --state option is given, the summary will show total run, sleep, iowait, preempt and delay time instead of statistics of runtime.

  $ perf sched timehist -s --state

  Wait-time summary

  comm             parent  sched-in  run-time     sleep   iowait  preempt    delay
                            (count)    (msec)    (msec)   (msec)   (msec)   (msec)
  --------------------------------------------------------------------------------
  systemd[1]            0         3     0.497     1.685    0.000    0.000    0.061
  ksoftirqd/0[3]        2        21     0.434   989.948    0.000    0.000    0.325
  rcu_preempt[7]        2        28     0.386   993.211    0.000    0.000    0.712
  migration/0[10]       2        12     0.126    50.174    0.000    0.000    0.044
  watchdog/0[11]        2         1     0.009     0.000    0.000    0.000    0.016
  migration/1[13]       2         2     0.029    11.755    0.000    0.000    0.007
  <SNIP>

Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>