aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2016-09-01perf uprobe: Skip prologue if program compiled without optimizationRavi Bangoria1-0/+165
The function prologue prepares stack and registers before executing function logic. When target program is compiled without optimization, function parameter information is only valid after the prologue. When we probe entrypc of the function, and try to record a function parameter, it contains a garbage value. For example: $ vim test.c #include <stdio.h> void foo(int i) { printf("i: %d\n", i); } int main() { foo(42); return 0; } $ gcc -g test.c -o test $ objdump -dl test | less foo(): /home/ravi/test.c:4 400536: 55 push %rbp 400537: 48 89 e5 mov %rsp,%rbp 40053a: 48 83 ec 10 sub -bashx10,%rsp 40053e: 89 7d fc mov %edi,-0x4(%rbp) /home/ravi/test.c:5 400541: 8b 45 fc mov -0x4(%rbp),%eax ... ... main(): /home/ravi/test.c:9 400558: 55 push %rbp 400559: 48 89 e5 mov %rsp,%rbp /home/ravi/test.c:10 40055c: bf 2a 00 00 00 mov -bashx2a,%edi 400561: e8 d0 ff ff ff callq 400536 <foo> $ perf probe -x ./test 'foo i' $ cat /sys/kernel/debug/tracing/uprobe_events p:probe_test/foo /home/ravi/test:0x0000000000000536 i=-12(%sp):s32 $ perf record -e probe_test:foo ./test $ perf script test 5778 [001] 4918.562027: probe_test:foo: (400536) i=0 Here variable 'i' is passed via stack which is pushed on stack at 0x40053e. But we are probing at 0x400536. To resolve this issues, we need to probe on next instruction after prologue. gdb and systemtap also does same thing. I've implemented this patch based on approach systemtap has used. After applying patch: $ perf probe -x ./test 'foo i' $ cat /sys/kernel/debug/tracing/uprobe_events p:probe_test/foo /home/ravi/test:0x0000000000000541 i=-4(%bp):s32 $ perf record -e probe_test:foo ./test $ perf script test 6300 [001] 5877.879327: probe_test:foo: (400541) i=42 No need to skip prologue for optimized case since debug info is correct for each instructions for -O2 -g. For more details please visit: https://bugzilla.redhat.com/show_bug.cgi?id=612253#c6 Changes in v2: - Skipping prologue only when any ARG is either C variable, $params or $vars. - Probe on line(:1) may not be always possible. Recommend only address to force probe on function entry. Committer notes: Testing it with 'perf trace': # perf probe -x ./test foo i Added new event: probe_test:foo (on foo in /home/acme/c/test with i) You can now use it in all perf tools, such as: perf record -e probe_test:foo -aR sleep 1 # cat /sys/kernel/debug/tracing/uprobe_events p:probe_test/foo /home/acme/c/test:0x0000000000000526 i=-12(%sp):s32 # trace --no-sys --event probe_*:* ./test i: 42 0.000 probe_test:foo:(400526) i=0) # After the patch: # perf probe -d *:* Removed event: probe_test:foo # perf probe -x ./test foo i Target program is compiled without optimization. Skipping prologue. Probe on address 0x400526 to force probing at the function entry. Added new event: probe_test:foo (on foo in /home/acme/c/test with i) You can now use it in all perf tools, such as: perf record -e probe_test:foo -aR sleep 1 # cat /sys/kernel/debug/tracing/uprobe_events p:probe_test/foo /home/acme/c/test:0x0000000000000531 i=-4(%bp):s32 # trace --no-sys --event probe_*:* ./test i: 42 0.000 probe_test:foo:(400531) i=42) # Reported-by: Michael Petlan <[email protected]> Report-Link: https://www.mail-archive.com/[email protected]/msg02348.html Signed-off-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Jiri Olsa <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Acked-by: Naveen N. Rao <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Cc: Yauheni Kaliuta <[email protected]> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1299021 Link: http://lkml.kernel.org/r/1470214725-5023-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com [ Rename 'die' to 'cu_die' to avoid shadowing a die() definition on at least centos 5, Debian 7 and ubuntu:12.04.5] [ Use PRIx64 instead of lx to format a Dwarf_Addr, aka long long unsigned int, fixing the build on 32-bit systems ] [ dwarf_getsrclines() expects a size_t * argument ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf probe: Add helper function to check if probe with variableRavi Bangoria2-7/+17
Introduce helper function instead of inline code and replace hardcoded strings "$vars" and "$params" with their corresponding macros. perf_probe_with_var() is not declared as static since it will be called from different file in subsequent patch. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/1470214725-5023-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf symbols: Fixup symbol sizes before picking best onesArnaldo Carvalho de Melo2-2/+2
When we call symbol__fixup_duplicate() we use algorithms to pick the "best" symbols for cases where there are various functions/aliases to an address, and those check zero size symbols, which, before calling symbol__fixup_end() are _all_ symbols in a just parsed kallsyms file. So first fixup the end, then fixup the duplicates. Found while trying to figure out why 'perf test vmlinux' failed, see the output of 'perf test -v vmlinux' to see cases where the symbols picked as best for vmlinux don't match the ones picked for kallsyms. Cc: Anton Blanchard <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 694bf407b061 ("perf symbols: Add some heuristics for choosing the best duplicate symbol") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf symbols: Check symbol_conf.allow_aliases for kallsyms loading tooArnaldo Carvalho de Melo2-2/+4
We can allow aliases to be kept, but we were checking this just when loading vmlinux files, be consistent, do it for any symbol table loading code that calls symbol__fixup_duplicate() by making this function check .allow_aliases instead. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 680d926a8cb0 ("perf symbols: Allow symbol alias when loading map for symbol name") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf probe: Support probing on offline cross-arch binaryMasami Hiramatsu5-11/+83
Support probing on offline cross-architecture binary by adding getting the target machine arch from ELF and choose correct register string for the machine. Here is an example: ----- $ perf probe --vmlinux=./vmlinux-arm --definition 'do_sys_open $params' p:probe/do_sys_open do_sys_open+0 dfd=%r5:s32 filename=%r1:u32 flags=%r6:s32 mode=%r3:u16 ----- Here, we can get probe/do_sys_open from above and append it to to the target machine's tracing/kprobe_events file in the tracefs mountput, usually /sys/kernel/debug/tracing/kprobe_events (or /sys/kernel/tracing/kprobe_events). Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/147214229717.23638.6440579792548044658.stgit@devbox [ Add definition for EM_AARCH64 to fix the build on at least centos 6, debian 7 & ubuntu 12.04.5 ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf probe: Ignore vmlinux buildid if offline kernel is givenMasami Hiramatsu2-1/+5
Ignore the buildid of running kernel when both of --definition and --vmlinux is given because that kernel should be off-line. This also skips post-processing of kprobe event for relocating symbol and checking blacklist, because it can not be done on off-line kernel. E.g. without this fix perf shows an error as below ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open ./vmlinux-arm with build id 7a1f76dd56e9c4da707cd3d6333f50748141434b not found, continuing without symbols Failed to find symbol do_sys_open in kernel Error: Failed to add events. ---- with this fix, we can get the definition ---- $ perf probe --vmlinux=./vmlinux-arm --definition do_sys_open p:probe/do_sys_open do_sys_open+0 ---- Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/147214228193.23638.12581984840822162131.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf probe: Show trace event definitionMasami Hiramatsu2-0/+47
Add --definition/-D option for showing the trace-event definition in stdout. This can be useful in debugging or combined with a shell script. e.g. ---- # perf probe --definition 'do_sys_open $params' p:probe/do_sys_open _text+2261728 dfd=%di:s32 filename=%si:u64 flags=%dx:s32 mode=%cx:u16 ---- Suggested-and-Tested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/147214226712.23638.2240534040014013658.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf symbols: Demangle symbols for synthesized @plt entries.Milian Wolff1-29/+52
The symbols in the synthesized @plt entries where not demangled before, i.e. we could end up with entries such as: $ perf report Samples: 7K of event 'cycles:ppp', Event count (approx.): 6223833141 Children Self Command Shared Object Symbol - 93.63% 28.89% lab_mandelbrot lab_mandelbrot [.] main - 73.81% main - 33.57% hypot 27.76% __hypot_finite 15.97% __muldc3 2.90% __muldc3@plt 2.40% _ZNK6QImage6heightEv@plt + 2.14% QColor::rgb 1.94% _ZNK6QImage5widthEv@plt 1.92% cabs@plt This patch remedies this issue by also applying demangling to the synthesized symbols. The output for the above is now: $ perf report Samples: 7K of event 'cycles:ppp', Event count (approx.): 6223833141 Children Self Command Shared Object Symbol - 93.63% 28.89% lab_mandelbrot lab_mandelbrot [.] main - 73.81% main - 33.57% hypot 27.76% __hypot_finite 15.97% __muldc3 2.90% __muldc3@plt 2.40% QImage::height() const@plt + 2.14% QColor::rgb 1.94% QImage::width() const@plt 1.92% cabs@plt Signed-off-by: Milian Wolff <[email protected]> LPU-Reference: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-01perf probe: Do not use map_load filters for functionArnaldo Carvalho de Melo1-19/+10
It is simpler to just do the loop, no need for globals and the last user of such facility disappears. Testing: # perf probe -F [a-z]*recvmsg aead_recvmsg compat_SyS_recvmsg compat_sys_recvmsg hash_recvmsg inet_recvmsg kernel_recvmsg netlink_recvmsg packet_recvmsg ping_recvmsg raw_recvmsg rawv6_recvmsg rng_recvmsg security_socket_recvmsg selinux_socket_recvmsg skcipher_recvmsg sock_common_recvmsg sock_no_recvmsg sock_recvmsg sys_recvmsg tcp_recvmsg udp_recvmsg udpv6_recvmsg unix_dgram_recvmsg unix_seqpacket_recvmsg unix_stream_recvmsg # Without filters: # perf probe -F | tail -5 zswap_pool_create zswap_pool_current zswap_update_total_size zswap_writeback_entry zswap_zpool_param_set # # perf probe -F | wc -l 33311 # Acked-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-30perf symbols: Rename ->ignore to ->idleArnaldo Carvalho de Melo2-3/+3
Since this is the only use thus far, and this mechanism is in place for a long time. To clarify why symbols should be skipped or treated differently, name it for the only use it has. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-30perf annotate: Initialize the priv are in symbol__new()Arnaldo Carvalho de Melo4-9/+27
We need to initializa some fields (right now just a mutex) when we allocate the per symbol annotation struct, so do it at the symbol constructor instead of (ab)using the filter mechanism for that. This way we remove one of the few cases we have for that symbol filter, which will eventually led to removing it. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-24perf tools: Fix error handling of lzma decompressionShawn Lin1-6/+9
lzma_decompress_to_file() never actually closes the file pointer, let's fix it. Signed-off-by: Shawn Lin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Make err = -1, the common case, set it to 0 before the error label ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-24perf probe: Remove unused tracing_dir variableMasami Hiramatsu1-3/+2
Remove unused tracing_dir variable from open_probe_events(). Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/147201827792.5713.4165387506020511920.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf bpf: Fix typo: "ehough" -> "enough"Colin Ian King1-1/+1
Trivial typo fix in pr_debug message Signed-off-by: Colin King <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: He Kuang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf probe: Use hexadecimal type by default if possibleMasami Hiramatsu1-1/+2
Use hexadecimal type by default if it is available on current running kernel. This keeps the default behavior of perf probe after changing the output format of 'u8/16/32/64' to unsigned decimal number. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Naohiro Aota <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/147151074685.12957.16415861010796255514.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf probe: Support hexadecimal castingMasami Hiramatsu1-7/+10
Support hexadecimal unsigned integer casting by 'x'. This allows user to explicitly specify the output format of the probe arguments as hexadecimal. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Naohiro Aota <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/147151072679.12957.4458656416765710753.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf probe: Add supported for type casting by the running kernelMasami Hiramatsu3-0/+68
Add a checking routine what types are supported by the running kernel by finding the pattern in <debugfs>/tracing/README. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Naohiro Aota <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/147151071172.12957.3340095690753291085.stgit@devbox [ 'enum probe_type' has no negative entries, so ends up as 'unsigned', remove '< 0' test to fix the build on at least centos:5, debian:7 & ubuntu:12.04.5 ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf tools: Use __weak definition from linux/compiler.hRui Teng1-2/+1
Replace __attribute__((weak)) with __weak definition Signed-off-by: Rui Teng <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf report: Allow configuring the default sort order in ~/.perfconfigArnaldo Carvalho de Melo2-2/+2
Allows changing the default sort order from "comm,dso,symbol" to some other default, for instance "sym,dso" may be more fitting for kernel developers. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf disassemble: Extract logic to find file to pass to objdump to a ↵Arnaldo Carvalho de Melo1-22/+32
separate function Disentangling this a bit further, more to come. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf disassemble: Simplify logic for picking the filename to disassembleArnaldo Carvalho de Melo1-25/+16
Lots of changes to support kcore, compressed modules, build-id files left us with some spaguetti code, simplify it a bit, more to come. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf disassemble: Move check for kallsyms + !kcoreArnaldo Carvalho de Melo1-8/+8
We don't need to do all that filename logic to then just have to test something unrelated and bail out, move it to the start of the function. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf hists: Add support for header spanJiri Olsa2-3/+5
Add span argument for header callback function. The handling of this argument is completely in the hands of the callback. The only thing the caller ensures is it's zeroed on the beginning. Omitting span skipping in hierarchy headers and gtk code. The c2c code use this to span header lines based on the entries span configuration. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf hists: Add line argument into perf_hpp_fmt's header callbackJiri Olsa2-3/+4
Adding line argument into perf_hpp_fmt's header callback to be able to request specific header line. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf hists: Introduce nr_header_lines into struct perf_hpp_listJiri Olsa1-0/+1
Currently we support just single line headers, this is first step to allow more. Store the number of header lines in perf_hpp_list, which encompasses all the display/sort entries and is thus suitable to hold this value. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23perf timechart: Use NSEC_PER_U?SECArnaldo Carvalho de Melo1-5/+6
Following kernel practices, using linux/time64.h Cc: Adrian Hunter <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stanislav Fomichev <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-23tools: Introduce tools/include/linux/time64.h for *SEC_PER_*SEC macrosArnaldo Carvalho de Melo5-14/+11
And remove it from tools/perf/{perf,util}.h, making code that needs these macros to include linux/time64.h instead, to match how this is used in the kernel sources. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-18perf evsel: Do not access outside hw cache name arraysArnaldo Carvalho de Melo1-3/+3
We have to check if the values are >= *_MAX, not just >, fix it. From the bugzilla report: ''In file /tools/perf/util/evsel.c function __perf_evsel__hw_cache_name it appears that there is a bug that reads beyond the end of the buffer. The statement "if (type > PERF_COUNT_HW_CACHE_MAX)" allows type to be equal to the maximum value. Later, when statement "if (!perf_evsel__is_cache_op_valid(type, op))" is executed, the function can access array perf_evsel__hw_cache_stat[type] beyond the end of the buffer. It appears to me that the statement "if (type > PERF_COUNT_HW_CACHE_MAX)" should be "if (type >= PERF_COUNT_HW_CACHE_MAX)" Bug found with Coverity and manual code review. No attempts were made to execute the code with a maximum type value.'' Committer note: Testing it: $ perf record -e $(echo $(perf list cache | cut -d \[ -f1) | sed 's/ /,/g') usleep 1 [ perf record: Woken up 16 times to write data ] [ perf record: Captured and wrote 0.023 MB perf.data (34 samples) ] $ perf evlist L1-dcache-load-misses L1-dcache-loads L1-dcache-stores L1-icache-load-misses LLC-load-misses LLC-loads LLC-store-misses LLC-stores branch-load-misses branch-loads dTLB-load-misses dTLB-loads dTLB-store-misses dTLB-stores iTLB-load-misses iTLB-loads node-load-misses node-loads node-store-misses node-stores $ perf list cache List of pre-defined events (to be used in -e): L1-dcache-load-misses [Hardware cache event] L1-dcache-loads [Hardware cache event] L1-dcache-stores [Hardware cache event] L1-icache-load-misses [Hardware cache event] LLC-load-misses [Hardware cache event] LLC-loads [Hardware cache event] LLC-store-misses [Hardware cache event] LLC-stores [Hardware cache event] branch-load-misses [Hardware cache event] branch-loads [Hardware cache event] dTLB-load-misses [Hardware cache event] dTLB-loads [Hardware cache event] dTLB-store-misses [Hardware cache event] dTLB-stores [Hardware cache event] iTLB-load-misses [Hardware cache event] iTLB-loads [Hardware cache event] node-load-misses [Hardware cache event] node-loads [Hardware cache event] node-store-misses [Hardware cache event] node-stores [Hardware cache event] $ Reported-by: Brian Sweeney <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=153351 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-16perf unwind: Use addr_location::addr instead of ip for entriesMilian Wolff2-2/+2
This fixes the srcline translation for call chains of user space applications. Before we got: perf report --stdio --no-children -s sym,srcline -g address 8.92% [.] main mandelbrot.h:41 | |--3.70%--main +8390240 | __libc_start_main +139950056726769 | _start +8388650 | |--2.74%--main +8390189 | --2.08%--main +8390296 __libc_start_main +139950056726769 _start +8388650 7.59% [.] main complex:1326 | |--4.79%--main +8390203 | __libc_start_main +139950056726769 | _start +8388650 | --2.80%--main +8390219 7.12% [.] __muldc3 libgcc2.c:1945 | |--3.76%--__muldc3 +139950060519490 | main +8390224 | __libc_start_main +139950056726769 | _start +8388650 | --3.32%--__muldc3 +139950060519512 main +8390224 With this patch applied, we instead get: perf report --stdio --no-children -s sym,srcline -g address 8.92% [.] main mandelbrot.h:41 | |--3.70%--main mandelbrot.h:41 | __libc_start_main +241 | _start +4194346 | |--2.74%--main mandelbrot.h:41 | --2.08%--main mandelbrot.h:41 __libc_start_main +241 _start +4194346 7.59% [.] main complex:1326 | |--4.79%--main complex:1326 | __libc_start_main +241 | _start +4194346 | --2.80%--main complex:1326 7.12% [.] __muldc3 libgcc2.c:1945 | |--3.76%--__muldc3 libgcc2.c:1945 | main mandelbrot.h:39 | __libc_start_main +241 | _start +4194346 | --3.32%--__muldc3 libgcc2.c:1945 main mandelbrot.h:39 Suggested-and-Acked-by: Namhyung Kim <[email protected]> Signed-off-by: Milian Wolff <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> LPU-Reference: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-15perf probe: Release resources on error when handling exit pathsArnaldo Carvalho de Melo1-3/+9
Cc: Adrian Hunter <[email protected]> Cc: Colin King <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-15perf probe: Check for dup and fdopen failuresColin Ian King1-4/+20
dup and fdopen can potentially fail, so add some extra error handling checks rather than assuming they always work. Signed-off-by: Colin King <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Free resources when those functions (now being verified) fail ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-15perf symbols: Fix annotation of objects with debuginfo filesAnton Blanchard1-1/+2
Commit 73cdf0c6ea9c ("perf symbols: Record text offset in dso to calculate objdump address") started storing the offset of the text section for all DSOs: if (elf_section_by_name(elf, &ehdr, &tshdr, ".text", NULL)) dso->text_offset = tshdr.sh_addr - tshdr.sh_offset; Unfortunately this breaks debuginfo files, because we need to calculate the offset of the text section in the associated executable file. As a result perf annotate returns junk for all debuginfo files. Fix this by using runtime_ss->elf which should point at the executable when parsing a debuginfo file. Signed-off-by: Anton Blanchard <[email protected]> Reviewed-by: Naveen N. Rao <[email protected]> Tested-by: Wang Nan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: [email protected] # v4.6+ Fixes: 73cdf0c6ea9c ("perf symbols: Record text offset in dso to calculate objdump address") Link: http://lkml.kernel.org/r/20160813115533.6de17912@kryten Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-15perf jitdump: Add the right header to get the major()/minor() definitionsArnaldo Carvalho de Melo1-0/+1
Noticed on Fedora Rawhide: $ gcc --version gcc (GCC) 6.1.1 20160721 (Red Hat 6.1.1-4) $ rpm -q glibc glibc-2.24.90-1.fc26.x86_64 $ CC /tmp/build/perf/util/jitdump.o util/jitdump.c: In function 'jit_repipe_code_load': util/jitdump.c:428:2: error: '__major_from_sys_types' is deprecated: In the GNU C Library, `major' is defined by <sys/sysmacros.h>. For historical compatibility, it is currently defined by <sys/types.h> as well, but we plan to remove this soon. To use `major', include <sys/sysmacros.h> directly. If you did not intend to use a system-defined macro `major', you should #undef it after including <sys/types.h>. [-Werror=deprecated-declarations] event->mmap2.maj = major(st.st_dev); ^~~~~ In file included from /usr/include/features.h:397:0, from /usr/include/sys/types.h:25, from util/jitdump.c:1: /usr/include/sys/sysmacros.h:87:1: note: declared here __SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL) Fix it following that recomendation. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-12perf intel-pt: Fix ip compressionAdrian Hunter2-28/+40
The June 2015 Intel SDM introduced IP Compression types 4 and 6. Refer to section 36.4.2.2 Target IP (TIP) Packet - IP Compression. Existing Intel PT packet decoder did not support type 4, and got type 6 wrong. Because type 3 and type 4 have the same number of bytes, the packet 'count' has been changed from being the number of ip bytes to being the type code. That allows the Intel PT decoder to correctly decide whether to sign-extend or use the last ip. However that also meant the code had to be adjusted in a number of places. Currently hardware is not using the new compression types, so this fix has no effect on existing hardware. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf probe ppc64le: Fix probe location when using DWARFRavi Bangoria2-17/+26
Powerpc has Global Entry Point and Local Entry Point for functions. LEP catches call from both the GEP and the LEP. Symbol table of ELF contains GEP and Offset from which we can calculate LEP, but debuginfo does not have LEP info. Currently, perf prioritize symbol table over dwarf to probe on LEP for ppc64le. But when user tries to probe with function parameter, we fall back to using dwarf(i.e. GEP) and when function called via LEP, probe will never hit. For example: $ objdump -d vmlinux ... do_sys_open(): c0000000002eb4a0: e8 00 4c 3c addis r2,r12,232 c0000000002eb4a4: 60 00 42 38 addi r2,r2,96 c0000000002eb4a8: a6 02 08 7c mflr r0 c0000000002eb4ac: d0 ff 41 fb std r26,-48(r1) $ sudo ./perf probe do_sys_open $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/do_sys_open _text+3060904 $ sudo ./perf probe 'do_sys_open filename:string' $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/do_sys_open _text+3060896 filename_string=+0(%gpr4):string For second case, perf probed on GEP. So when function will be called via LEP, probe won't hit. $ sudo ./perf record -a -e probe:do_sys_open ls [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.195 MB perf.data ] To resolve this issue, let's not prioritize symbol table, let perf decide what it wants to use. Perf is already converting GEP to LEP when it uses symbol table. When perf uses debuginfo, let it find LEP offset form symbol table. This way we fall back to probe on LEP for all cases. After patch: $ sudo ./perf probe 'do_sys_open filename:string' $ sudo cat /sys/kernel/debug/tracing/kprobe_events p:probe/do_sys_open _text+3060904 filename_string=+0(%gpr4):string $ sudo ./perf record -a -e probe:do_sys_open ls [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.197 MB perf.data (11 samples) ] Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/1470723805-5081-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf probe: Add function to post process kernel trace eventsRavi Bangoria1-11/+18
Instead of inline code, introduce function to post process kernel probe trace events. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Balbir Singh <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/1470723805-5081-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf probe: Support signedness castingNaohiro Aota1-3/+12
The 'perf probe' tool detects a variable's type and use the detected type to add a new probe. Then, kprobes prints its variable in hexadecimal format if the variable is unsigned and prints in decimal if it is signed. We sometimes want to see unsigned variable in decimal format (i.e. sector_t or size_t). In that case, we need to investigate the variable's size manually to specify just signedness. This patch add signedness casting support. By specifying "s" or "u" as a type, perf-probe will investigate variable size as usual and use the specified signedness. E.g. without this: $ perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe|head dbench-9692 [003] d..1 971.096633: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d00 dbench-9692 [003] d..1 971.096685: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x1a3d80 dbench-9692 [003] d..1 971.096687: submit_bio: (submit_bio+0x0/0x140) bi_sector=0x3a3d80 ... // need to investigate the variable size $ perf probe -a 'submit_bio bio->bi_iter.bi_sector:s64' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s64) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 With this: // just use "s" to cast its signedness $ perf probe -v -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 $ cat trace_pipe|head dbench-9689 [001] d..1 1212.391237: submit_bio: (submit_bio+0x0/0x140) bi_sector=128 dbench-9689 [001] d..1 1212.391252: submit_bio: (submit_bio+0x0/0x140) bi_sector=131072 dbench-9697 [006] d..1 1212.398611: submit_bio: (submit_bio+0x0/0x140) bi_sector=30208 This commit also update perf-probe.txt to describe "types". Most parts are based on existing documentation: Documentation/trace/kprobetrace.txt Committer note: Testing using 'perf trace': # perf probe -a 'submit_bio bio->bi_iter.bi_sector' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 # trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=0xc133c0) 3181.861 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffb8) 3181.881 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc0) 3184.488 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x6cffc8) <SNIP> 4717.927 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7a88) 4717.970 probe:submit_bio:(ffffffffac3aee00) bi_sector=0x4dc7880) ^C[root@jouet ~]# Now, using this new feature: [root@jouet ~]# perf probe -a 'submit_bio bio->bi_iter.bi_sector:s' Added new event: probe:submit_bio (on submit_bio with bi_sector=bio->bi_iter.bi_sector:s) You can now use it in all perf tools, such as: perf record -e probe:submit_bio -aR sleep 1 [root@jouet ~]# trace --no-syscalls --ev probe:submit_bio 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145704) 0.017 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145712) 0.019 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145720) 2.567 probe:submit_bio:(ffffffffac3aee00) bi_sector=7145728) 5631.919 probe:submit_bio:(ffffffffac3aee00) bi_sector=0) 5631.941 probe:submit_bio:(ffffffffac3aee00) bi_sector=8) 5631.945 probe:submit_bio:(ffffffffac3aee00) bi_sector=16) 5631.948 probe:submit_bio:(ffffffffac3aee00) bi_sector=24) ^C# With callchains: # trace --no-syscalls --ev probe:submit_bio/max-stack=10/ 0.000 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662544) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.023 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662552) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 0.027 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662560) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa8200691 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) 2.593 probe:submit_bio:(ffffffffac3aee00) bi_sector=50662568) submit_bio+0xa8200001 ([kernel.kallsyms]) submit_bh+0xa8200013 ([kernel.kallsyms]) journal_submit_commit_record+0xa82001ac ([kernel.kallsyms]) jbd2_journal_commit_transaction+0xa82012e8 ([kernel.kallsyms]) kjournald2+0xa82000ca ([kernel.kallsyms]) kthread+0xa82000d8 ([kernel.kallsyms]) ret_from_fork+0xa820001f ([kernel.kallsyms]) ^C# Signed-off-by: Naohiro Aota <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf probe: Fix module name matchingKonstantin Khlebnikov1-1/+3
If module is "module" then dso->short_name is "[module]". Substring comparing is't enough: "raid10" matches to "[raid1]". This patch also checks terminating zero in module name. Signed-off-by: Konstantin Khlebnikov <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Link: http://lkml.kernel.org/r/147039975648.715620.12985971832789032159.stgit@buzz Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf probe: Adjust map->reloc offset when finding kernel symbol from mapMasami Hiramatsu1-1/+1
Adjust map->reloc offset for the unmapped address when finding alternative symbol address from map, because KASLR can relocate the kernel symbol address. The same adjustment has been done when finding appropriate kernel symbol address from map which was introduced by commit f90acac75713 ("perf probe: Find given address from offline dwarf") Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-09perf hists: Trim libtraceevent trace_seq buffersArnaldo Carvalho de Melo1-1/+5
When we use libtraceevent to format trace event fields into printable strings to use in hist entries it is important to trim it from the default 4 KiB it starts with to what is really used, to reduce the memory footprint, so use realloc(seq.buffer, seq.len + 1) when returning the seq.buffer formatted with the fields contents. Reported-and-Tested-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-04Merge tag 'perf-core-for-mingo-20160803' of ↵Ingo Molnar8-65/+152
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Add --sample-cpu to 'perf record', to explicitely ask for sampling the CPU (Jiri Olsa) Fixes: - Fix processing of multi byte chunks in objdump output, fixing disassemble processing for annotation on at least ARM64 (Jan Stancek) - Use SyS_epoll_wait in a BPF 'perf test' entry instead of sys_epoll_wait, that is not present in the DWARF info in vmlinux files (Arnaldo Carvalho de Melo) - Add -wno-shadow when processing files using perl headers, fixing the build on Fedora Rawhide and Arch Linux (Namhyung Kim) Infrastructure changes: - Annotate prep work to better catch and report errors related to using objdump to disassemble DSOs (Arnaldo Carvalho de Melo) - Add 'alloc', 'scnprintf' and 'and' methods for bitmap processing (Jiri Olsa) - Add nested output resorting callback in hists processing (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-08-02perf record: Add --sample-cpu optionJiri Olsa1-1/+1
Adding --sample-cpu option to be able to explicitly enable CPU sample type. Currently it's only enable implicitly in case the target is cpu related. It will be useful for following c2c record tool. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-02perf hists: Introduce output_resort_cb methodJiri Olsa2-3/+16
When dealing with nested hist entries it's helpful to have a way to resort those nested objects. Adding optional callback call into output_resort function and following new interface function: typedef int (*hists__resort_cb_t)(struct hist_entry *he); void hists__output_resort_cb(struct hists *hists, struct ui_progress *prog, hists__resort_cb_t cb); Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-01perf annotate: Plug filename string leakArnaldo Carvalho de Melo1-3/+2
If dso__build_id_filename(..., NULL, ...) returns !NULL its because it allocated it, so, when reaching the 'if (dso__is_kcore()) test, we already checked that and were just "fallbacking" to using dso->long_name, but without freeing filename, thus leaking it. Fix it by adding the dso__is_kcore() test to the 'or' group just after it, the one containing the full fallback code, including freeing the filename. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: ee205503f233 ("perf tools: Fix annotation with kcore") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-01perf annotate: Introduce strerror for handling symbol__disassemble() errorsArnaldo Carvalho de Melo2-26/+62
We were just using pr_error() which makes it difficult for non stdio UIs to provide errors using its widgets, as they need to somehow catch what was passed to pr_error(). Fix it by introducing a __strerror() interface like the ones used elsewhere, for instance target__strerror(). This is just the initial step, more work will be done, but first some error handling bugs noticed while working on this need to be dealt with. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-08-01perf annotate: Rename symbol__annotate() to symbol__disassemble()Arnaldo Carvalho de Melo2-3/+3
This function will not annotate anything, it will just disassembly the given map->dso and symbol. It currently does this by parsing the output of 'objdump --disassemble', but this could conceivably be done using a library or an offshot of the kernel's instruction decoder (arch/x86/lib/inat.c), etc. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-07-30Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds4-6/+9
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "This update contains: - a fix for the bpf tools to use the new EM_BPF code - a fix for the module parser of perf to retrieve the proper text start address - add str_error_c to libapi to avoid linking against tools/lib/str_error_r.o" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools lib api: Add str_error_c to libapi perf s390: Fix 'start' address of module's map tools lib bpf: Use official ELF e_machine value
2016-07-29perf target: str_error_r() always returns the buffer it receivesArnaldo Carvalho de Melo1-5/+1
So no need for checking if it uses the strerror_r() GNU variant error reporting mechanism, i.e. if it returns a pointer to a immutable string internal to glibc. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: c8b5f2c96d1b ("tools: Introduce str_error_r()") Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-07-29perf annotate: Use pipe + fork instead of popenArnaldo Carvalho de Melo1-4/+35
We will need to redirect the stderr as well, so open code popen as a starting point. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-07-28Merge tag 'libnvdimm-for-4.8' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Dan Williams: - Replace pcommit with ADR / directed-flushing. The pcommit instruction, which has not shipped on any product, is deprecated. Instead, the requirement is that platforms implement either ADR, or provide one or more flush addresses per nvdimm. ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers to the memory controller on a power-fail event. Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware Interface Table (NFIT) sub-structure: "Flush Hint Address Structure". A flush hint is an mmio address that when written and fenced assures that all previous posted writes targeting a given dimm have been flushed to media. - On-demand ARS (address range scrub). Linux uses the results of the ACPI ARS commands to track bad blocks in pmem devices. When latent errors are detected we re-scrub the media to refresh the bad block list, userspace can also request a re-scrub at any time. - Support for the Microsoft DSM (device specific method) command format. - Support for EDK2/OVMF virtual disk device memory ranges. - Various fixes and cleanups across the subsystem. * tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits) libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register" nfit: do an ARS scrub on hitting a latent media error nfit: move to nfit/ sub-directory nfit, libnvdimm: allow an ARS scrub to be triggered on demand libnvdimm: register nvdimm_bus devices with an nd_bus driver pmem: clarify a debug print in pmem_clear_poison x86/insn: remove pcommit Revert "KVM: x86: add pcommit support" nfit, tools/testing/nvdimm/: unify shutdown paths libnvdimm: move ->module to struct nvdimm_bus_descriptor nfit: cleanup acpi_nfit_init calling convention nfit: fix _FIT evaluation memory leak + use after free tools/testing/nvdimm: add manufacturing_{date|location} dimm properties tools/testing/nvdimm: add virtual ramdisk range acpi, nfit: treat virtual ramdisk SPA as pmem region pmem: kill __pmem address space pmem: kill wmb_pmem() libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes fs/dax: remove wmb_pmem() libnvdimm, pmem: flush posted-write queues on shutdown ...