aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2016-02-05perf jit: add source line info supportStephane Eranian5-8/+634
This patch adds source line information support to perf for jitted code. The source line info must be emitted by the runtime, such as JVMTI. Perf injects extract the source line info from the jitdump file and adds the corresponding .debug_lines section in the ELF image generated for each jitted function. The source line enables matching any address in the profile with a source file and line number. The improvement is visible in perf annotate with the source code displayed alongside the assembly code. The dwarf code leverages the support from OProfile which is also released under GPLv2. Copyright 2007 OProfile authors. Signed-off-by: Stephane Eranian <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Carl Love <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John McCutchan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pawel Moll <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sonny Rao <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-05perf inject: Add jitdump mmap injection supportStephane Eranian6-0/+1316
This patch adds a --jit/-j option to perf inject. This options injects MMAP records into the perf.data file to cover the jitted code mmaps. It also emits ELF images for each function in the jidump file. Those images are created where the jitdump file is. The MMAP records point to that location as well. Typical flow: $ perf record -k mono -- java -agentpath:libpjvmti.so java_class $ perf inject --jit -i perf.data -o perf.data.jitted $ perf report -i perf.data.jitted Note that jitdump.h support is not limited to Java, it works with any jitted environment modified to emit the jitdump file format, include those where code can be jitted multiple times and moved around. The jitdump.h format is adapted from the Oprofile project. The genelf.c (ELF binary generation) depends on MD5 hash encoding for the buildid. To enable this, libssl-dev must be installed. If not, then genelf.c defaults to using urandom to generate the buildid, which is not ideal. The Makefile auto-detects the presence on libssl-dev. This version mmaps the jitdump file to create a marker MMAP record in the perf.data file. The marker is used to detect jitdump and cause perf inject to inject the jitted mmaps and generate ELF images for jitted functions. In V8, the following fixes and changes were made among other things: - the jidump header format include a new flags field to be used to carry information about the configuration of the runtime agent. Contributed by: Adrian Hunter <[email protected]> - Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file at which the code resides. Contributed by: Adrian Hunter <[email protected]> - Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic objects to match the file offset. Contributed by: Adrian Hunter <[email protected]> - JIT MMAP injection does not obey finished_round semantics. JIT MMAP injection injects all MMAP events in one go, so it does not obey finished_round semantics, so drop the finished_round events from the output perf.data file. Contributed by: Adrian Hunter <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Carl Love <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John McCutchan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pawel Moll <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sonny Rao <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-05perf symbols: add Java demangling supportStephane Eranian4-0/+213
Add Java function descriptor demangling support. Something bfd cannot do. Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of functions. Signed-off-by: Stephane Eranian <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Carl Love <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John McCutchan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pawel Moll <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sonny Rao <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-05perf tools: handle spaces in file names obtained from /proc/pid/mapsMarcin Ślusarz1-1/+1
Steam frequently puts game binaries in folders with spaces. Note: "(deleted)" markers are now treated as part of the file name. Signed-off-by: Marcin Ślusarz <[email protected]> Acked-by: Namhyung Kim <[email protected]> Fixes: 6064803313ba ("perf tools: Use sscanf for parsing /proc/pid/maps") Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720 Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-04Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar3-98/+219
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: New features: - Add 'L' hotkey to dynamicly set the percent threshold for histogram entries and callchains, i.e. dynamicly do what the --percent-limit command line option to 'top' and 'report' does. (Namhyung Kim) Infrastructure changes: - Per hists field and sort lists, that will be used, for instance, in the c2c tool (Jiri Olsa) Documentation changes: - Update documentation of --sort and --perf-limit options for 'perf report' (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-04Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar5-26/+63
Signed-off-by: Ingo Molnar <[email protected]>
2016-02-04Merge tag 'perf-urgent-for-mingo-2' of ↵Ingo Molnar1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fix from Arnaldo Carvalho de Melo: - Fix 'perf stat' interval output values (Jiri Olsa) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-04Merge tag 'perf-urgent-for-mingo' of ↵Ingo Molnar4-26/+53
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/urgent fixes from Arnaldo Carvalho de Melo: - tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed with a very specific combination: Machine with Intel PT (e.g. Broadwell), kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable tracefs (for instance !root users), making the fallback from perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail reading its info from tracefs, fix it. (Adrian Hunter) - Fix segfault in intel PT, by making it follow the 'struct thread' lifetime cycle checking expectations, noticed for instance, when processing perf.data files with Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter) - Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-03perf stat: Fix interval output valuesJiri Olsa1-0/+10
We broke interval data displays with commit: 3f416f22d1e2 ("perf stat: Do not clean event's private stats") This commit removed stats cleaning, which is important for '-r' option to carry counters data over the whole run. But it's necessary to clean it for interval mode, otherwise the displayed value is avg of all previous values. Before: $ perf stat -e cycles -a -I 1000 record # time counts unit events 1.000240796 75,216,287 cycles 2.000512791 107,823,524 cycles $ perf stat report # time counts unit events 1.000240796 75,216,287 cycles 2.000512791 91,519,906 cycles Now: $ perf stat report # time counts unit events 1.000240796 75,216,287 cycles 2.000512791 107,823,524 cycles Notice the second value being bigger (91,.. < 107,..). This could be easily verified by using perf script which displays raw stat data: $ perf script CPU THREAD VAL ENA RUN TIME EVENT 0 -1 23855779 1000209530 1000209530 1000240796 cycles 1 -1 33340397 1000224964 1000224964 1000240796 cycles 2 -1 15835415 1000226695 1000226695 1000240796 cycles 3 -1 2184696 1000228245 1000228245 1000240796 cycles 0 -1 97014312 2000514533 2000514533 2000512791 cycles 1 -1 46121497 2000543795 2000543795 2000512791 cycles 2 -1 32269530 2000543566 2000543566 2000512791 cycles 3 -1 7634472 2000544108 2000544108 2000512791 cycles The sum of the first 4 values is the first interval aggregated value: 23855779 + 33340397 + 15835415 + 2184696 = 75,216,287 The sum of the second 4 values minus first value is the second interval aggregated value: 97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524 Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce hists__for_each_sort_list macroJiri Olsa2-3/+9
With the hist object having the perf_hpp_list we can now iterate sort format entries based in the hists object. Adding hists__for_each_sort_list macro to do that. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce hists__for_each_format macroJiri Olsa1-0/+3
With the hist object having the perf_hpp_list we can now iterate output format entries based in the hists object. Adding hists__for_each_format macro to do that. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf tools: Add hpp_list into struct hists objectJiri Olsa2-3/+5
Adding hpp_list into struct hists object. Initializing struct hists_evsel hists object to carry global perf_hpp_list list. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Add struct perf_hpp_list argument to helper functionsJiri Olsa2-6/+7
Adding struct perf_hpp_list argument to following helper functions: void perf_hpp__setup_output_field(struct perf_hpp_list *list); void perf_hpp__reset_output_field(struct perf_hpp_list *list); void perf_hpp__append_sort_keys(struct perf_hpp_list *list); so they could be used on hists's hpp_list. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_hpp_list__for_each_sort_list_safe macroJiri Olsa1-2/+2
Introducing perf_hpp_list__for_each_sort_list_safe macro to iterate perf_hpp_list object's sort entries safely. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_hpp_list__for_each_sort_list macroJiri Olsa2-5/+5
Introducing perf_hpp_list__for_each_sort_list macro to iterate perf_hpp_list object's sort entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_hpp_list__for_each_format_safe macroJiri Olsa1-2/+2
Introducing perf_hpp_list__for_each_format_safe macro to iterate perf_hpp_list object's output entries safely. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_hpp_list__for_each_format macroJiri Olsa2-6/+6
Introducing perf_hpp_list__for_each_format macro to iterate perf_hpp_list object's output entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Pass perf_hpp_list all the way through setup_output_listJiri Olsa1-15/+18
Passing perf_hpp_list all the way through setup_output_list so the output entry could be added on the arbitrary list. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Add perf_hpp_list register helpersJiri Olsa1-3/+15
Adding 2 perf_hpp_list register helpers: perf_hpp_list__column_register() perf_hpp_list__register_sort_field() to be called within existing helpers: perf_hpp__column_register() perf_hpp__register_sort_field() to register format entries within global perf_hpp_list object. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_hpp_list__init functionJiri Olsa2-0/+8
Introducing perf_hpp_list__init function to have an easy way to initialize perf_hpp_list struct. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce struct perf_hpp_listJiri Olsa1-6/+10
Gather output and sort lists under struct perf_hpp_list, so we could have multiple instancies of sort/output format entries. Replacing current perf_hpp__list and perf_hpp__sort_list lists with single perf_hpp_list instance. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Renamed fields to .{fields,sorts} as suggested by Namhyung and acked by Jiri ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Separate output fields parsing into setup_output_list functionJiri Olsa1-12/+22
Separating output fields parsing into setup_output_list function, so it's separated from field_order string setup and could be reused later in following patches. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Separate sort fields parsing into setup_sort_list functionJiri Olsa1-12/+22
Separating sort fields parsing into setup_sort_list function, so it's separated from sort_order string setup and could be reused later in following patches. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Properly release format fieldsJiri Olsa2-0/+25
With multiple list holding format entries, we need the support properly releasing format output/sort fields. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Remove perf_hpp__column_(disable|enable)Jiri Olsa1-2/+0
Those functions are no longer needed. They operate over perf_hpp__format array which is now used only as template for dynamic entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Allocate output sort fieldJiri Olsa1-8/+33
Currently we use static output fields, because we have single global list of all sort/output fields. We will add hists specific sort and output lists in following patches, so we need all format entries to be dynamically allocated. Adding support to allocate output sort field. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Add 'equal' method to perf_hpp_fmt structJiri Olsa2-20/+21
To easily compare format entries and make it available for all kinds of format entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Add _idx fields into struct perf_hpp_fmtJiri Olsa1-0/+1
Currently there's no way of comparing hpp format entries, which is needed in following patches. Adding _idx fields into struct perf_hpp_fmt to recognize and be able to compare hpp format entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Introduce perf_evsel__output_resort functionJiri Olsa2-3/+8
Adding evsel specific function to sort hists_evsel based hists. The hists__output_resort can be now used to sort common hists object. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03perf hists: Factor output_resort from hists__output_resortJiri Olsa1-8/+15
Currently hists__output_resort() depends on hists based on hists_evsel struct, but we need to be able to sort common hists as well. Cutting out the sorting base sorting code into output_resort function, so it can be reused in following patch. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03Merge tag 'perf-core-for-mingo-3' of ↵Ingo Molnar1-4/+20
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core callchain fixes and improvements from Arnaldo Carvalho de Melo <[email protected]: User visible changes: - Make --percent-limit apply to callchains also and fix some bugs related to --percent-limit (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-03Merge tag 'perf-core-for-mingo-2' of ↵Ingo Molnar6-7/+38
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf tooling changes from Arnaldo Carvalho de Melo: New features: - Port 'perf kvm stat' to PowerPC (Hemant Kumar) Infrastructure changes: - Use the 'feature-dump' target to do the feature checks just once and then add code to reuse that in the tests/make makefile, speeding up the 'make -C tools/perf build-test' target (Wang Nan) - Reduce the number of tests the 'build-test' target do to those that don't pollute the source tree (Arnaldo Carvalho de Melo) - Improve the output of the build tests a bit by aligning the name of the tests, more can be done to filter out uninteresting info in the output (Arnaldo Carvalho de Melo) - Add perf_evlist pointer to *info_priv_size(), more prep work for supporting the coresight architecture (Mathieu Poirier) - Improve the 'perf test bp_signal' test (Wang Nan) - Check environment before starting the BPF 'perf test', so that we can just 'Skip' older kernels instead of 'FAIL'ing them (Wang Nan) - Fix cpumode of synthesized buildid event (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-03Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar14-117/+115
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows", as it controls just the that UI element in the annotate browser (Taeung Song) - Avoid trying to read ELF symtabs from device files, noticed while doing memory profiling work (Jiri Olsa) - Improve context detection when offering options in the hists browser, i.e. some options don't make sense when the browser is not working with a perf.data file ('perf top' mode), only in 'perf report' mode, like scripting (Namhyung Kim) Infrastructure changes: - Elliminate duplication in the hists browser filter functions, getting the common part into a function that receives callbacks for filtering by DSO, thread, etc. (Namhyung Kim) - Fix misleadingly indented assignment, found using gcc6 -Wmisleading-indentation (Markus Trippelsdorf) - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that detects such problems and then fixing the problem, so that the test now passes (Wang Nan) - More improvements to the build infrastructure to allow reusing the feature detection facilities (Wang Nan) - Auto initialize the globals needed by cpu__max_{cpu,node}() routines (Arnaldo Carvalho de Melo) Documentation changes: - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings) - Document a bunch more ~/.perfconfig knobs (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-02perf probe: Search both .eh_frame and .debug_frame sections for probe locationHemant Kumar2-26/+41
'perf probe' through debuginfo__find_probes() in util/probe-finder.c checks for the functions' frame descriptions in either .eh_frame section of an ELF or the .debug_frame. The check is based on whether either one of these sections is present. Depending on distro, toolchain defaults, architetcutre, build flags, etc., CFI might be found in either .eh_frame and/or .debug_frame. Sometimes, it may happen that, .eh_frame, even if present, may not be complete and may miss some descriptions. Therefore, to be sure, to find the CFI covering an address we will always have to investigate both if available. For e.g., in powerpc, this may happen: $ gcc -g bin.c -o bin $ objdump --dwarf ./bin <1><145>: Abbrev Number: 7 (DW_TAG_subprogram) <146> DW_AT_external : 1 <146> DW_AT_name : (indirect string, offset: 0x9e): main <14a> DW_AT_decl_file : 1 <14b> DW_AT_decl_line : 39 <14c> DW_AT_prototyped : 1 <14c> DW_AT_type : <0x57> <150> DW_AT_low_pc : 0x100007b8 If the .eh_frame and .debug_frame are checked for the same binary, we will find that, .eh_frame (although present) doesn't contain a description for "main" function. But, .debug_frame has a description: 000000d8 00000024 00000000 FDE cie=00000000 pc=100007b8..10000838 DW_CFA_advance_loc: 16 to 100007c8 DW_CFA_def_cfa_offset: 144 DW_CFA_offset_extended_sf: r65 at cfa+16 ... Due to this (since, perf checks whether .eh_frame is present and goes on searching for that address inside that frame), perf is unable to process the probes: # perf probe -x ./bin main Failed to get call frame on 0x100007b8 Error: Failed to add events. To avoid this issue, we need to check both the sections (.eh_frame and .debug_frame), which is done in this patch. Note that, we can always force everything into both .eh_frame and .debug_frame by: $ gcc bin.c -fasynchronous-unwind-tables -fno-dwarf2-cfi-asm -g -o bin Signed-off-by: Hemant Kumar <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: [email protected] Cc: Mark Wielaard <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Srikar Dronamraju <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-02perf tools: Fix thread lifetime related segfaut in intel_ptAdrian Hunter1-0/+9
intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread that eventually needs to be freed by the last thread__put() on it, when its refcount hits zero, which may happen in intel_pt_process_auxtrace_info() error handling path and triggers the following segfault, which would happen as well at intel_pt_free, when tools using this intel_pt codebase frees up resources: # perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls 0 a anaconda-ks.cfg bin perf.data perf.data.old perf-f23-bringup.todo [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.217 MB perf.data ] # # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field. intel_pt_synth_events: failed to synthesize 'instructions' event type Segmentation fault (core dumped) # The problem is: there's a union in 'struct thread' combines a list_head and a rb_node. The standard life cycle of a thread is: init rb_node in the constructor, insert it into machine->threads rbtree using rb_node, move it to machine->dead_threads using list_head, clean in the last thread__put: list_del_init(&thread->node). In the above command, it clean a thread before adding it into list, causes the above segfault. Since pt->unknown_thread will never live in an rbtree, initialize its list node so that when list_del_init() is done on it we don't segfault. After this patch: # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field. intel_pt_synth_events: failed to synthesize 'instructions' event type 0x248 [0x88]: failed to process type: 70 # Reported-by: Tong Zhang <[email protected]> Reported-by: Wang Nan <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Josh Poimboeuf <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01perf hists: Update hists' total period when adding entriesNamhyung Kim1-2/+9
Currently the hist entry addition path doesn't update total_period of hists and it's calculated during 'resort' path. But the resort path needs to know the total period before doing its job because it's used for calculating percent limit of callchains in hist entries. So this patch update the total period during the addition path. It makes the percent limit of callchains working (again). Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01perf hists: Fix min callchain hits calculationNamhyung Kim1-2/+11
The total period should be get using hists__total_period() since it takes filtered entries into account. In addition, if callchain mode is 'fractal', the total period should be the entry's period. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01perf tools: tracepoint_error() can receive e=NULL, robustify itAdrian Hunter1-0/+3
Fixes segmentation fault using, for instance: (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0 x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410 (gdb) bt #0 0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410 #1 0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:433 #2 0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:498 #3 0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:936 #4 0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391 #5 0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361 #6 0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401 #7 0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253 #8 0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364 #9 0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664 #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539 #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264 #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390 #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451 #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495 #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618 (gdb) Intel PT attempts to find the sched:sched_switch tracepoint but that seg faults if tracefs is not readable, because the error reporting structure is null, as errors are not reported when automatically adding tracepoints. Fix by checking before using. Committer note: This doesn't take place in a kernel that supports perf_event_attr.context_switch, that is the default way that will be used for tracking context switches, only in older kernels, like 4.2, in a machine with Intel PT (e.g. Broadwell) for non-priviledged users. Further info from a similar patch by Wang: The error is in tracepoint_error: it assumes the 'e' parameter is valid. However, there are many situation a parse_event() can be called without parse_events_error. See result of $ grep 'parse_events(.*NULL)' ./tools/perf/ -r' Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Tong Zhang <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] # v4.4+ Fixes: 196581717d85 ("perf tools: Enhance parsing events tracepoint error output") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-31Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds4-3/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "This is much bigger than typical fixes, but Peter found a category of races that spurred more fixes and more debugging enhancements. Work started before the merge window, but got finished only now. Aside of that this contains the usual small fixes to perf and tools. Nothing particular exciting" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) perf: Remove/simplify lockdep annotation perf: Synchronously clean up child events perf: Untangle 'owner' confusion perf: Add flags argument to perf_remove_from_context() perf: Clean up sync_child_event() perf: Robustify event->owner usage and SMP ordering perf: Fix STATE_EXIT usage perf: Update locking order perf: Remove __free_event() perf/bpf: Convert perf_event_array to use struct file perf: Fix NULL deref perf/x86: De-obfuscate code perf/x86: Fix uninitialized value usage perf: Fix race in perf_event_exit_task_context() perf: Fix orphan hole perf stat: Do not clean event's private stats perf hists: Fix HISTC_MEM_DCACHELINE width setting perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed perf tests: Remove wrong semicolon in while loop in CQM test perf: Synchronously free aux pages in case of allocation failure ...
2016-01-29perf kvm/powerpc: Port perf kvm stat to powerpcHemant Kumar1-0/+1
perf kvm can be used to analyze guest exit reasons. This support already exists in x86. Hence, porting it to powerpc. - To trace KVM events : perf kvm stat record If many guests are running, we can track for a specific guest by using --pid as in : perf kvm stat record --pid <pid> - To see the results : perf kvm stat report The result shows the number of exits (from the guest context to host/hypervisor context) grouped by their respective exit reasons with their frequency. Since, different powerpc machines have different KVM tracepoints, this patch discovers the available tracepoints dynamically and accordingly looks for them. If any single tracepoint is not present, this support won't be enabled for reporting. To record, this will fail if any of the events we are looking to record isn't available. Right now, its only supported on PowerPC Book3S_HV architectures. To analyze the different exits, group them and present them (in a slight descriptive way) to the user, we need a mapping between the "exit code" (dumped in the kvm_guest_exit tracepoint data) and to its related Interrupt vector description (exit reason). This patch adds this mapping in book3s_hv_exits.h. It records on two available KVM tracepoints for book3s_hv: "kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter". Here is a sample o/p: # pgrep qemu 19378 60515 2 Guests are running on the host. # perf kvm stat record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 4.153 MB perf.data.guest (39624 samples) ] # perf kvm stat report -p 60515 Analyze events for pid(s) 60515, all VCPUs: VM-EXIT Samples Samples% Time% MinTime MaxTime Avg time SYSCALL 9141 63.67% 7.49% 1.26us 5782.39us 9.87us (+- 6.46%) H_DATA_STORAGE 4114 28.66% 5.07% 1.72us 4597.68us 14.84us (+-20.06%) HV_DECREMENTER 418 2.91% 4.26% 0.70us 30002.22us 122.58us (+-70.29%) EXTERNAL 392 2.73% 0.06% 0.64us 104.10us 1.94us (+-18.83%) RETURN_TO_HOST 287 2.00% 83.11% 1.53us 124240.15us 3486.52us (+-16.81%) H_INST_STORAGE 5 0.03% 0.00% 1.88us 3.73us 2.39us (+-14.20%) Total Samples:14357, Total events handled time:1203918.42us. Signed-off-by: Hemant Kumar <[email protected]> Cc: Alexander Yarygin <[email protected]> Cc: David Ahern <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Srikar Dronamraju <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf kvm/{x86,s390}: Remove const from kvm_events_tpHemant Kumar1-1/+1
This patch removes the "const" qualifier from kvm_events_tp declaration to account for the fact that some architectures may need to update this variable dynamically. For instance, powerpc will need to update this variable dynamically depending on the machine type. Signed-off-by: Hemant Kumar <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Alexander Yarygin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf kvm/{x86,s390}: Remove dependency on uapi/kvm_perf.hHemant Kumar1-0/+5
Its better to remove the dependency on uapi/kvm_perf.h to allow dynamic discovery of kvm events (if its needed). To do this, some extern variables have been introduced with which we can keep the generic functions generic. Signed-off-by: Hemant Kumar <[email protected]> Acked-by: Alexander Yarygin <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf tools: Move timestamp creation to utilWang Nan2-0/+18
Timestamp generation becomes a public available helper. Which will be used by 'perf record', help it output to split output file based on time. For example: perf.data.2015122620363710 perf.data.2015122620364092 perf.data.2015122620365423 ... Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf buildid: Fix cpumode of buildid eventWang Nan1-1/+5
There is a nasty confusion that, for kernel module, dso->kernel is not necessary to be DSO_TYPE_KERNEL or DSO_TYPE_GUEST_KERNEL. These two enums are for vmlinux. See thread [1]. We tried to fix this part but it is costy. Code machine__write_buildid_table() is another unfortunate function fall into this trap that, when issuing buildid event for a kernel module, cpumode it gives to the event is PERF_RECORD_MISC_USER, not PERF_RECORD_MISC_KERNEL. However, even with this bug, most of the time it doesn't causes real problem. I find this issue when trying to use a perf before commit 3d39ac538629 ("perf machine: No need to have two DSOs lists") to parse a perf.data generated by newest perf. [1] https://lkml.org/lkml/2015/9/21/908 Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf auxtrace: Add perf_evlist pointer to *info_priv_size()Mathieu Poirier2-5/+8
On some architecture the size of the private header may be dependent on the number of tracers used in the session. As such adding a "struct perf_evlist *" parameter, which should contain all the required information. Also adjusting the existing client of the interface to take the new parameter into account. Signed-off-by: Mathieu Poirier <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Al Grant <[email protected]> Cc: Chunyan Zhang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Mike Leach <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rabin Vincent <[email protected]> Cc: Tor Jeremiassen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf cpumap: Auto initialize cpu__max_{node,cpu}Arnaldo Carvalho de Melo2-29/+33
Since it was always checking if the initialization was done, use that branch to do the initialization if not done already. With this we reduce the number of exported globals from these files. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf machine: Introduce machine__find_kernel_symbol_by_name()Arnaldo Carvalho de Melo1-0/+10
To be used in the 'vmlinux matches kallsyms' 'perf test' entry. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf hists browser: Offer 'Zoom into DSO'/'Map details' only when sort order ↵Namhyung Kim1-0/+1
has 'dso' We can't offer a zoom into DSO when a bucket (struct hist_entry) may have samples for more than one DSO, i.e. when 'dso' is not part of the sort order, ditto for 'Map details', fix it. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]>, Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Carved out from a larger patch, moved check to add_{dso,map}_opt() ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf sort: Provide a way to find out if per-thread bucketing is in placeNamhyung Kim2-0/+4
Now the UI browsers will be able to offer thread related operations only if the thread is part of the sort order in use, i.e. if hist_entry stats are all for a single thread. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]>, Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Carved out from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf tools: Document the perf sysctlsBen Hutchings1-6/+9
perf_event_paranoid was only documented in source code and a perf error message. Copy the documentation from the error message to Documentation/sysctl/kernel.txt. perf_cpu_time_max_percent was already documented but missing from the list at the top, so add it there. Signed-off-by: Ben Hutchings <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Remove reference to external Documentation file, provide info inline, as before ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>