path: root/tools/perf
Age | Commit message | Author | Files | Lines
2016-02-03 | perf hists: Remove perf_hpp__column_(disable|enable) | Jiri Olsa | 2 | -14/+0
Those functions are no longer needed. They operate over perf_hpp__format array which is now used only as template for dynamic entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Allocate output sort field | Jiri Olsa | 2 | -10/+47
Currently we use static output fields, because we have single global list of all sort/output fields. We will add hists specific sort and output lists in following patches, so we need all format entries to be dynamically allocated. Adding support to allocate output sort field. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf top: Move UI initialization ahead of sort setup | Arnaldo Carvalho de Melo | 1 | -7/+7
The ui initialization changes hpp format callbacks, based on the used browser. Thus we need this init being processed before setup_sorting. Replica of a patch by Jiri for 'perf report'. Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf report: Move UI initialization ahead of sort setup | Jiri Olsa | 1 | -9/+9
The ui initialization changes hpp format callbacks, based on the used browser. Thus we need this init being processed before setup_sorting. Signed-off-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Make hpp setup function generic | Jiri Olsa | 1 | -28/+8
Now that we have the 'equal' method implemented for hpp format entries we can ease up the logic in the following functions and make them generic wrt comparing format entries: perf_hpp__setup_output_field perf_hpp__append_sort_keys Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Add 'hpp__equal' callback function | Jiri Olsa | 1 | -0/+16
Adding 'hpp__equal' callback function to compare hpp output format entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Add 'equal' method to perf_hpp_fmt struct | Jiri Olsa | 3 | -22/+28
To easily compare format entries and make it available for all kinds of format entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
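The idea behind an 'equal' method can be sketched in plain C. This is a minimal illustration, not perf's code: `struct fmt`, `fmt_idx_equal`, `fmt_is_equal` and `fmt_list_contains` are hypothetical names, and the rule that two entries only match when they share the same callback is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for perf's struct perf_hpp_fmt. */
struct fmt {
	int idx;	/* identifies the kind of column */
	bool (*equal)(const struct fmt *a, const struct fmt *b);
};

/* Comparison callback for entries that are identified by their idx. */
static bool fmt_idx_equal(const struct fmt *a, const struct fmt *b)
{
	return a->idx == b->idx;
}

/*
 * Generic comparison: entries match only if both carry the same
 * callback and that callback reports them as equal (assumed rule).
 */
static bool fmt_is_equal(const struct fmt *a, const struct fmt *b)
{
	return a->equal && a->equal == b->equal && a->equal(a, b);
}

/* Return true if list[0..n) already holds an entry equal to needle. */
static bool fmt_list_contains(const struct fmt *list, size_t n,
			      const struct fmt *needle)
{
	for (size_t i = 0; i < n; i++)
		if (fmt_is_equal(&list[i], needle))
			return true;
	return false;
}
```

With a helper like `fmt_list_contains`, setup code that appends sort keys or output fields can skip duplicates without knowing which kind of entry it is looking at.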
2016-02-03 | perf hists: Use struct perf_hpp_fmt::idx in perf_hpp__reset_width | Jiri Olsa | 1 | -10/+2
We are going to add dynamic hpp format fields, so we need to make the 'len' change for the format itself, not in the perf_hpp__format template. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Add _idx fields into struct perf_hpp_fmt | Jiri Olsa | 2 | -11/+15
Currently there's no way of comparing hpp format entries, which is needed in following patches. Adding _idx fields into struct perf_hpp_fmt to recognize and be able to compare hpp format entries. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Introduce perf_evsel__output_resort function | Jiri Olsa | 8 | -16/+23
Adding evsel specific function to sort hists_evsel based hists. The hists__output_resort can be now used to sort common hists object. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | perf hists: Factor output_resort from hists__output_resort | Jiri Olsa | 1 | -8/+15
Currently hists__output_resort() depends on hists based on hists_evsel struct, but we need to be able to sort common hists as well. Cutting out the base sorting code into the output_resort function, so it can be reused in the following patch. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-03 | Merge tag 'perf-core-for-mingo-3' of ↵ | Ingo Molnar | 4 | -87/+136
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core callchain fixes and improvements from Arnaldo Carvalho de Melo <[email protected]: User visible changes: - Make --percent-limit apply to callchains also and fix some bugs related to --percent-limit (Namhyung Kim) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-03 | Merge tag 'perf-core-for-mingo-2' of ↵ | Ingo Molnar | 23 | -122/+678
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf tooling changes from Arnaldo Carvalho de Melo: New features: - Port 'perf kvm stat' to PowerPC (Hemant Kumar) Infrastructure changes: - Use the 'feature-dump' target to do the feature checks just once and then add code to reuse that in the tests/make makefile, speeding up the 'make -C tools/perf build-test' target (Wang Nan) - Reduce the number of tests the 'build-test' target do to those that don't pollute the source tree (Arnaldo Carvalho de Melo) - Improve the output of the build tests a bit by aligning the name of the tests, more can be done to filter out uninteresting info in the output (Arnaldo Carvalho de Melo) - Add perf_evlist pointer to *info_priv_size(), more prep work for supporting the coresight architecture (Mathieu Poirier) - Improve the 'perf test bp_signal' test (Wang Nan) - Check environment before starting the BPF 'perf test', so that we can just 'Skip' older kernels instead of 'FAIL'ing them (Wang Nan) - Fix cpumode of synthesized buildid event (Wang Nan) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-03 | Merge tag 'perf-core-for-mingo' of ↵ | Ingo Molnar | 28 | -171/+452
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Rename the "colors.code" ~/.perfconfig variable to "colors.jump_arrows", as it controls just that UI element in the annotate browser (Taeung Song) - Avoid trying to read ELF symtabs from device files, noticed while doing memory profiling work (Jiri Olsa) - Improve context detection when offering options in the hists browser, i.e. some options don't make sense when the browser is not working with a perf.data file ('perf top' mode), only in 'perf report' mode, like scripting (Namhyung Kim) Infrastructure changes: - Eliminate duplication in the hists browser filter functions, getting the common part into a function that receives callbacks for filtering by DSO, thread, etc. (Namhyung Kim) - Fix misleadingly indented assignment, found using gcc6 -Wmisleading-indentation (Markus Trippelsdorf) - Handle LLVM relocation oddities in libbpf, introducing a 'perf test' that detects such problems and then fixing the problem, so that the test now passes (Wang Nan) - More improvements to the build infrastructure to allow reusing the feature detection facilities (Wang Nan) - Auto initialize the globals needed by cpu__max_{cpu,node}() routines (Arnaldo Carvalho de Melo) Documentation changes: - Document the perf sysctls in Documentation/sysctl/kernel.txt (Ben Hutchings) - Document a bunch more ~/.perfconfig knobs (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-02-02 | perf probe: Search both .eh_frame and .debug_frame sections for probe location | Hemant Kumar | 2 | -26/+41
'perf probe' through debuginfo__find_probes() in util/probe-finder.c checks for the functions' frame descriptions in either the .eh_frame section of an ELF or the .debug_frame. The check is based on whether either one of these sections is present. Depending on distro, toolchain defaults, architecture, build flags, etc., CFI might be found in either .eh_frame and/or .debug_frame. Sometimes, it may happen that .eh_frame, even if present, may not be complete and may miss some descriptions. Therefore, to be sure to find the CFI covering an address, we will always have to investigate both if available. For example, in powerpc, this may happen: $ gcc -g bin.c -o bin $ objdump --dwarf ./bin <1><145>: Abbrev Number: 7 (DW_TAG_subprogram) <146> DW_AT_external : 1 <146> DW_AT_name : (indirect string, offset: 0x9e): main <14a> DW_AT_decl_file : 1 <14b> DW_AT_decl_line : 39 <14c> DW_AT_prototyped : 1 <14c> DW_AT_type : <0x57> <150> DW_AT_low_pc : 0x100007b8 If the .eh_frame and .debug_frame are checked for the same binary, we will find that .eh_frame (although present) doesn't contain a description for the "main" function. But .debug_frame has a description: 000000d8 00000024 00000000 FDE cie=00000000 pc=100007b8..10000838 DW_CFA_advance_loc: 16 to 100007c8 DW_CFA_def_cfa_offset: 144 DW_CFA_offset_extended_sf: r65 at cfa+16 ... Due to this (since perf checks whether .eh_frame is present and goes on searching for that address inside that frame), perf is unable to process the probes: # perf probe -x ./bin main Failed to get call frame on 0x100007b8 Error: Failed to add events. To avoid this issue, we need to check both the sections (.eh_frame and .debug_frame), which is done in this patch. 
Note that, we can always force everything into both .eh_frame and .debug_frame by: $ gcc bin.c -fasynchronous-unwind-tables -fno-dwarf2-cfi-asm -g -o bin Signed-off-by: Hemant Kumar <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: [email protected] Cc: Mark Wielaard <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Srikar Dronamraju <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-02 | perf tools: Fix thread lifetime related segfaut in intel_pt | Adrian Hunter | 1 | -0/+9
intel_pt_process_auxtrace_info() creates a pt->unknown_thread thread that eventually needs to be freed by the last thread__put() on it, when its refcount hits zero, which may happen in the intel_pt_process_auxtrace_info() error handling path and triggers the following segfault, which would happen as well at intel_pt_free, when tools using this intel_pt codebase free up resources: # perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls 0 a anaconda-ks.cfg bin perf.data perf.data.old perf-f23-bringup.todo [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.217 MB perf.data ] # # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field. intel_pt_synth_events: failed to synthesize 'instructions' event type Segmentation fault (core dumped) # The problem is: there's a union in 'struct thread' that combines a list_head and a rb_node. The standard life cycle of a thread is: init rb_node in the constructor, insert it into the machine->threads rbtree using rb_node, move it to machine->dead_threads using list_head, clean up in the last thread__put: list_del_init(&thread->node). In the above command, it cleans up a thread before adding it to a list, which causes the above segfault. Since pt->unknown_thread will never live in an rbtree, initialize its list node so that when list_del_init() is done on it we don't segfault. After this patch: # perf script -F event,comm,pid,tid,time,addr,ip,sym,dso,iregs Samples for 'instructions:u' event do not have IREGS attribute set. Cannot print 'iregs' field. 
intel_pt_synth_events: failed to synthesize 'instructions' event type 0x248 [0x88]: failed to process type: 70 # Reported-by: Tong Zhang <[email protected]> Reported-by: Wang Nan <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Josh Poimboeuf <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
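The failure mode described above can be reproduced in miniature. The sketch below is a user-space illustration under stated assumptions: `struct list_head`, `INIT_LIST_HEAD` and `list_del_init` are re-implemented here after the kernel helpers of the same name, while `struct thread_stub`, `struct rb_node_stub`, `thread_stub_init_unlinked` and `thread_stub_put` are hypothetical stand-ins for perf's thread handling.

```c
#include <stddef.h>

/* Minimal doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static void list_del_init(struct list_head *e)
{
	/* Dereferences e->next and e->prev: both must be valid pointers. */
	e->next->prev = e->prev;
	e->prev->next = e->next;
	INIT_LIST_HEAD(e);
}

/* Placeholder for an rbtree node; only its storage matters here. */
struct rb_node_stub {
	void *parent, *left, *right;
};

/* Hypothetical stand-in for 'struct thread': both nodes share storage. */
struct thread_stub {
	union {
		struct rb_node_stub rb_node;
		struct list_head node;
	};
	int refcnt;
};

/* For a thread that never enters the rbtree: make the list node valid. */
static void thread_stub_init_unlinked(struct thread_stub *t)
{
	INIT_LIST_HEAD(&t->node);
	t->refcnt = 1;
}

/* Drop a reference; on the last one, unlink. Returns 1 when unlinked. */
static int thread_stub_put(struct thread_stub *t)
{
	if (--t->refcnt > 0)
		return 0;
	list_del_init(&t->node);	/* safe only if node was initialised */
	return 1;
}
```

Without the `INIT_LIST_HEAD` call, `list_del_init` would follow whatever bytes the rbtree node left in the union, which is the kind of invalid dereference the commit message attributes the segfault to.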
2016-02-01 | perf report: Don't show blank lines if entry has no callchain | Namhyung Kim | 1 | -1/+4
When all callchains of a hist entry are percent-limited, do not add a blank line at the end. It makes the entry look like it doesn't have callchains. Reported-and-Tested-by: Jiri Olsa <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf hists browser: Fix percent display in callchains | Namhyung Kim | 1 | -5/+19
When there's only a single callchain, perf doesn't print its percentage in front of the symbols. This is because it assumes that the percentage is the same as the parent's. But if a percent limit is applied, it's possible that there are actually a couple of child nodes but only one of them is shown. In this case it should display the percent so that users do not assume it is the same as the parent's. For example, let's see the following callchain. $ perf report --no-children --percent-limit 0.01 --tui ... - 0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace kmem_cache_alloc_trace - perf_event_mmap - 0.04% mmap_region do_mmap_pgoff - vm_mmap_pgoff + 0.02% sys_mmap_pgoff + 0.02% vm_mmap + 0.02% mprotect_fixup Current code omits the percent if 'mmap_region' becomes the only node when the percent limit is set to 0.03%; its percent is not 0.06%, but users will incorrectly assume it is. Before: $ perf report --no-children --percent-limit 0.03 --tui ... 0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace kmem_cache_alloc_trace - perf_event_mmap - mmap_region do_mmap_pgoff vm_mmap_pgoff After: $ perf report --no-children --percent-limit 0.03 --tui ... 0.06% sleep [kernel.vmlinux] [k] kmem_cache_alloc_trace kmem_cache_alloc_trace - perf_event_mmap - 0.04% mmap_region do_mmap_pgoff vm_mmap_pgoff Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf hists browser: Pass parent_total to callchain print functions | Namhyung Kim | 1 | -20/+24
Pass parent node's total period to callchain print functions. This info is needed by later patch to determine whether it can omit percent or not correctly. No functional change intended. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf hists browser: Fix dump to show correct callchain style | Namhyung Kim | 1 | -32/+41
The commit 8c430a348699 ("perf hists browser: Support folded callchains") failed to update hist_browser__dump(), so it always shows graph-style callchains regardless of the current setting. To fix that, factor out the callchain printing code and rename the existing function which prints graph-style callchains. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 8c430a348699 ("perf hists browser: Support folded callchains") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf report: Fix percent display in callchains on --stdio | Namhyung Kim | 1 | -6/+20
When there's only a single callchain, perf doesn't print its percentage in front of the symbols. This is because it assumes that the percentage is the same as the parent's. But if a percent limit is applied, it's possible that there are actually a couple of child nodes but only one of them is shown. In this case it should display the percent so that users do not assume it is the same as the parent's. For example, let's see the following callchain. $ perf report -s comm --percent-limit 0.01 --stdio ... 9.95% swapper | |--7.57%--intel_idle | cpuidle_enter_state | cpuidle_enter | call_cpuidle | cpu_startup_entry | | | |--4.89%--start_secondary | | | --2.68%--rest_init | start_kernel | x86_64_start_reservations | x86_64_start_kernel | |--0.15%--__schedule | | | |--0.13%--schedule | | schedule_preempt_disable | | cpu_startup_entry | | | | | |--0.09%--start_secondary | | | | | --0.04%--rest_init | | start_kernel | | x86_64_start_reservations | | x86_64_start_kernel | | | --0.01%--schedule_preempt_disabled | cpu_startup_entry ... Current code omits the percent if 'intel_idle' becomes the only node when the percent limit is set to 0.5%; its percent is not 9.95%, but users will incorrectly assume it is. Before: $ perf report --percent-limit 0.5 --stdio ... 9.95% swapper | ---intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--4.89%--start_secondary | --2.68%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel After: $ perf report --percent-limit 0.5 --stdio ... 
9.95% swapper | --7.57%--intel_idle cpuidle_enter_state cpuidle_enter call_cpuidle cpu_startup_entry | |--4.89%--start_secondary | --2.68%--rest_init start_kernel x86_64_start_reservations x86_64_start_kernel Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
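The decision described in this commit message can be sketched as a small predicate. This is a simplified illustration, not the code from the patch: `show_percent`, `nr_visible`, `child_period` and `parent_period` are hypothetical names, and periods are plain integers here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a callchain node needs its percentage printed.
 *
 * nr_visible:    number of sibling nodes still shown after the percent limit
 * child_period:  accumulated period of the node being printed
 * parent_period: total period of its parent
 */
static bool show_percent(unsigned int nr_visible, uint64_t child_period,
			 uint64_t parent_period)
{
	/* Several visible siblings: each one needs its own percentage. */
	if (nr_visible > 1)
		return true;

	/*
	 * A single visible node may omit the percentage only when it really
	 * accounts for the whole parent. If siblings were filtered out by the
	 * limit, the periods differ and the percentage must be shown.
	 */
	return child_period != parent_period;
}
```

Using the numbers from the example, a lone 'intel_idle' node with 7.57% under a 9.95% parent would get its percentage printed, while a lone child that carries the parent's full period would not.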
2016-02-01 | perf callchain: Pass parent_samples to __callchain__fprintf_graph() | Namhyung Kim | 1 | -6/+13
Pass hist entry's period to graph callchain print function. This info is needed by later patch to determine whether it can omit percentage of top-level node or not. No functional change intended. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf report: Get rid of hist_entry__callchain_fprintf() | Namhyung Kim | 1 | -25/+2
It's just a wrapper function to align the start position of callchains to 'comm' of each thread if it's the first sort key. But it doesn't work with tracepoint events or with the upcoming hierarchy view. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf report: Apply --percent-limit to callchains also | Namhyung Kim | 1 | -2/+7
Currently the --percent-limit option only works for hist entries. However, it'd be better to have the same effect on callchains as well. Requested-by: Andi Kleen <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf hists: Update hists' total period when adding entries | Namhyung Kim | 1 | -2/+9
Currently the hist entry addition path doesn't update total_period of hists; it's calculated during the 'resort' path. But the resort path needs to know the total period before doing its job because it's used for calculating the percent limit of callchains in hist entries. So this patch updates the total period during the addition path. It makes the percent limit of callchains work (again). Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-02-01 | perf hists: Fix min callchain hits calculation | Namhyung Kim | 1 | -2/+11
The total period should be obtained using hists__total_period() since it takes filtered entries into account. In addition, if callchain mode is 'fractal', the total period should be the entry's period. Signed-off-by: Namhyung Kim <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
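The calculation this message refers to can be sketched as follows. It is a minimal illustration under stated assumptions: `enum chain_mode_stub` and `min_callchain_hits_sketch` are hypothetical names, and the threshold is taken to be the chosen total period multiplied by the percent limit divided by 100.

```c
#include <stdint.h>

/* Hypothetical subset of callchain modes, just enough for the sketch. */
enum chain_mode_stub {
	MODE_GRAPH,	/* percentages relative to the whole histogram */
	MODE_FRACTAL,	/* percentages relative to the owning hist entry */
};

/*
 * Minimum period a callchain node needs in order to be shown.
 *
 * hists_total:  total period of the histogram (filters already applied)
 * entry_period: period of the hist entry that owns the callchain
 * min_percent:  the percent limit, e.g. 0.5 for 0.5%
 */
static uint64_t min_callchain_hits_sketch(enum chain_mode_stub mode,
					   uint64_t hists_total,
					   uint64_t entry_period,
					   double min_percent)
{
	/* In fractal mode the base is the entry itself, not the histogram. */
	uint64_t total = (mode == MODE_FRACTAL) ? entry_period : hists_total;

	return (uint64_t)((double)total * (min_percent / 100.0));
}
```

The point of the sketch is the choice of base: picking the histogram total in fractal mode would make the threshold far too high for small entries, so their callchains would vanish entirely.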
2016-02-01 | perf tools: tracepoint_error() can receive e=NULL, robustify it | Adrian Hunter | 1 | -0/+3
Fixes segmentation fault using, for instance: (gdb) run record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls Starting program: /home/acme/bin/perf record -I -e intel_pt/tsc=1,noretcomp=1/u /bin/ls Missing separate debuginfos, use: dnf debuginfo-install glibc-2.22-7.fc23.x86_64 [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Program received signal SIGSEGV, Segmentation fault. 0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410 (gdb) bt #0 0x00000000004b9ea5 in tracepoint_error (e=0x0, err=13, sys=0x19b1370 "sched", name=0x19a5d00 "sched_switch") at util/parse-events.c:410 #1 0x00000000004b9fc5 in add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:433 #2 0x00000000004ba334 in add_tracepoint_event (list=0x19a5d20, idx=0x7fffffffb8c0, sys_name=0x19b1370 "sched", evt_name=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:498 #3 0x00000000004bb699 in parse_events_add_tracepoint (list=0x19a5d20, idx=0x7fffffffb8c0, sys=0x19b1370 "sched", event=0x19a5d00 "sched_switch", err=0x0, head_config=0x0) at util/parse-events.c:936 #4 0x00000000004f6eda in parse_events_parse (_data=0x7fffffffb8b0, scanner=0x19a49d0) at util/parse-events.y:391 #5 0x00000000004bc8e5 in parse_events__scanner (str=0x663ff2 "sched:sched_switch", data=0x7fffffffb8b0, start_token=258) at util/parse-events.c:1361 #6 0x00000000004bca57 in parse_events (evlist=0x19a5220, str=0x663ff2 "sched:sched_switch", err=0x0) at util/parse-events.c:1401 #7 0x0000000000518d5f in perf_evlist__can_select_event (evlist=0x19a3b90, str=0x663ff2 "sched:sched_switch") at util/record.c:253 #8 0x0000000000553c42 in intel_pt_track_switches (evlist=0x19a3b90) at arch/x86/util/intel-pt.c:364 #9 0x00000000005549d1 in intel_pt_recording_options (itr=0x19a2c40, 
evlist=0x19a3b90, opts=0x8edf68 <record+232>) at arch/x86/util/intel-pt.c:664 #10 0x000000000051e076 in auxtrace_record__options (itr=0x19a2c40, evlist=0x19a3b90, opts=0x8edf68 <record+232>) at util/auxtrace.c:539 #11 0x0000000000433368 in cmd_record (argc=1, argv=0x7fffffffde60, prefix=0x0) at builtin-record.c:1264 #12 0x000000000049bec2 in run_builtin (p=0x8fa2a8 <commands+168>, argc=5, argv=0x7fffffffde60) at perf.c:390 #13 0x000000000049c12a in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:451 #14 0x000000000049c278 in run_argv (argcp=0x7fffffffdcbc, argv=0x7fffffffdcb0) at perf.c:495 #15 0x000000000049c60a in main (argc=5, argv=0x7fffffffde60) at perf.c:618 (gdb) Intel PT attempts to find the sched:sched_switch tracepoint but that seg faults if tracefs is not readable, because the error reporting structure is null, as errors are not reported when automatically adding tracepoints. Fix by checking before using. Committer note: This doesn't take place in a kernel that supports perf_event_attr.context_switch, which is the default way that will be used for tracking context switches, only in older kernels, like 4.2, in a machine with Intel PT (e.g. Broadwell) for non-privileged users. Further info from a similar patch by Wang: The error is in tracepoint_error: it assumes the 'e' parameter is valid. However, there are many situations where parse_event() can be called without parse_events_error. See the result of $ grep 'parse_events(.*NULL)' ./tools/perf/ -r Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Tong Zhang <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] # v4.4+ Fixes: 196581717d85 ("perf tools: Enhance parsing events tracepoint error output") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
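The "check before using" fix follows a common pattern for optional error-reporting arguments. The sketch below is illustrative only: `struct parse_error_stub` and `report_tracepoint_error` are hypothetical names, and the message text is made up for the example rather than taken from perf.

```c
#include <stdio.h>

/* Hypothetical stand-in for an optional error-reporting structure. */
struct parse_error_stub {
	char msg[128];
};

/*
 * Record a tracepoint error if the caller asked for error details.
 * Returns 1 when a message was stored, 0 when reporting was skipped.
 */
static int report_tracepoint_error(struct parse_error_stub *e, int err,
				   const char *sys, const char *name)
{
	/* Callers that do not want details pass NULL: nothing to fill in. */
	if (!e)
		return 0;

	snprintf(e->msg, sizeof(e->msg),
		 "can't access trace event %s:%s (err %d)", sys, name, err);
	return 1;
}
```

Guarding at the top of the reporting function keeps every caller simple: code that probes for a tracepoint automatically can pass NULL without each call site having to special-case it.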
2016-01-31 | Merge branch 'perf-urgent-for-linus' of ↵ | Linus Torvalds | 9 | -22/+75
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "This is much bigger than typical fixes, but Peter found a category of races that spurred more fixes and more debugging enhancements. Work started before the merge window, but got finished only now. Aside of that this contains the usual small fixes to perf and tools. Nothing particular exciting" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (43 commits) perf: Remove/simplify lockdep annotation perf: Synchronously clean up child events perf: Untangle 'owner' confusion perf: Add flags argument to perf_remove_from_context() perf: Clean up sync_child_event() perf: Robustify event->owner usage and SMP ordering perf: Fix STATE_EXIT usage perf: Update locking order perf: Remove __free_event() perf/bpf: Convert perf_event_array to use struct file perf: Fix NULL deref perf/x86: De-obfuscate code perf/x86: Fix uninitialized value usage perf: Fix race in perf_event_exit_task_context() perf: Fix orphan hole perf stat: Do not clean event's private stats perf hists: Fix HISTC_MEM_DCACHELINE width setting perf annotate browser: Fix behaviour of Shift-Tab with nothing focussed perf tests: Remove wrong semicolon in while loop in CQM test perf: Synchronously free aux pages in case of allocation failure ...
2016-01-29 | perf build: Align the names of the build tests: | Arnaldo Carvalho de Melo | 1 | -2/+4
$ make -C tools/perf build-test make[1]: Entering directory `/home/acme/git/linux/tools/perf' make_pure_O: cd . && make -f Makefile O=/tmp/tmp.mPx0Cmik3f DESTDIR=/tmp/tmp.U0SUmVbtJm make_clean_all_O: cd . && make -f Makefile O=/tmp/tmp.Yl5UzhTU7T DESTDIR=/tmp/tmp.fop1E4jdER clean all make_debug_O: cd . && make -f Makefile O=/tmp/tmp.pMn2ozBoXC DESTDIR=/tmp/tmp.azxhDp5sEp DEBUG=1 make_no_libperl_O: cd . && make -f Makefile O=/tmp/tmp.qJPiINMtA7 DESTDIR=/tmp/tmp.KNMrLeGDxZ NO_LIBPERL=1 <SNIP> More needs to be done to make it more compact, i.e. elide the '-f Makefile', remove that 'cd . &&', move the DESTDIR= and O= to the end, as they don't convey that much information besides the fact that they are being set to some random directory just for this build, move the meat, i.e. the meaningful feature disabling bits to the start, etc. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29 | perf kvm/powerpc: Add support for HCALL reasons | Hemant Kumar | 2 | -1/+187
Powerpc provides hcall events that also provide insights into guest behaviour. Enhance perf kvm stat to record and analyze hcall events. - To trace hcall events : perf kvm stat record - To show the results : perf kvm stat report --event=hcall The result shows the number of hypervisor calls from the guest grouped by their respective reasons displayed with the frequency. This patch makes use of two additional tracepoints "kvm_hv:kvm_hcall_enter" and "kvm_hv:kvm_hcall_exit". To map the hcall codes to their respective names, it needs a mapping. Such mapping is added in this patch in book3s_hcalls.h. A sample output : # pgrep qemu 19378 60515 2 VMs running. # perf kvm stat record -a ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 4.153 MB perf.data.guest (39624 samples) ] # perf kvm stat report -p 60515 --event=hcall Analyze events for all VMs, all VCPUs: HCALL-EVENT Samples Samples% Time% MinTime MaxTime AvgTime H_IPI 822 66.08% 88.10% 0.63us 11.38us 2.05us (+- 1.42%) H_SEND_CRQ 144 11.58% 3.77% 0.41us 0.88us 0.50us (+- 1.47%) H_VIO_SIGNAL 118 9.49% 2.86% 0.37us 0.83us 0.47us (+- 1.43%) H_PUT_TERM_CHAR 76 6.11% 2.07% 0.37us 0.90us 0.52us (+- 2.43%) H_GET_TERM_CHAR 74 5.95% 2.23% 0.37us 1.70us 0.58us (+- 4.77%) H_RTAS 6 0.48% 0.85% 1.10us 9.25us 2.70us (+-48.57%) H_PERFMON 4 0.32% 0.12% 0.41us 0.96us 0.59us (+-20.92%) Total Samples:1244, Total events handled time:1916.69us. Signed-off-by: Hemant Kumar <[email protected]> Cc: Alexander Yarygin <[email protected]> Cc: David Ahern <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29 | perf kvm/powerpc: Port perf kvm stat to powerpc | Hemant Kumar | 6 | -0/+162
perf kvm can be used to analyze guest exit reasons. This support already exists on x86, so port it to powerpc.

 - To trace KVM events: perf kvm stat record
   If many guests are running, we can track a specific guest by using --pid, as in: perf kvm stat record --pid <pid>
 - To see the results: perf kvm stat report

The result shows the number of exits (from the guest context to the host/hypervisor context), grouped by their respective exit reasons, with their frequency.

Since different powerpc machines have different KVM tracepoints, this patch discovers the available tracepoints dynamically and looks for them accordingly. If any single tracepoint is not present, this support won't be enabled for reporting. Recording will fail if any of the events we want to record isn't available. Right now, it is only supported on PowerPC Book3S_HV architectures.

To analyze the different exits, group them and present them (in a slightly descriptive way) to the user, we need a mapping between the "exit code" (dumped in the kvm_guest_exit tracepoint data) and its related interrupt vector description (exit reason). This patch adds this mapping in book3s_hv_exits.h.

It records on the two available KVM tracepoints for book3s_hv: "kvm_hv:kvm_guest_exit" and "kvm_hv:kvm_guest_enter".

Here is a sample output:

  # pgrep qemu
  19378
  60515

2 guests are running on the host.

  # perf kvm stat record -a
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 4.153 MB perf.data.guest (39624 samples) ]

  # perf kvm stat report -p 60515

  Analyze events for pid(s) 60515, all VCPUs:

  VM-EXIT          Samples  Samples%   Time%  MinTime      MaxTime   Avg time
  SYSCALL             9141    63.67%   7.49%   1.26us    5782.39us     9.87us (+- 6.46%)
  H_DATA_STORAGE      4114    28.66%   5.07%   1.72us    4597.68us    14.84us (+-20.06%)
  HV_DECREMENTER       418     2.91%   4.26%   0.70us   30002.22us   122.58us (+-70.29%)
  EXTERNAL             392     2.73%   0.06%   0.64us     104.10us     1.94us (+-18.83%)
  RETURN_TO_HOST       287     2.00%  83.11%   1.53us  124240.15us  3486.52us (+-16.81%)
  H_INST_STORAGE         5     0.03%   0.00%   1.88us       3.73us     2.39us (+-14.20%)

  Total Samples:14357, Total events handled time:1203918.42us.

Signed-off-by: Hemant Kumar <[email protected]> Cc: Alexander Yarygin <[email protected]> Cc: David Ahern <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Srikar Dronamraju <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
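The "all tracepoints must be present" rule from this commit can be sketched as follows. This is a minimal illustration, not the patch's code: kvm_stat_supported(), tp_exists_fn and fake_tp_exists() are hypothetical names, and the predicate is a stand-in for whatever mechanism perf actually uses to probe for a tracepoint.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Caller-supplied predicate: does this tracepoint exist on this machine? */
typedef bool (*tp_exists_fn)(const char *name);

/*
 * Reporting is enabled only when every tracepoint in the
 * NULL-terminated list is available; one missing entry disables it.
 */
bool kvm_stat_supported(const char *const *tracepoints, tp_exists_fn exists)
{
	size_t i;

	for (i = 0; tracepoints[i] != NULL; i++)
		if (!exists(tracepoints[i]))
			return false;
	return true;
}

/* Stand-in predicate for the example: pretend only "kvm_hv:" events exist. */
bool fake_tp_exists(const char *name)
{
	return strncmp(name, "kvm_hv:", 7) == 0;
}
```

With the stand-in predicate, a list holding only "kvm_hv:kvm_guest_enter" and "kvm_hv:kvm_guest_exit" is accepted, while a list that also names a tracepoint from another subsystem is rejected.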
2016-01-29perf kvm/{x86,s390}: Remove const from kvm_events_tpHemant Kumar3-3/+3
This patch removes the "const" qualifier from kvm_events_tp declaration to account for the fact that some architectures may need to update this variable dynamically. For instance, powerpc will need to update this variable dynamically depending on the machine type. Signed-off-by: Hemant Kumar <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Alexander Yarygin <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
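Why the qualifier matters can be shown with a short sketch. It assumes the array was previously declared with const slots (shown in the comment); the initial entries and the helper name use_book3s_hv_tracepoints() are placeholders for illustration, not the real perf contents.

```c
#include <stddef.h>

/*
 * Assumed "before" form: both the strings and the array slots are
 * read-only, so no slot can be reassigned at run time:
 *
 *     const char * const kvm_events_tp[] = { ... };
 *
 * "After" form: the strings stay read-only, but each slot can be
 * pointed at a different name once the machine type is known.
 */
const char *kvm_events_tp[] = {
	"placeholder:enter",	/* hypothetical default entries */
	"placeholder:exit",
	NULL,
};

/* Hypothetical helper: select the Book3S HV tracepoints at run time. */
void use_book3s_hv_tracepoints(void)
{
	kvm_events_tp[0] = "kvm_hv:kvm_guest_enter";
	kvm_events_tp[1] = "kvm_hv:kvm_guest_exit";
}
```

With the slots declared const, the two assignments in the helper would be rejected by the compiler.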
2016-01-29perf kvm/{x86,s390}: Remove dependency on uapi/kvm_perf.hHemant Kumar4-14/+33
It's better to remove the dependency on uapi/kvm_perf.h to allow dynamic discovery of kvm events (if it's needed). To do this, some extern variables have been introduced so that the generic functions can stay generic. Signed-off-by: Hemant Kumar <[email protected]> Acked-by: Alexander Yarygin <[email protected]> Acked-by: David Ahern <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Scott Wood <[email protected]> Cc: Srikar Dronamraju <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf record: Use OPT_BOOLEAN_SET for buildid cache related optionsWang Nan1-4/+8
With this, 'perf record' knows whether the buildid cache was enabled deliberately (via --no-no-buildid-cache). The buildid cache can be turned off in some situations. Output switching support needs this feature to turn off the buildid cache by default. Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
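The idea behind tracking whether a boolean option was given explicitly can be sketched as a value paired with a "set" flag. The struct and helpers below are hypothetical and only illustrate the concept; they are not perf's option-parsing API.

```c
#include <stdbool.h>

/* Hypothetical pairing of a boolean option with a "was it given?" flag. */
struct bool_opt {
	bool value;	/* what the user asked for, valid only if set */
	bool set;	/* true once the option appeared on the command line */
};

/* Record an explicit choice made on the command line. */
void bool_opt_parse(struct bool_opt *opt, bool value)
{
	opt->value = value;
	opt->set = true;
}

/* An explicit choice wins; otherwise fall back to the caller's default. */
bool bool_opt_resolve(const struct bool_opt *opt, bool dflt)
{
	return opt->set ? opt->value : dflt;
}
```

A feature such as output switching can then pass its own default to bool_opt_resolve() without overriding a user who asked for the opposite behaviour.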
2016-01-29perf tools: Move timestamp creation to utilWang Nan3-13/+19
Timestamp generation becomes a publicly available helper, which will be used by 'perf record' to split its output into files named by time. For example:

  perf.data.2015122620363710
  perf.data.2015122620364092
  perf.data.2015122620365423
  ...

Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
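The file names in the example end in a 16-digit suffix, which looks like a date, a time and two sub-second digits. A minimal sketch of producing such a suffix is shown below. It assumes that layout, uses the C11 timespec_get() call for portability, and the function name timestamp_suffix() is hypothetical; it is not necessarily how the perf helper is written.

```c
#include <stdio.h>
#include <time.h>

/*
 * Write a 16-digit suffix "YYYYMMDDhhmmss" followed by hundredths of a
 * second into buf. Returns 0 on success, -1 on failure.
 * Note: localtime() is not thread-safe; fine for a single-threaded sketch.
 */
int timestamp_suffix(char *buf, size_t sz)
{
	struct timespec ts;
	struct tm *tm;
	char dt[32];

	if (sz < 17)	/* 16 digits plus the terminating NUL */
		return -1;
	if (timespec_get(&ts, TIME_UTC) != TIME_UTC)
		return -1;

	tm = localtime(&ts.tv_sec);
	if (tm == NULL || strftime(dt, sizeof(dt), "%Y%m%d%H%M%S", tm) == 0)
		return -1;

	/* Append hundredths of a second as two zero-padded digits. */
	snprintf(buf, sz, "%s%02u", dt, (unsigned)(ts.tv_nsec / 10000000L));
	return 0;
}
```

A caller would append the result to a base name such as "perf.data." to get one output file per switch.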
2016-01-29perf test: Improve bp_signalWang Nan1-22/+118
Will Deacon [1] had some questions about patch [2]. This patch improves test__bp_signal so that we can test:

 1. A watchpoint and a breakpoint that fire on the same instruction
 2. Nested signals

Test result on x86_64 and ARM64 (results are similar with patch [2] on ARM64):

  # ./perf test -v signal
  17: Test breakpoint overflow signal handler :
  --- start ---
  test child forked, pid 10213
  count1 1, count2 3, count3 2, overflow 3, overflows_2 3
  test child finished with 0
  ---- end ----
  Test breakpoint overflow signal handler: Ok

So at least the 2 cases Will doubted are handled correctly.

[1] http://lkml.kernel.org/g/[email protected]
[2] http://lkml.kernel.org/g/[email protected]

Signed-off-by: Wang Nan <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf test: Check environment before start real BPF testWang Nan1-0/+37
Copying perf to a system with an old kernel results in:

  # perf test bpf
  37: Test BPF filter :
  37.1: Test basic BPF filtering : FAILED!
  37.2: Test BPF prologue generation : Skip

However, when the kernel doesn't support a test case, the test should return 'Skip'; 'FAILED!' should be reserved for cases where the kernel supports a feature that then fails to work as advertised. This patch checks the environment before running the real test case.

Signed-off-by: Wang Nan <[email protected]> Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf buildid: Fix cpumode of buildid eventWang Nan1-1/+5
There is a nasty confusion: for a kernel module, dso->kernel is not necessarily DSO_TYPE_KERNEL or DSO_TYPE_GUEST_KERNEL. These two enums are for vmlinux. See thread [1]. We tried to fix this part but it is costly. machine__write_buildid_table() is another unfortunate function that falls into this trap: when issuing a buildid event for a kernel module, the cpumode it gives to the event is PERF_RECORD_MISC_USER, not PERF_RECORD_MISC_KERNEL. However, even with this bug, most of the time it doesn't cause a real problem. I found this issue when trying to use a perf from before commit 3d39ac538629 ("perf machine: No need to have two DSOs lists") to parse a perf.data generated by the newest perf. [1] https://lkml.org/lkml/2015/9/21/908 Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf auxtrace: Add perf_evlist pointer to *info_priv_size()Mathieu Poirier4-7/+14
On some architectures the size of the private header may depend on the number of tracers used in the session. Add a "struct perf_evlist *" parameter, which should contain all the required information, and adjust the existing client of the interface to take the new parameter into account. Signed-off-by: Mathieu Poirier <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Al Grant <[email protected]> Cc: Chunyan Zhang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Mike Leach <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rabin Vincent <[email protected]> Cc: Tor Jeremiassen <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf tools: Speed up build-tests by reducing the number of builds testedArnaldo Carvalho de Melo1-2/+9
The 'tools/perf/tests/make' makefile has, in its default 'all' target, builds that will pollute the source code directory, i.e. that will not use the O= variable. The 'build-test' target should be run as often as possible, preferably after each commit that is not strictly non-code, so speed it up by selecting just the O= targets. Furthermore, it tests both the Makefile.perf file, which is normally driven by the main Makefile, and the Makefile; cut the time in half by having 'build-test' use just MK=Makefile, the most usual case. Please run: make -C tools/perf -f tests/make from time to time to also run the in-place build tests. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf build: Use feature dump file for build-testWang Nan2-1/+34
To prevent the feature check tests from running repeatedly, once per 'tests/make' target/test, this patch uses the previously introduced 'feature-dump' make target and FEATURES_DUMP variable, making sure that the feature checkers run only once when doing build-test for normal test cases. However, since standard users don't reuse the feature dump result, we'd better provide an option to check their behavior. The above feature should be used only to make build-test faster. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-29perf build: Remove all condition feature check {C,LD}FLAGSWang Nan1-54/+47
'make feature-dump' should give a stable result, so even if 'NO_SOMETHING=1' is given (or, for babeltrace, if LIBBABELTRACE=1 is not given), we should still try to detect those features and their {C,LD}FLAGS. Whether to build a feature should be controlled independently. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf cpumap: Auto initialize cpu__max_{node,cpu}Arnaldo Carvalho de Melo2-29/+33
Since it was always checking if the initialization was done, use that branch to do the initialization if not done already. With this we reduce the number of exported globals from these files. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
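The pattern described here, reusing the existing "is it initialized?" check to perform the initialization, can be sketched as below. The names and the probed value are placeholders chosen for illustration; the real code determines the maximum CPU number from the system.

```c
/* 0 means "not initialized yet"; kept private to this file. */
static int max_cpu_num;

/* Counts how often the probe ran, only to make the laziness observable. */
static int init_calls;

/* Placeholder probe: a real implementation would query the system. */
static void set_max_cpu_num(void)
{
	max_cpu_num = 8;	/* hypothetical value for the sketch */
	init_calls++;
}

/* Accessor that initializes on first use instead of requiring setup. */
int cpu_max_cpu(void)
{
	if (max_cpu_num == 0)
		set_max_cpu_num();
	return max_cpu_num;
}
```

Because callers only go through the accessor, neither the variable nor a separate init function needs to be exported.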
2016-01-26perf hists browser: Skip scripting when perf.data file not availableNamhyung Kim1-0/+4
The script and data-switch context menu entries are only meaningful when dealing with a data file, so add a check so that they are not shown when perf-top is run. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Use goto skip_scripting instead of two is_report_browser() tests ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf build: Select all feature checkers for feature-dumpWang Nan1-0/+9
Set FEATURE_TESTS to 'all' so that all possible feature checkers are executed. Without this setting, the output feature dump file misses some features, for example liberty. Select all checkers so that we don't get an incomplete feature dump file. Signed-off-by: Wang Nan <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf test: Add libbpf relocation checkerWang Nan7-12/+98
There's a bug in LLVM that it can generate unneeded relocation information. See [1] and [2]. Libbpf should check the target section of a relocation symbol.

This patch adds a testcase which references a global variable (BPF doesn't support global variables). Before fixing libbpf, the new test case can be loaded into kernel, the global variable acts like the first map. It is incorrect.

Result:

  # ~/perf test BPF
  37: Test BPF filter :
  37.1: Test basic BPF filtering : Ok
  37.2: Test BPF prologue generation : Ok
  37.3: Test BPF relocation checker : FAILED!

  # ~/perf test -v BPF
  ...
  libbpf: loading object '[bpf_relocation_test]' from buffer
  libbpf: section .strtab, size 126, link 0, flags 0, type=3
  libbpf: section .text, size 0, link 0, flags 6, type=1
  libbpf: section .data, size 0, link 0, flags 3, type=1
  libbpf: section .bss, size 0, link 0, flags 3, type=8
  libbpf: section func=sys_write, size 104, link 0, flags 6, type=1
  libbpf: found program func=sys_write
  libbpf: section .relfunc=sys_write, size 16, link 10, flags 0, type=9
  libbpf: section maps, size 16, link 0, flags 3, type=1
  libbpf: maps in [bpf_relocation_test]: 16 bytes
  libbpf: section license, size 4, link 0, flags 3, type=1
  libbpf: license of [bpf_relocation_test] is GPL
  libbpf: section version, size 4, link 0, flags 3, type=1
  libbpf: kernel version of [bpf_relocation_test] is 40400
  libbpf: section .symtab, size 144, link 1, flags 0, type=2
  libbpf: map 0 is "my_table"
  libbpf: collecting relocating info for: 'func=sys_write'
  libbpf: relocation: insn_idx=7
  Success unexpectedly: libbpf error when dealing with relocation
  test child finished with -1
  ---- end ----
  Test BPF filter subtest 2: FAILED!

[1] https://llvm.org/bugs/show_bug.cgi?id=26243
[2] https://patchwork.ozlabs.org/patch/571385/

Signed-off-by: Wang Nan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Li Zefan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf test: Fixup aliases checking in the 'vmlinux matches kallsyms' testArnaldo Carvalho de Melo1-19/+5
There are cases where looking at just the next and prev entries is not enough, like with:

  $ readelf -sW /usr/lib/debug/lib/modules/4.3.3-301.fc23.x86_64/vmlinux | grep ffffffff81065ec0
  4979: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pud_page
  4980: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pte_page
  4981: ffffffff81065ec0 53 FUNC LOCAL DEFAULT 1 try_to_free_pmd_page

So just search by name to see if the symbol is in kallsyms.

Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
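When several symbols share one address, as in the readelf output above, checking only the neighbouring entries can miss an alias, whereas a lookup by name does not depend on ordering. The sketch below illustrates that idea with a plain linear search; struct sym and find_symbol_by_name() are hypothetical and do not reflect perf's symbol data structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flat symbol record for the sketch. */
struct sym {
	unsigned long long addr;
	const char *name;
};

/*
 * Return the first entry whose name matches, or NULL if there is none.
 * The result does not depend on how aliases at the same address are
 * ordered in the table.
 */
const struct sym *find_symbol_by_name(const struct sym *tab, size_t n,
				      const char *name)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(tab[i].name, name) == 0)
			return &tab[i];
	return NULL;
}
```

A test comparing vmlinux with kallsyms could take each vmlinux symbol name, look it up in the kallsyms table this way, and then compare the addresses of the two entries.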
2016-01-26perf machine: Introduce machine__find_kernel_symbol_by_name()Arnaldo Carvalho de Melo1-0/+10
To be used in the 'vmlinux matches kallsyms' 'perf test' entry. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf hists browser: Offer non-symbol specific menu options for --sort without 'sym'Namhyung Kim1-5/+1
Now that we check more strictly what each of the menu entries need, we can stop bailing out when 'sym' is not in the --sort order, instead we let each option be added if what it needs is present. This way, for instance, we can run scripts on all samples, see DSO map details when 'dso' is in the --sort provided, etc. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Carved out from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-01-26perf hists browser: Be a bit more strict about presenting CPU socket zoomNamhyung Kim1-1/+1
For consistency with the other sort order checks. Signed-off-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Carved out from a larger patch, moved check to add_socket_opt() ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>