aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util/machine.c
AgeCommit message (Collapse)AuthorFilesLines
2016-11-14perf report: Add branch flag to callchain cursor nodeJin Yao1-15/+67
Since the branch ip has been added to call stack for easier browsing, this patch adds more branch information. For example, add a flag to indicate if this ip is a branch, and also add with the branch flag. Then we can know if the cursor node represents a branch and know what the branch flag it has. The branch history code has a loop detection pass that removes loops. It would be nice for knowing how many loops were removed then in next steps, we can compute out the average number of iterations. For example: Before remove_loops(), entry0: from = 0x100, to = 0x200 entry1: from = 0x300, to = 0x250 entry2: from = 0x300, to = 0x250 entry3: from = 0x300, to = 0x250 entry4: from = 0x700, to = 0x800 After remove_loops() entry0: from = 0x100, to = 0x200 entry1: from = 0x300, to = 0x250 entry2: from = 0x700, to = 0x800 The original entry2 and entry3 are removed. So the number of iterations (from = 0x300, to = 0x250) is equal to removed number + 1 (2 + 1). iterations = removed number + 1; average iteractions = Sum(iteractions) / number of samples This formula ignores other cases, for example, iterations cross multiple buffers and one buffer contains 2+ loops. Because in practice, it's good enough. Signed-off-by: Yao Jin <[email protected]> Acked-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Cc: Yao Jin <[email protected]> Link: http://lkml.kernel.org/n/[email protected] [ Renamed 'iter' to 'nr_loop_iter' for clarity ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-10-03perf tools: Experiment with cppcheckArnaldo Carvalho de Melo1-3/+3
Experimenting a bit using cppcheck[1], a static checker brought to my attention by Colin, reducing the scope of some variables, reducing the line of source code lines in the process: $ cppcheck --enable=style tools/perf/util/thread.c Checking tools/perf/util/thread.c... [tools/perf/util/thread.c:17]: (style) The scope of the variable 'leader' can be reduced. [tools/perf/util/thread.c:133]: (style) The scope of the variable 'err' can be reduced. [tools/perf/util/thread.c:273]: (style) The scope of the variable 'err' can be reduced. Will continue later, but these are already useful, keep them. 1: https://sourceforge.net/p/cppcheck/wiki/Home/ Cc: Adrian Hunter <[email protected]> Cc: Colin Ian King <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-05perf symbols: Remove symbol_filter_t machineryArnaldo Carvalho de Melo1-10/+9
We're not using it anymore, few users were, but we really could do without it, simplify lots of functions by removing it. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-09-05perf machine: Remove machine->symbol_filter and friendsArnaldo Carvalho de Melo1-20/+1
Including machines__set_symbol_filter(), not used anymore. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-07-26perf s390: Fix 'start' address of module's mapSong Shan Gong1-0/+8
At present, when creating module's map, perf gets 'start' address by parsing '/proc/modules', but it's the module base address, it isn't the start address of the '.text' section. In most arches, it's OK. But for s390, it places 'GOT' and 'PLT' relocations before '.text' section. So there exists an offset between module base address and '.text' section, which will incur wrong symbol resolution for modules. Fix this bug by getting 'start' address of module's map from parsing '/sys/module/[module name]/sections/.text', not from '/proc/modules'. Signed-off-by: Song Shan Gong <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: David Ahern <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-22perf machine: Destructors should accept NULLArnaldo Carvalho de Melo1-2/+4
And do nothing, just like free(), to avoid having to test it in callers, usually in error paths. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-08Merge tag 'perf-core-for-mingo-20160607' of ↵Ingo Molnar1-2/+12
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Support cross unwinding, i.e. collecting '--call-graph dwarf' perf.data files in one machine and then doing analysis in another machine of a different hardware architecture. This enables, for instance, to do: perf record -a --call-graph dwarf on a x86-32 or aarch64 system and then do 'perf report' on it on a x86_64 workstation. (He Kuang) - Fix crash in build_id_cache__kallsyms_path(), recent regression (Wang Nan) Infrastructure changes: - Make tools/lib/bpf use the IS_ERR return facility consistently and also stop using the _get_ term for non-reference count methods (Arnaldo Carvalho de Melo) - 'perf config' refactorings (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-06-07perf unwind: Move unwind__prepare_access from thread_new into thread__insert_mapHe Kuang1-2/+12
To determine the libunwind methods to use, we should get the 32bit/64bit information from maps of a thread. When a thread is newly created, the information is not prepared. This patch moves unwind__prepare_access() into thread__insert_map() so we can get the information we need from maps. Meanwhile, let thread__insert_map() return value and show messages on error. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds1-20/+15
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Mostly tooling and PMU driver fixes, but also a number of late updates such as the reworking of the call-chain size limiting logic to make call-graph recording more robust, plus tooling side changes for the new 'backwards ring-buffer' extension to the perf ring-buffer" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) perf record: Read from backward ring buffer perf record: Rename variable to make code clear perf record: Prevent reading invalid data in record__mmap_read perf evlist: Add API to pause/resume perf trace: Use the ptr->name beautifier as default for "filename" args perf trace: Use the fd->name beautifier as default for "fd" args perf report: Add srcline_from/to branch sort keys perf evsel: Record fd into perf_mmap perf evsel: Add overwrite attribute and check write_backward perf tools: Set buildid dir under symfs when --symfs is provided perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced perf annotate: Sort list of recognised instructions perf annotate: Fix identification of ARM blt and bls instructions perf tools: Fix usage of max_stack sysctl perf callchain: Stop validating callchains by the max_stack sysctl perf trace: Fix exit_group() formatting perf top: Use machine->kptr_restrict_warned perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1 perf machine: Do not bail out if not managing to read ref reloc symbol perf/x86/intel/p4: Trival indentation fix, remove space ...
2016-05-20perf callchain: Stop validating callchains by the max_stack sysctlArnaldo Carvalho de Melo1-20/+5
As thread__resolve_callchain_sample can be used for handling perf.data files, that could've been recorded with a large max_stack sysctl setting than what the system used for analysis has set. Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Wang Nan <[email protected]> Cc: Zefan Li <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-20perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1Arnaldo Carvalho de Melo1-0/+1
Hook into the libtraceevent plugin kernel symbol resolver to warn the user that that can't happen with kptr_restrict=1. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-20perf machine: Do not bail out if not managing to read ref reloc symbolArnaldo Carvalho de Melo1-4/+5
This means the user can't access /proc/kallsyms, for instance, because /proc/sys/kernel/kptr_restrict is set to 1. Instead leave the ref_reloc_sym as NULL and code using it will cope. This allows 'perf trace' to work on such systems for !root, the only issue would be when trying to resolve kernel symbols, which happens, for instance, in some libtracevent plugins. A warning for that case will be provided in the next patch in this series. Noticed in Ubuntu 16.04, that comes with kptr_restrict=1. Reported-by: Milian Wolff <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-17Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (21 commits) gitignore: fix wording mfd: ab8500-debugfs: fix "between" in printk memstick: trivial fix of spelling mistake on management cpupowerutils: bench: fix "average" treewide: Fix typos in printk IB/mlx4: printk fix pinctrl: sirf/atlas7: fix printk spelling serial: mctrl_gpio: Grammar s/lines GPIOs/line GPIOs/, /sets/set/ w1: comment spelling s/minmum/minimum/ Blackfin: comment spelling s/divsor/divisor/ metag: Fix misspellings in comments. ia64: Fix misspellings in comments. hexagon: Fix misspellings in comments. tools/perf: Fix misspellings in comments. cris: Fix misspellings in comments. c6x: Fix misspellings in comments. blackfin: Fix misspelling of 'register' in comment. avr32: Fix misspelling of 'definitions' in comment. treewide: Fix typos in printk Doc: treewide : Fix typos in DocBook/filesystem.xml ...
2016-05-16perf tools: Separate accounting of contexts and real addresses in a stack traceArnaldo Carvalho de Melo1-9/+17
The perf_sample->ip_callchain->nr value includes all the entries in the ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc}, while what the user expects is that what is in the kernel.perf_event_max_stack sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be honoured in terms of IP addresses in the stack trace. So match the kernel support and validate chain->nr taking into account both kernel.perf_event_max_stack and kernel.perf_event_max_contexts_per_stack. Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Wang Nan <[email protected]> Cc: Zefan Li <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-16perf symbols: Introduce DSO__NAME_KALLSYMS and DSO__NAME_KCOREMasami Hiramatsu1-1/+1
Instead of using a raw string, use DSO__NAME_KALLSYMS and DSO__NAME_KCORE macros for kallsyms and kcore. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20160515031935.4017.50971.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-06perf callchain: Fix incorrect ordering of entriesChris Phlipot1-15/+41
The existing implementation of thread__resolve_callchain, under certain circumstances, can assemble callchain entries in the incorrect order. The callchain entries are resolved incorrectly for a sample when all of the following conditions are met: 1. callchain_param.order is set to ORDER_CALLER 2. thread__resolve_callchain_sample is able to resolve callchain entries for the sample. 3. unwind__get_entries is also able to resolve callchain entries for the sample. The fix is accomplished by reversing the order in which thread__resolve_callchain_sample and unwind__get_entries are called when callchain_param.order is set to ORDER_CALLER. Unwind specific code from thread__resolve_callchain is also moved into a new static function to improve readability of the fix. How to Reproduce the Existing Bug: Modifying perf script to print call trees in the opposite order or applying the remaining patches from this series and comparing the results output from export-to-postgtresql.py are the easiest ways to see the bug, however it can still be seen in current builds using perf report. Here is how i can reproduce the bug using perf report: # perf record --call-graph=dwarf stress -c 1 -t 5 when i run this command: # perf report --call-graph=flat,0,0,callee This callchain, containing kernel (handle_irq_event, etc) and userspace samples (__libc_start_main, etc) is contained in the output, which looks correct (callee order): gen8_irq_handler handle_irq_event_percpu handle_irq_event handle_edge_irq handle_irq do_IRQ ret_from_intr __random rand 0x558f2a04dded 0x558f2a04c774 __libc_start_main 0x558f2a04dcd9 Now run this command using caller order: # perf report --call-graph=flat,0,0,caller It is expected to see the exact reverse of the above when using caller order (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom) in the output, but it is nowhere to be found. instead you see this: ret_from_intr do_IRQ handle_irq handle_edge_irq handle_irq_event handle_irq_event_percpu gen8_irq_handler 0x558f2a04dcd9 __libc_start_main 0x558f2a04c774 0x558f2a04dded rand __random Notice how internally the kernel symbols are reversed and the user space symbols are reversed, but the kernel symbols still appear above the user space symbols. if this patch is applied and perf script is re-run, you will see the expected output (with "0x558f2a04dcd9" at the top and "gen8_irq_handler" at the bottom): 0x558f2a04dcd9 __libc_start_main 0x558f2a04c774 0x558f2a04dded rand __random ret_from_intr do_IRQ handle_irq handle_edge_irq handle_irq_event handle_irq_event_percpu gen8_irq_handler Signed-off-by: Chris Phlipot <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-05perf hists: Move sort__has_parent into struct perf_hpp_listJiri Olsa1-1/+1
Now we have sort dimensions private for struct hists, we need to make dimension booleans hists specific as well. Moving sort__has_parent into struct perf_hpp_list. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-05perf machine: Introduce number of threads memberArnaldo Carvalho de Melo1-1/+6
To be used, for instance, for pre-allocating an rb_tree array for sorting by other keys besides the current pid one. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-04-27perf tools: Set the maximum allowed stack from ↵Arnaldo Carvalho de Melo1-3/+3
/proc/sys/kernel/perf_event_max_stack There is an upper limit to what tooling considers a valid callchain, and it was tied to the hardcoded value in the kernel, PERF_MAX_STACK_DEPTH (127), now that this can be tuned via a sysctl, make it read it and use that as the upper limit, falling back to PERF_MAX_STACK_DEPTH for kernels where this sysctl isn't present. Cc: Adrian Hunter <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-04-19perf symbols: Allow loading kallsyms without considering kcore filesArnaldo Carvalho de Melo1-3/+9
Before the support for using /proc/kcore was introduced, the kallsyms routines used /proc/modules and the first 'perf test' entry expected finding maps for each module in the system, which is not the case with the kcore code. Provide a way to ignore kcore files so that the test can have its expectations met. Improving the test to cover kcore files as well needs to be done. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-04-18perf evsel: Add missign class prefix to has_branch_stack methodArnaldo Carvalho de Melo1-1/+1
Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-04-18tools/perf: Fix misspellings in comments.Adam Buchbinder1-1/+1
Signed-off-by: Adam Buchbinder <[email protected]> Signed-off-by: Jiri Kosina <[email protected]>
2016-04-14perf callchain: Start moving away from global per thread cursorsArnaldo Carvalho de Melo1-11/+15
The recent perf_evsel__fprintf_callchain() move to evsel.c added several new symbol requirements to the python binding, for instance: # perf test -v python 16: Try 'import perf' in python, checking link problems : --- start --- test child forked, pid 18030 Traceback (most recent call last): File "<stdin>", line 1, in <module> ImportError: /tmp/build/perf/python/perf.so: undefined symbol: callchain_cursor test child finished with -1 ---- end ---- Try 'import perf' in python, checking link problems: FAILED! # This would require linking against callchain.c to access to the global callchain_cursor variables. Since lots of functions already receive as a parameter a callchain_cursor struct pointer, make that be the case for some more function so that we can start phasing out usage of yet another global variable. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-03-23perf tools: Add cpumode to struct perf_sampleArnaldo Carvalho de Melo1-8/+6
To avoid parsing event->header.misc in many locations. This will also allow setting perf.sample.{ip,cpumode} in a single place, from tracepoint fields, as needed by 'perf kvm' with PPC guests, where the guest hardware counters is not available at the host. Cc: Adrian Hunter <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-12-14perf thread: Fix reference count initial stateArnaldo Carvalho de Melo1-7/+12
We should always return from thread__new(), the constructor, with the object with a reference count of one, so that: struct thread *thread = thread__new(); thread__put(thread); Will call thread__delete(). If any reference is made to that 'thread' variable, it better use thread__get(thread) to hold a reference. We were returning with thread->refcnt set to zero, fix it and some cases where thread__delete() was being called, which were not a problem because just one reference was being used, now that we set it to 1, use thread__put() instead. Reported-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-12-11perf tools: Clear struct machine during machine__init()Wang Nan1-0/+1
There are so many test cases use stack allocated 'struct machine'. Including: test__hists_link test__hists_filter test__mmap_thread_lookup test__thread_mg_share test__hists_output test__hists_cumulate Also, in non-test code (for example, machine__new_host()) there are code use 'malloc()' to alloc struct machine. These are dangerous operations, cause some tests fail or hung in machines__exit(). For example, in machines__exit -> machine__destroy_kernel_maps -> map_groups__remove -> maps__remove -> pthread_rwlock_wrlock a incorrectly initialized lock causes unintended behavior. This patch memset(0) that structure in machine__init() to ensure all fields in 'struct machine' are initialized to zero. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Use memset, see 'man bzero' ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-12-09perf machine: Fix machine.vmlinux_maps to make sure to clear the old oneMasami Hiramatsu1-0/+5
Fix machine.vmlinux_maps to make sure to clear the old one if it is renewal. This can leak the previous maps on the vmlinux_maps because those are just overwritten. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Simplified the memset, same end result ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-12-07perf machine: Pass correct string to dso__adjust_kmod_long_nameWang Nan1-1/+1
There's a mistake in dso__adjust_kmod_long_name() that it use strdup() to dup the new long_name of a dso, but passes the original string to dso__set_long_name(). Which causes random crash during cleanup. Signed-off-by: Wang Nan <[email protected]> Reviewed-by: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Fixes: c03d5184f0e9 ("perf machine: Adjust dso->long_name for offline module") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-26perf machine: Adjust dso->long_name for offline moduleWang Nan1-1/+26
Something unexpected may happen if copy statically linked perf to a production environment: # ./perf probe -m ./mymodule.ko my_func [mymodule] with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf buildid-cache -a ./mymodule.ko # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 Where: # ldd ./perf not a dynamic executable # strace -e open ./perf probe -m ./mymodule.ko my_func ... open("/home/wangnan/kmodule/mymodule.ko", O_RDONLY) = 3 open("/home/wangnan/kmodule/../lib64/elfutils/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) ... open("/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/tls/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("/usr/lib64/libebl_x86_64.so", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) open("/home/wangnan/.debug/.build-id/32/6ab42550ef3d24944f53c817533728367effeb", O_RDONLY) = -1 ENOENT (No such file or directory) open("[mymodule]", O_RDONLY) = -1 ENOENT (No such file or directory) In the above example, probe fails before we put the module into buildid-cache. However, user would expect it success in both case because perf is able to find probe points actually. The reason is because perf won't utilize module's full path if it failed to open debuginfo. In: convert_to_probe_trace_events -> find_probe_trace_events_from_map -> get_target_map -> kernel_get_module_map -> machine__findnew_module_map -> map_groups__find_by_name map_groups__find_by_name() is able to find the map of that module, but this information is found from /proc/module before it knows the real path of the offline module. Therefore, the map->dso->long_name is set to something like '[mymodule]', which prevent dso__load() find the real path of the module file. In another aspect, if dso__load() can get the offline module through buildid cache, it can read symble table from that ko. Even if debuginfo is not available, 'perf probe' can success if the '.symtab' can be found. This patch improves machine__findnew_module_map(): when dso->long_name is leading with '[' (doesn't find path of module when parsing /proc/modules), fixes it by dso__set_long_name(), so following dso__load() is possible to find the symbol table. This patch won't interfere with buildid matching. Here is the test result: # ./perf probe -m ./mymodule.ko my_func Added new event: probe:my_func (on my_func in /home/wangnan/kmodule/mymodule.ko) You can now use it in all perf tools, such as: perf record -e probe:my_func -aR sleep 1 # ./perf probe -d '*' Removed event: probe:my_func # mv ./mymodule.{ko,.bak} # mv ./moduleb.ko mymodule.ko # ./perf probe -m ./mymodule.ko my_func /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. # ./perf probe -v -m ./mymodule.ko my_func probe-definition(0): my_func symbol:my_func file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Could not open debuginfo. Try to use symbols. symsrc__init: build id mismatch for /home/wangnan/kmodule/mymodule.ko. /home/wangnan/kmodule/mymodule.ko with build id 326ab42550ef3d24944f53c817533728367effeb not found, continuing without symbols Failed to find symbol my_func in /home/wangnan/kmodule/mymodule.ko Error: Failed to add events. Reason: No such file or directory (Code: -2) Signed-off-by: Wang Nan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Renamed adjust_dso_long_name() do dso__adjust_kmod_long_name() ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-26perf callchain: Honor hide_unresolvedNamhyung Kim1-0/+5
If user requested to hide unresolved entries, skip unresolved callchains as well as hist entries. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-19perf machine: Fix machine__findnew_module_map to put dsoMasami Hiramatsu1-1/+3
Fix machine__findnew_module_map to drop the reference to the dso because it is already referenced by both machine__findnew_module_dso() and map__new2(). Refcnt debugger shows: ==== [1] ==== Unreclaimed dso: 0x1ffd980 Refcount +1 => 1 at ./perf(dso__new+0x1ff) [0x4a62df] ./perf(__dsos__addnew+0x29) [0x4a6e19] ./perf() [0x4b8b91] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) [0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] This map_groups__insert(0x4b8b91) already gets a reference to the new dso: ---- eu-addr2line -e ./perf -f 0x4b8b91 map_groups__insert inlined at util/machine.c:586 in machine__create_module util/map.h:207 ---- So this dso refcnt will be released when map_groups gets released. [snip] Refcount +1 => 2 at ./perf(dso__get+0x34) [0x4a65f4] ./perf() [0x4b8b35] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) [0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] Here, machine__findnew_module_dso(0x4b8b35) gets the dso (and stores it in a local variable): ---- # eu-addr2line -e ./perf -f 0x4b8b35 machine__findnew_module_dso inlined at util/machine.c:578 in machine__create_module util/machine.c:514 ---- Refcount +1 => 3 at ./perf(dso__get+0x34) [0x4a65f4] ./perf(map__new2+0x76) [0x4be1c6] ./perf() [0x4b8b4f] ./perf(modules__parse+0xfc) [0x4a9d5c] ./perf() [0x4b8460] ./perf(machine__create_kernel_maps+0x150) [0x4bb550] ./perf(machine__new_host+0xfa) [0x4bb75a] ./perf(init_probe_symbol_maps+0x93) [0x506623] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1345a8eaf5] ./perf() [0x4220a9] But also map__new2() gets the dso which will be put when the map is released. So, we have to drop the constructor reference obtained in machine__findnew_module_dso(). Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-19perf tools: Fix machine__create_kernel_maps to put kernel dso refcountMasami Hiramatsu1-3/+6
Fix machine__create_kernel_maps() to put kernel dso because the dso has been gotten via __machine__create_kernel_maps(). Refcnt debugger shows: ==== [0] ==== Unreclaimed dso: 0x3036ab0 Refcount +1 => 1 at ./perf(dso__new+0x1ff) [0x4a62df] ./perf(__dsos__addnew+0x29) [0x4a6e19] ./perf(dsos__findnew+0xd1) [0x4a7181] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8cf2] ./perf(machine__create_kernel_maps+0x28) [0x4bb428] ./perf(machine__new_host+0xfa) [0x4bb74a] ./perf(init_probe_symbol_maps+0x93) [0x506613] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] [snip] Refcount +1 => 2 at ./perf(dsos__findnew+0x7e) [0x4a712e] ./perf(machine__findnew_kernel+0x27) [0x4a5e17] ./perf() [0x4b8cf2] ./perf(machine__create_kernel_maps+0x28) [0x4bb428] ./perf(machine__new_host+0xfa) [0x4bb74a] ./perf(init_probe_symbol_maps+0x93) [0x506613] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] [snip] Refcount -1 => 1 at ./perf(dso__put+0x2f) [0x4a664f] ./perf(machine__delete+0xfe) [0x4b93ee] ./perf(exit_probe_symbol_maps+0x28) [0x5066b8] ./perf() [0x45628a] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7ffa6809eaf5] ./perf() [0x4220a9] Actually, dsos__findnew gets the dso before returning it, so the dso user (in this case machine__create_kernel_maps) has to put the dso after used. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-19perf machine: Fix to destroy kernel maps when machine exitsMasami Hiramatsu1-0/+1
Actually machine__exit forgot to call machine__destroy_kernel_maps. This fixes some memory leaks on map as below. Without this fix. ---- ./perf probe vfs_read Added new event: probe:vfs_read (on vfs_read) You can now use it in all perf tools, such as: perf record -e probe:vfs_read -aR sleep 1 REFCNT: BUG: Unreclaimed objects found. REFCNT: Total 4 objects are not reclaimed. To see all backtraces, rerun with -v option ---- With this fix. ---- ./perf probe vfs_read Added new event: probe:vfs_read (on vfs_read) You can now use it in all perf tools, such as: perf record -e probe:vfs_read -aR sleep 1 REFCNT: BUG: Unreclaimed objects found. REFCNT: Total 2 objects are not reclaimed. To see all backtraces, rerun with -v option ---- Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-19perf machine: Fix machine__destroy_kernel_maps to drop vmlinux_maps referencesMasami Hiramatsu1-0/+1
Fix machine__destroy_kernel_maps() to drop vmlinux_maps references before filling it with NULL. Refcnt debugger shows ==== [1] ==== Unreclaimed map: 0x36b1070 Refcount +1 => 1 at ./perf(map__new2+0xb5) [0x4bdec5] ./perf(machine__create_kernel_maps+0x72) [0x4bb152] ./perf(machine__new_host+0xfa) [0x4bb41a] ./perf(init_probe_symbol_maps+0x93) [0x5062d3] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5] ./perf() [0x4220a9] Refcount +1 => 2 at ./perf(maps__insert+0x9a) [0x4bfd6a] ./perf(machine__create_kernel_maps+0xc3) [0x4bb1a3] ./perf(machine__new_host+0xfa) [0x4bb41a] ./perf(init_probe_symbol_maps+0x93) [0x5062d3] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5] ./perf() [0x4220a9] Refcount -1 => 1 at ./perf(map_groups__exit+0x94) [0x4bea74] ./perf(machine__delete+0x3d) [0x4b91fd] ./perf(exit_probe_symbol_maps+0x28) [0x506378] ./perf() [0x45628a] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f1fc9fc4af5] ./perf() [0x4220a9] map__new2() returns map with refcnt = 1, and also map_groups__insert gets it again in__machine__create_kernel_maps(). machine__destroy_kernel_maps() calls map_groups__remove() to decrement the refcnt, but before decrement it again (corresponding to map__new2), it makes vmlinux_maps[type] = NULL. And this may cause a refcnt leak. Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-19perf machine: Fix machine__findnew_module_map to put registered mapMasami Hiramatsu1-0/+2
Fix machine object to drop the reference to the map object after it inserted it into machine->kmaps. refcnt debugger shows what happened: ---- ==== [2] ==== Unreclaimed map: 0x346f750 Refcount +1 => 1 at ./perf(map__new2+0xb5) [0x4bdea5] ./perf() [0x4b8aaf] ./perf(modules__parse+0xfc) [0x4a9cbc] ./perf() [0x4b83c0] ./perf(machine__create_kernel_maps+0x148) [0x4bb208] ./perf(machine__new_host+0xfa) [0x4bb3fa] ./perf(init_probe_symbol_maps+0x93) [0x5062b3] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5] ./perf() [0x4220a9] Refcount +1 => 2 at ./perf(maps__insert+0x9a) [0x4bfd4a] ./perf() [0x4b8acb] ./perf(modules__parse+0xfc) [0x4a9cbc] ./perf() [0x4b83c0] ./perf(machine__create_kernel_maps+0x148) [0x4bb208] ./perf(machine__new_host+0xfa) [0x4bb3fa] ./perf(init_probe_symbol_maps+0x93) [0x5062b3] ./perf() [0x455ffa] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5] ./perf() [0x4220a9] Refcount -1 => 1 at ./perf(map_groups__exit+0x94) [0x4bea54] ./perf(machine__delete+0x3d) [0x4b91ed] ./perf(exit_probe_symbol_maps+0x28) [0x506358] ./perf() [0x45628a] ./perf(cmd_probe+0x6c) [0x4566bc] ./perf() [0x47abc5] ./perf(main+0x610) [0x421f90] /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f5373899af5] ./perf() [0x4220a9] ---- This pattern clearly shows that the refcnt of the map is acquired twice by map__new2 and maps__insert but released onlu once at map_groups__exit, when we purge its maps rbtree. Since maps__insert already reference counted the map, we have to drop the constructor (map__new2) reference count right after inserting it. These happened in machine__findnew_module_map, as below. ---- # eu-addr2line -e ./perf -f 0x4b8aaf machine__findnew_module_map inlined at util/machine.c:1046 in machine__create_module util/machine.c:582 # eu-addr2line -e ./perf -f 0x4b8acb map_groups__insert inlined at util/machine.c:585 in machine__create_module util/map.h:208 ---- (note that both are at util/machine.c:58X which is machine__findnew_module_map) Signed-off-by: Masami Hiramatsu <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-11-13perf symbols: Fix dso lookup by long name and missing buildidsAdrian Hunter1-0/+1
Commit 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed with rbtree") Added a tree to lookup dsos by long name. That tree gets corrupted whenever a dso long name is changed because the tree is not updated. One effect of that is buildid-list does not work with the 'with-hits' option because dso lookup fails and results in two structs for the same dso. The first has the buildid but no hits, the second has hits but no buildid. e.g. Before: $ tools/perf/perf record ls arch certs CREDITS Documentation firmware include ipc Kconfig lib Makefile net REPORTING-BUGS scripts sound usr block COPYING crypto drivers fs init Kbuild kernel MAINTAINERS mm README samples security tools virt [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.012 MB perf.data (11 samples) ] $ tools/perf/perf buildid-list 574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms] 30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so $ tools/perf/perf buildid-list -H 574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms] 0000000000000000000000000000000000000000 /lib/x86_64-linux-gnu/libc-2.19.so After: $ tools/perf/perf buildid-list -H 574da826c66538a8d9060d393a8866289bd06005 [kernel.kallsyms] 30c94dc66a1fe95180c3d68d2b89e576d5ae213c /lib/x86_64-linux-gnu/libc-2.19.so The fix is to record the root of the tree on the dso so that dso__set_long_name() can update the tree when the long name changes. Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Don Zickus <[email protected]> Cc: Douglas Hatch <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Scott J Norton <[email protected]> Cc: Waiman Long <[email protected]> Fixes: 4598a0a6d22f ("perf symbols: Improve DSO long names lookup speed with rbtree") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-10-01perf callchain: Allow for max_stack greater than PERF_MAX_STACK_DEPTHAdrian Hunter1-1/+1
Adjust the validation to allow for max_stack greater than PERF_MAX_STACK_DEPTH. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-09-30perf machine: Add method for common kernel_map(FUNCTION) operationArnaldo Carvalho de Melo1-8/+7
And it is also a step in the direction of killing the separation of data and text maps in map_groups. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-09-30perf machine: Use machine__kernel_map() thoroughlyArnaldo Carvalho de Melo1-11/+12
In places where we were using its open coded equivalent. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-09-14perf machine: Add pointer to sample's environmentArnaldo Carvalho de Melo1-0/+1
The 'struct machine' represents the machine where the samples were/are being collected, and we also have a 'struct perf_env' with extra details about such machine, that we were collecting at 'perf.data' creation time but we also needed when no perf.data file is being used, such as in 'perf top'. So, get those structs closer together, as they provide a bigger picture of the sample's environment. In 'perf session', when the file argument is NULL, we can assume that the tool is sampling the running machine, so point machine->env to the global put in place in previous patches, while set it to the perf_header.env one when reading from a file. This paves the way for machine->env to be used in perf_event__preprocess_sample to populate addr_location.socket. Tested-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-08-20Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding ↵Ingo Molnar1-2/+18
more changes Signed-off-by: Ingo Molnar <[email protected]>
2015-08-19perf tools: Make fork event processing more resilientAdrian Hunter1-2/+18
When processing a fork event, the tools lookup the parent thread by its tid. In a couple of cases, it is possible for that thread to have the wrong pid. That can happen if the data is being processed out of order, or if the (fork) event that would have removed the erroneous thread was lost. Assume the latter case, print a dump message, remove the erroneous thread, create a new one with the correct pid, and keep going. Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-07-23perf tools: Add new PERF_RECORD_SWITCH eventAdrian Hunter1-0/+11
Support processing of PERF_RECORD_SWITCH events and PERF_RECORD_SWITCH_CPU_WIDE events. There is a single tools callback for them both so that the tool must check the event type before using the extra members in PERF_RECORD_SWITCH_CPU_WIDE. There is still no way to select the events, though. That is added in a subsequest patch. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Pawel Moll <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-07-23perf symbols: Provide libtraceevent callback to resolve kernel symbolsArnaldo Carvalho de Melo1-0/+14
That provides the function signature expected by libtraceevent's pevent_set_function_resolver(). Acked-by: David Ahern <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-07-20perf strlist: Make dupstr be the default and part of an extensible config parmArnaldo Carvalho de Melo1-1/+1
So that we can pass more info to strlist__new() without having to change its function signature, just adding entries to the strlist_config struct with sensible defaults for when those fields are not specified. Cc: Adrian Hunter <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-07-01perf tools: Add missing break for PERF_RECORD_ITRACE_STARTJiri Olsa1-2/+1
Missing switch break since introduction of new event: c4937a91ea56 perf tools: handle PERF_RECORD_LOST_SAMPLES Also removing unneeded break for PERF_RECORD_LOST_SAMPLES. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-06-19perf tools: Configurable per thread proc map processing time outKan Liang1-3/+4
The time out to limit the individual proc map processing was hard code to 500ms. This patch introduce a new option --proc-map-timeout to make the time limit configurable. Signed-off-by: Kan Liang <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ying Huang <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-06-19perf tools: Ensure thread-stack is flushedAdrian Hunter1-0/+21
The thread-stack represents a thread's current stack. When a thread exits there can still be many functions on the stack e.g. exit() can be called many levels deep, so all the callers will never return. To get that information output, the thread-stack must be flushed. Previously it was assumed the thread-stack would be flushed when the struct thread was deleted. With thread ref-counting it is no longer clear when that will be, if ever. So instead explicitly flush all the thread-stacks at the end of a session. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-06-08perf tools: Reference count struct dsoArnaldo Carvalho de Melo1-4/+11
This has a different model than the 'thread' and 'map' struct lifetimes: there is not a definitive "don't use this DSO anymore" event, i.e. we may get many 'struct map' holding references to the '/usr/lib64/libc-2.20.so' DSO but then at some point some DSO may have no references but we still don't want to straight away release its resources, because "soon" we may get a new 'struct map' that needs it and we want to reuse its symtab or other resources. So we need some way to garbage collect it when crossing some memory usage threshold, which is left for anoter patch, for now it is sufficient to release it when calling dsos__exit(), i.e. when deleting the whole list as part of deleting the 'struct machine' containing it, which will leave only referenced objects being used. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2015-06-08perf tools: Protect accesses the dso rbtrees/lists with a rw lockArnaldo Carvalho de Melo1-6/+21
To allow concurrent access, next step: refcount struct dso instances, so that we can ditch unused them when the last map pointing to it goes away. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>