aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util/machine.h
AgeCommit message (Collapse)AuthorFilesLines
2024-09-10perf callchain: Allow symbols to be optional when resolving a callchainIan Rogers1-7/+26
In uses like 'perf inject' it is not necessary to gather the symbol for each call chain location, the map for the sample IP is wanted so that build IDs and the like can be injected. Make gathering the symbol in the callchain_cursor optional. For a 'perf inject -B' command this lowers the peak RSS from 54.1MB to 29.6MB by avoiding loading symbols. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anne Macedo <[email protected]> Cc: Casey Chen <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sun Haiyong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-08-19perf dso: Constify dso_idIan Rogers1-1/+2
The passed dso_id is copied and so is never an out argument. Remove its mutability. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anne Macedo <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Casey Chen <[email protected]> Cc: Chaitanya S Prakash <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Dominique Martinet <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jann Horn <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yunseong Kim <[email protected]> Cc: Ze Gao <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-04-12perf dsos: Attempt to better abstract DSOs internalsIan Rogers1-0/+2
Move functions from machine and build-id to dsos. Pass 'struct dsos' rather than internal state. Rename some functions to better represent which data structure they operate on. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Anne Macedo <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Changbin Du <[email protected]> Cc: Chengen Du <[email protected]> Cc: Colin Ian King <[email protected]> Cc: Ilkka Koskinen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Markus Elfring <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Song Liu <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Cc: zhaimingbing <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-03-21perf lock contention: Trim backtrace by skipping traceiter functionsAnne Macedo1-1/+1
The 'perf lock contention' program currently shows the caller of the locks as __traceiter_contention_begin+0x??. This caller can be ignored, as it is from the traceiter itself. Instead, it should show the real callers for the locks. When fiddling with the --stack-skip parameter, the actual callers for the locks start to show up. However, just ignore the __traceiter_contention_begin and the __traceiter_contention_end symbols so the actual callers will show up. Before this patch is applied: sudo perf lock con -a -b -- sleep 3 contended total wait max wait avg wait type caller 8 2.33 s 2.28 s 291.18 ms rwlock:W __traceiter_contention_begin+0x44 4 2.33 s 2.28 s 582.35 ms rwlock:W __traceiter_contention_begin+0x44 7 140.30 ms 46.77 ms 20.04 ms rwlock:W __traceiter_contention_begin+0x44 2 63.35 ms 33.76 ms 31.68 ms mutex trace_contention_begin+0x84 2 46.74 ms 46.73 ms 23.37 ms rwlock:W __traceiter_contention_begin+0x44 1 13.54 us 13.54 us 13.54 us mutex trace_contention_begin+0x84 1 3.67 us 3.67 us 3.67 us rwsem:R __traceiter_contention_begin+0x44 Before this patch is applied - using --stack-skip 5 sudo perf lock con --stack-skip 5 -a -b -- sleep 3 contended total wait max wait avg wait type caller 2 2.24 s 2.24 s 1.12 s rwlock:W do_epoll_wait+0x5a0 4 1.65 s 824.21 ms 412.08 ms rwlock:W do_exit+0x338 2 824.35 ms 824.29 ms 412.17 ms spinlock get_signal+0x108 2 824.14 ms 824.14 ms 412.07 ms rwlock:W release_task+0x68 1 25.22 ms 25.22 ms 25.22 ms mutex cgroup_kn_lock_live+0x58 1 24.71 us 24.71 us 24.71 us spinlock do_exit+0x44 1 22.04 us 22.04 us 22.04 us rwsem:R lock_mm_and_find_vma+0xb0 After this patch is applied: sudo ./perf lock con -a -b -- sleep 3 contended total wait max wait avg wait type caller 4 4.13 s 2.07 s 1.03 s rwlock:W release_task+0x68 2 2.07 s 2.07 s 1.03 s rwlock:R mm_update_next_owner+0x50 2 2.07 s 2.07 s 1.03 s rwlock:W do_exit+0x338 1 41.56 ms 41.56 ms 41.56 ms mutex cgroup_kn_lock_live+0x58 2 36.12 us 18.83 us 18.06 us rwlock:W do_exit+0x338 Signed-off-by: Anne Macedo <[email protected]> Acked-by: Namhyung Kim <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-03-03perf threads: Move threads to its own filesIan Rogers1-25/+1
Move threads out of machine and into its own file. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Oliver Upton <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-03perf machine: Move machine's threads into its own abstractionIan Rogers1-9/+17
Move thread_rb_node into the machine.c file. This hides the implementation of threads from the rest of the code allowing for it to be refactored. Locking discipline is tightened up in this change. As the lock is now encapsulated in threads, the findnew function requires holding it (as it already did in machine). Rather than do conditionals with locks based on whether the thread should be created (which could potentially be error prone with a read lock match with a write unlock), have a separate threads__find that won't create the thread and only holds the read lock. This effectively duplicates the findnew logic, with the existing findnew logic only operating under a write lock assuming creation is necessary as a previous find failed. The creation may still fail with the write lock due to another thread. The duplication is removed in a later next patch that delegates the implementation to hashtable. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Oliver Upton <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-03-03perf report: Sort child tasks by tidIan Rogers1-0/+10
Commit 91e467bc568f ("perf machine: Use hashtable for machine threads") made the iteration of thread tids unordered. The perf report --tasks output now shows child threads in an order determined by the hashing. For example, in this snippet tid 3 appears after tid 256 even though they have the same ppid 2: ``` $ perf report --tasks % pid tid ppid comm 0 0 -1 |swapper 2 2 0 | kthreadd 256 256 2 | kworker/12:1H-k 693761 693761 2 | kworker/10:1-mm 1301762 1301762 2 | kworker/1:1-mm_ 1302530 1302530 2 | kworker/u32:0-k 3 3 2 | rcu_gp ... ``` The output is easier to read if threads appear numerically increasing. To allow for this, read all threads into a list then sort with a comparator that orders by the child task's of the first common parent. The list creation and deletion are created as utilities on machine. The indentation is possible by counting the number of parents a child has. With this change the output for the same data file is now like: ``` $ perf report --tasks % pid tid ppid comm 0 0 -1 |swapper 1 1 0 | systemd 823 823 1 | systemd-journal 853 853 1 | systemd-udevd 3230 3230 1 | systemd-timesyn 3236 3236 1 | auditd 3239 3239 3236 | audisp-syslog 3321 3321 1 | accounts-daemon ... ``` Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Oliver Upton <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-10-25perf threads: Remove unused dead thread listIan Rogers1-1/+0
Commit 40826c45eb0b ("perf thread: Remove notion of dead threads") removed dead threads but the list head wasn't removed. Remove it here. Signed-off-by: Ian Rogers <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Changbin Du <[email protected]> Cc: liuwenyu <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Song Liu <[email protected]> Cc: Leo Yan <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Paolo Bonzini <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2022-12-14machine: Adopt is_lock_function() from builtin-lock.cArnaldo Carvalho de Melo1-0/+5
It is used in bpf_lock_contention.c and builtin-lock.c will be made CONFIG_LIBTRACEEVENT=y conditional, so move it to machine.c, that is always available. This makes those 4 global variables for sched and lock text start and end to move to 'struct machine' too, as conceivably we can have that info for several machine instances, say some 'perf diff' like tool. Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-31perf machine: Move machine__resolve() from event.hArnaldo Carvalho de Melo1-0/+3
Its a machine method, so move it to machine.h, this way some places that were using event.h just to get this prototype may stop doing so and speed up building and disentanble the header dependency graph. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-07-20perf machine: Use realloc_array_as_needed() in machine__set_current_tid()Adrian Hunter1-0/+1
Prepare machine__set_current_tid() for use with guest machines that do not currently have a machine->env->nr_cpus_avail value by making use of realloc_array_as_needed(). Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-07-20perf tools: Automatically use guest kcore_dir if presentAdrian Hunter1-0/+1
When registering a guest machine using machine_pid from the id index, check perf.data for a matching kcore_dir subdirectory and set the kallsyms file name accordingly. If set, use it to find the machine's kernel symbols and object code (from kcore). Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-07-18perf buildid-list: Add a "-m" option to show kernel and modules build-idsBlake Jones1-0/+5
This new option displays all of the information needed to do external BuildID-based symbolization of kernel stack traces, such as those collected by bpf_get_stackid(). For each kernel module plus the main kernel, it displays the BuildID, the start and end virtual addresses of that module's text range (rounded out to page boundaries), and the pathname of the module. When run as a non-privileged user, the actual addresses of the modules' text ranges are not available, so the tools displays "0, <text length>" for kernel modules and "0, 0xffffffffffffffff" for the kernel itself. Sample output: root# perf buildid-list -m cf6df852fd4da122d616153353cc8f560fd12fe0 ffffffffa5400000 ffffffffa6001e27 [kernel.kallsyms] 1aa7209aa2acb067d66ed6cf7676d65066384d61 ffffffffc0087000 ffffffffc008b000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/sha512_generic.ko 3857815b5bf0183697b68f8fe0ea06121644041e ffffffffc008c000 ffffffffc0098000 /lib/modules/5.15.15-1rodete2-amd64/kernel/arch/x86/crypto/sha512-ssse3.ko 4081fde0bca2bc097cb3e9d1efcb836047d485f1 ffffffffc0099000 ffffffffc009f000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/acpi/button.ko 1ef81ba4890552ea6b0314f9635fc43fc8cef568 ffffffffc00a4000 ffffffffc00aa000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/cryptd.ko cc5c985506cb240d7d082b55ed260cbb851f983e ffffffffc00af000 ffffffffc00b6000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/i2c/busses/i2c-piix4.ko [...] Committer notes: u64 formatter should be PRIx64 for printing as hex numbers, fix this: 28 5.28 debian:experimental-x-mips : FAIL gcc version 11.2.0 (Debian 11.2.0-18) builtin-buildid-list.c: In function 'buildid__map_cb': builtin-buildid-list.c:32:24: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=] 32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end); | ~~~~^ ~~~~~~~~~~ | | | | long unsigned int u64 {aka long long unsigned int} | %16llx builtin-buildid-list.c:32:30: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=] 32 | printf("%s %16lx %16lx", bid_buf, map->start, map->end); | ~~~~^ ~~~~~~~~ | | | | long unsigned int u64 {aka long long unsigned int} | %16llx cc1: all warnings being treated as errors Signed-off-by: Blake Jones <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-05-23perf tools: Add guest_code supportAdrian Hunter1-0/+2
A common case for KVM test programs is that the test program acts as the hypervisor, creating, running and destroying the virtual machine, and providing the guest object code from its own object code. In this case, the VM is not running an OS, but only the functions loaded into it by the hypervisor test program, and conveniently, loaded at the same virtual addresses. Normally to resolve addresses, MMAP events are needed to map addresses back to the object code and debug symbols for that object code. Currently, there is no way to get such mapping information from guests but, in the scenario described above, the guest has the same mappings as the hypervisor, so support for that scenario can be achieved. To support that, copy the host thread's maps to the guest thread's maps. Note, we do not discover the guest until we encounter a guest event, which works well because it is not until then that we know that the host thread's maps have been set up. Typically the main function for the guest object code is called "guest_code", hence the name chosen for this feature. Note, that is just a convention, the function could be named anything, and the tools do not care. This is primarily aimed at supporting Intel PT, or similar, where trace data can be recorded for a guest. Refer to the final patch in this series "perf intel-pt: Add guest_code support" for an example. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-05-23perf tools: Add machine to machines back pointerAdrian Hunter1-0/+2
When dealing with guest machines, it can be necessary to get a reference to the host machine. Add a machines pointer to struct machine to make that possible. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-05-13perf tools: Remove unused machines__find_host()Adrian Hunter1-1/+0
machines__find_host() does not exist. Remove declaration. Signed-off-by: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-14perf maps: Use a pointer for kmapsIan Rogers1-4/+4
struct maps is reference counted, using a pointer is more idiomatic. Committer notes: Delay: maps = machine__kernel_maps(&vmlinux); To after: machine__init(&vmlinux, "", HOST_KERNEL_ID); To avoid this on f34: In file included from /var/home/acme/git/perf/tools/perf/util/build-id.h:10, from /var/home/acme/git/perf/tools/perf/util/dso.h:13, from tests/vmlinux-kallsyms.c:8: In function ‘machine__kernel_maps’, inlined from ‘test__vmlinux_matches_kallsyms’ at tests/vmlinux-kallsyms.c:122:22: /var/home/acme/git/perf/tools/perf/util/machine.h:86:23: error: ‘vmlinux.kmaps’ is used uninitialized [-Werror=uninitialized] 86 | return machine->kmaps; | ~~~~~~~^~~~~~~ tests/vmlinux-kallsyms.c: In function ‘test__vmlinux_matches_kallsyms’: tests/vmlinux-kallsyms.c:121:34: note: ‘vmlinux’ declared here 121 | struct machine kallsyms, vmlinux; | ^~~~~~~ cc1: all warnings being treated as errors Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: André Almeida <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: James Clark <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-21perf arm64: Inject missing frames when using 'perf record --call-graph=fp'Alexandre Truong1-0/+1
When unwinding using frame pointers on ARM64, the return address of the current function may not have been pushed into the stack when a function was interrupted, which makes perf show an incorrect call graph to the user. Consider the following example program: void leaf() { /* long computation */ } void parent() { // (1) leaf(); // (2) } ... could be compiled into (using gcc -fno-inline -fno-omit-frame-pointer): leaf: /* long computation */ nop ret parent: // (1) stp x29, x30, [sp, -16]! mov x29, sp bl parent nop ldp x29, x30, [sp], 16 // (2) ret If the program is interrupted at (1), (2), or any point in "leaf:", the call graph will skip the callers of the current function. We can unwind using the dwarf info and check if the return addr is the same as the LR register, and inject the missing frame into the call graph. Before this patch, the above example shows the following call-graph when recording using "--call-graph fp" mode in ARM64: # Children Self Command Shared Object Symbol # ........ ........ ........ ................ ...................... # 99.86% 99.86% program3 program3 [.] leaf | ---_start __libc_start_main main leaf As can be seen, the "parent" function is missing. This is specially problematic in "leaf" because for leaf functions the compiler may always omit pushing the return addr into the stack. After this patch, it shows the correct graph: # Children Self Command Shared Object Symbol # ........ ........ ........ ................ ...................... # 99.86% 99.86% program3 program3 [.] leaf | ---_start __libc_start_main main parent leaf Reviewed-by: James Clark <[email protected]> Signed-off-by: Alexandre Truong <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: German Gomez <[email protected]> [ Rename machine__normalize_is() to machine__normalized_is(), as suggested by James Clark ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-20perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_IDAdrian Hunter1-0/+2
The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output data like Intel PT PEBS-via-PT back to the event that it came from, by providing a hardware ID that is present in the AUX output. Reviewed-by: Alexander Shishkin <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-18perf machine: Factor out machine__idle_thread()Adrian Hunter1-0/+1
Factor out machine__idle_thread() so it can be re-used for guest machines. A thread is needed to find executable code, even for the guest kernel. To avoid possible future pid number conflicts, the idle thread can be used. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Andi Kleen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-18perf machine: Factor out machines__find_guest()Adrian Hunter1-0/+1
Factor out machines__find_guest() so it can be re-used. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Andi Kleen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-09-17perf machine: Add machine__for_each_dso() functionJiri Olsa1-0/+4
Add the machine__for_each_dso() to iterate over all dso objects defined for the within a machine object. It will be used in the MMAP3 patch series. Signed-off-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-07-10perf tools: Add support for PERF_RECORD_TEXT_POKEAdrian Hunter1-0/+3
Add processing for PERF_RECORD_TEXT_POKE events. When a text poke event is processed, then the kernel dso data cache is updated with the poked bytes. Signed-off-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-04-03perf tools: Basic support for CGROUP eventNamhyung Kim1-0/+3
Implement basic functionality to support cgroup tracking. Each cgroup can be identified by inode number which can be read from userspace too. The actual cgroup processing will come in the later patch. Reported-by: kernel test robot <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> [ fix perf test failure on sampling parsing ] Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-26perf maps: Rename map_groups.h to maps.hArnaldo Carvalho de Melo1-1/+1
One more step in the merge of 'struct maps' with 'struct map_groups'. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-26perf maps: Merge 'struct maps' with 'struct map_groups'Arnaldo Carvalho de Melo1-4/+4
And pick the shortest name: 'struct maps'. The split existed because we used to have two groups of maps, one for functions and one for variables, but that only complicated things, sometimes we needed to figure out what was at some address and then had to first try it on the functions group and if that failed, fall back to the variables one. That split is long gone, so for quite a while we had only one struct maps per struct map_groups, simplify things by combining those structs. First patch is the minimum needed to merge both, follow up patches will rename 'thread->mg' to 'thread->maps', etc. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-19perf dso: Move dso_id from 'struct map' to 'struct dso'Arnaldo Carvalho de Melo1-0/+2
And take it into account when looking up DSOs when we have the dso_id fields obtained from somewhere, like from PERF_RECORD_MMAP2 records. Instances of struct map pointing to the same DSO pathname but with anything in dso_id different are in fact different DSOs, so better have different 'struct dso' instances to reflect that. At some point we may want to get copies of the contents of the different objects if we want to do correct annotation or other analysis. With this we get 'struct map' 24 bytes leaner: $ pahole -C map ~/bin/perf struct map { union { struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 */ struct list_head node; /* 0 16 */ } __attribute__((__aligned__(8))); /* 0 24 */ u64 start; /* 24 8 */ u64 end; /* 32 8 */ _Bool erange_warned:1; /* 40: 0 1 */ _Bool priv:1; /* 40: 1 1 */ /* XXX 6 bits hole, try to pack */ /* XXX 3 bytes hole, try to pack */ u32 prot; /* 44 4 */ u64 pgoff; /* 48 8 */ u64 reloc; /* 56 8 */ /* --- cacheline 1 boundary (64 bytes) --- */ u64 (*map_ip)(struct map *, u64); /* 64 8 */ u64 (*unmap_ip)(struct map *, u64); /* 72 8 */ struct dso * dso; /* 80 8 */ refcount_t refcnt; /* 88 4 */ u32 flags; /* 92 4 */ /* size: 96, cachelines: 2, members: 13 */ /* sum members: 92, holes: 1, sum holes: 3 */ /* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */ /* forced alignments: 1 */ /* last cacheline: 32 bytes */ } __attribute__((__aligned__(8))); $ Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-18perf machine: No need to check if kernel module maps pre-existArnaldo Carvalho de Melo1-2/+0
We'only populating maps for kernel modules either from perf.data file PERF_RECORD_MMAP records or when parsing /proc/modules, so there is no need to first look if we already have those module maps in the list, that would mean the kernel has duplicate entries. So ditch one use of looking up maps by name. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-09-20perf tools: Move event synthesizing routines to separate headerArnaldo Carvalho de Melo1-15/+0
Those are the only routines using the perf_event__handler_t typedef and are all related, so move to a separate header to reduce the header dependency tree, lots of places were getting event.h and even stdio.h, limits.h indirectly, so fix those as well. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf dsos: Move the dsos struct and its methods to separate source filesArnaldo Carvalho de Melo1-1/+2
So that we can reduce the header dependency tree further, in the process noticed that lots of places were getting even things like build-id routines and 'struct perf_tool' definition indirectly, so fix all those too. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-26perf record: Move record_opts and other record decls out of perf.hArnaldo Carvalho de Melo1-0/+1
And into a separate util/record.h, to better isolate things and make sure that those who use record_opts and the other moved declarations are explicitly including the necessary header. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-12Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo1-1/+1
To get closer to upstream and check if we need to sync more UAPI headers, pick up fixes for libbpf that prevent perf's container tests from completing successfuly, etc. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-08perf record: Fix module size on s390Thomas Richter1-1/+1
On s390 the modules loaded in memory have the text segment located after the GOT and Relocation table. This can be seen with this output: [root@m35lp76 perf]# fgrep qeth /proc/modules qeth 151552 1 qeth_l2, Live 0x000003ff800b2000 ... [root@m35lp76 perf]# cat /sys/module/qeth/sections/.text 0x000003ff800b3990 [root@m35lp76 perf]# There is an offset of 0x1990 bytes. The size of the qeth module is 151552 bytes (0x25000 in hex). The location of the GOT/relocation table at the beginning of a module is unique to s390. commit 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map") adjusts the start address of a module in the map structures, but does not adjust the size of the modules. This leads to overlapping of module maps as this example shows: [root@m35lp76 perf] # ./perf report -D 0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x25000) @ 0]: x /lib/modules/.../qeth.ko.xz 0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x8000) @ 0]: x /lib/modules/.../ip6_tables.ko.xz The module qeth.ko has an adjusted start address modified to b3990, but its size is unchanged and the module ends at 0x3ff800d8990. This end address overlaps with the next modules start address of 0x3ff800d85a0. When the size of the leading GOT/Relocation table stored in the beginning of the text segment (0x1990 bytes) is subtracted from module qeth end address, there are no overlaps anymore: 0x3ff800d8990 - 0x1990 = 0x0x3ff800d7000 which is the same as 0x3ff800b2000 + 0x25000 = 0x0x3ff800d7000. To fix this issue, also adjust the modules size in function arch__fix_module_text_start(). Add another function parameter named size and reduce the size of the module when the text segment start address is changed. Output after: 0 0 0xfb0 [0xa0]: PERF_RECORD_MMAP -1/0: [0x3ff800b3990(0x23670) @ 0]: x /lib/modules/.../qeth.ko.xz 0 0 0x1050 [0xb0]: PERF_RECORD_MMAP -1/0: [0x3ff800d85a0(0x7a60) @ 0]: x /lib/modules/.../ip6_tables.ko.xz Reported-by: Stefan Liebler <[email protected]> Signed-off-by: Thomas Richter <[email protected]> Acked-by: Heiko Carstens <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: [email protected] Fixes: 203d8a4aa6ed ("perf s390: Fix 'start' address of module's map") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-29perf evsel: Rename struct perf_evsel to struct evselJiri Olsa1-2/+2
Rename struct perf_evsel to struct evsel, so we don't have a name clash when we add struct perf_evsel in libperf. Committer notes: Added fixes for arm64, provided by Jiri. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-07-29perf tools: Rename struct thread_map to struct perf_thread_mapJiri Olsa1-2/+2
Rename struct thread_map to struct perf_thread_map, so it could be part of libperf. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Budankov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-06perf map: Move structs and prototypes for map groups to a separate headerArnaldo Carvalho de Melo1-1/+1
And since machine.h only needs what is in there, make it stop including map.h and instead include this newly introduced map_groups.h instead. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-25perf machine: Use cached rbtreesDavidlohr Bueso1-6/+6
At the cost of an extra pointer, we can avoid the O(logN) cost of finding the first element in the tree (smallest node), which is something required for nearly every operation dealing with machine->guests and threads->entries. The conversion is straightforward, however, it's worth noticing that the rb_erase_init() calls have been replaced by rb_erase_cached() which has no _init() flavor, however, the node is explicitly cleared next anyway, which was redundant until now. Signed-off-by: Davidlohr Bueso <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-01-21perf tools: Handle PERF_RECORD_KSYMBOLSong Liu1-0/+3
This patch handles PERF_RECORD_KSYMBOL in perf record/report. Specifically, map and symbol are created for ksymbol register, and removed for ksymbol unregister. This patch also sets perf_event_attr.ksymbol properly. The flag is ON by default. Committer notes: Use proper inttypes.h for u64, fixing the build in some environments like in the android NDK r15c targetting ARM 32-bit. I.e. fixing this build error: util/event.c: In function 'perf_event__fprintf_ksymbol': util/event.c:1489:10: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' [-Werror=format=] event->ksymbol_event.flags, event->ksymbol_event.name); ^ cc1: all warnings being treated as errors Signed-off-by: Song Liu <[email protected]> Reviewed-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools: Allow specifying proc-map-timeout in config fileMark Drayton1-3/+0
The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf thread: Add fallback functions for cases where cpumode is insufficientAdrian Hunter1-0/+2
For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf machine: Record if a arch has a single user/kernel address spaceAdrian Hunter1-0/+1
Some architectures have a single address space for kernel and user addresses, which makes it possible to determine if an address is in kernel space or user space. Some don't, e.g.: sparc. Cache that info in perf_env so that, for instance, code needing to fallback failed symbol lookups at the kernel space in single address space arches can lookup at userspace. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-08-13tools lib traceevent, perf tools: Rename pevent_set_* APIsTzvetomir Stoyanov (VMware)1-1/+1
In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. That prefix will be "tep_" and not "pevent_". This changes APIs: pevent_set_file_bigendian, pevent_set_flag, pevent_set_function_resolver, pevent_set_host_bigendian, pevent_set_long_size, pevent_set_page_size and pevent_get_long_size Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Yordan Karadzhov (VMware) <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-05-23perf machine: Create maps for x86 PTI entry trampolinesAdrian Hunter1-0/+19
Create maps for x86 PTI entry trampolines, based on symbols found in kallsyms. It is also necessary to keep track of whether the trampolines have been mapped particularly when the kernel dso is kcore. Signed-off-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Fix extra_kernel_map_info.cnt designed struct initializer on gcc 4.4.7 (centos:6, etc) ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-05-22perf machine: Workaround missing maps for x86 PTI entry trampolinesAdrian Hunter1-0/+3
On x86_64 the PTI entry trampolines are not in the kernel map created by perf tools. That results in the addresses having no symbols and prevents annotation. It also causes Intel PT to have decoding errors at the trampoline addresses. Workaround that by creating maps for the trampolines. At present the kernel does not export information revealing where the trampolines are. Until that happens, the addresses are hardcoded. Signed-off-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-05-22perf machine: Add nr_cpus_avail()Adrian Hunter1-0/+1
Add a function to return the number of the machine's available CPUs. Signed-off-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-05-19perf machine: Add machine__is() to identify machine archAdrian Hunter1-0/+2
Add a function to identify the machine architecture. Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-30perf machine: Ditch find_kernel_function variantsArnaldo Carvalho de Melo1-15/+0
Since we do not have split symtabs anymore, no need to have explicit find_kernel_function variants, use the find_kernel_symbol ones. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-27perf symbols: Unify symbol mapsArnaldo Carvalho de Melo1-23/+9
Remove the split of symbol tables for data (MAP__VARIABLE) and for functions (MAP__FUNCTION), its unneeded and there were various places doing two lookups to find a symbol, so simplify this. We still will consider only the symbols that matched the filters in place, i.e. see the (elf_(sec,sym)|symbol_type)__filter() routines in the patch, just so that we consider only the same symbols as before, to reduce the possibility of regressions. All the tests on 50-something build environments, in varios versions of lots of distros and cross build environments were performed without build regressions, as usual with all pull requests the other tests were also performed: 'perf test' and 'make -C tools/perf build-test'. Also this was done at a great granularity so that regressions can be bisected more easily. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-26perf machine: Remove needless map_type from machine__load_vmlinux_path()Arnaldo Carvalho de Melo1-1/+1
Since it uses machine__kernel_map() and this function always returns the MAP__FUNCTION map, it doesn't make sense to call it with MAP__VARIABLE. And also this is a step in the direction of nuking the MAP__{FUNCTION,VARIABLE} split. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-04-26perf machine: Shorten machine__load_kallsyms() signatureArnaldo Carvalho de Melo1-2/+8
So far the only use is for MAP__FUNCTION, and since we're going to remove that split, remove the map_type argument in machine__load_kallsyms(). Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>