Age | Commit message (Collapse) | Author | Files | Lines |
|
The PMU capabilities information, which is located at
/sys/bus/event_source/devices/<dev>/caps, is required by perf tool. For
example, the max LBR information is required to stitch LBR call stack.
Add perf_pmu__caps_parse() to parse the PMU capabilities information.
The information is stored in a list.
The following patch will store the capabilities information in perf
header.
Committer notes:
Here's an example of such directories and its files in an i5 7th gen
machine:
[root@seventh ~]# ls -lad /sys/bus/event_source/devices/*/caps
drwxr-xr-x. 2 root root 0 Apr 14 13:33 /sys/bus/event_source/devices/cpu/caps
drwxr-xr-x. 2 root root 0 Apr 14 13:33 /sys/bus/event_source/devices/intel_pt/caps
[root@seventh ~]# ls -la /sys/bus/event_source/devices/intel_pt/caps
total 0
drwxr-xr-x. 2 root root 0 Apr 14 13:33 .
drwxr-xr-x. 5 root root 0 Apr 14 13:12 ..
-r--r--r--. 1 root root 4096 Apr 16 13:10 cr3_filtering
-r--r--r--. 1 root root 4096 Apr 16 11:42 cycle_thresholds
-r--r--r--. 1 root root 4096 Apr 16 13:10 ip_filtering
-r--r--r--. 1 root root 4096 Apr 16 13:10 max_subleaf
-r--r--r--. 1 root root 4096 Apr 14 13:33 mtc
-r--r--r--. 1 root root 4096 Apr 14 13:33 mtc_periods
-r--r--r--. 1 root root 4096 Apr 16 13:10 num_address_ranges
-r--r--r--. 1 root root 4096 Apr 16 13:10 output_subsys
-r--r--r--. 1 root root 4096 Apr 16 13:10 payloads_lip
-r--r--r--. 1 root root 4096 Apr 16 13:10 power_event_trace
-r--r--r--. 1 root root 4096 Apr 14 13:33 psb_cyc
-r--r--r--. 1 root root 4096 Apr 14 13:33 psb_periods
-r--r--r--. 1 root root 4096 Apr 16 13:10 ptwrite
-r--r--r--. 1 root root 4096 Apr 16 13:10 single_range_output
-r--r--r--. 1 root root 4096 Apr 16 12:03 topa_multiple_entries
-r--r--r--. 1 root root 4096 Apr 16 13:10 topa_output
[root@seventh ~]# cat /sys/bus/event_source/devices/intel_pt/caps/topa_output
1
[root@seventh ~]# cat /sys/bus/event_source/devices/intel_pt/caps/topa_multiple_entries
1
[root@seventh ~]# cat /sys/bus/event_source/devices/intel_pt/caps/mtc
1
[root@seventh ~]# cat /sys/bus/event_source/devices/intel_pt/caps/power_event_trace
0
[root@seventh ~]#
[root@seventh ~]# ls -la /sys/bus/event_source/devices/cpu/caps/
total 0
drwxr-xr-x. 2 root root 0 Apr 14 13:33 .
drwxr-xr-x. 6 root root 0 Apr 14 13:12 ..
-r--r--r--. 1 root root 4096 Apr 16 13:10 branches
-r--r--r--. 1 root root 4096 Apr 14 13:33 max_precise
-r--r--r--. 1 root root 4096 Apr 16 13:10 pmu_name
[root@seventh ~]# cat /sys/bus/event_source/devices/cpu/caps/max_precise
3
[root@seventh ~]# cat /sys/bus/event_source/devices/cpu/caps/branches
32
[root@seventh ~]# cat /sys/bus/event_source/devices/cpu/caps/pmu_name
skylake
[root@seventh ~]#
Wow, first time I've heard about
/sys/bus/event_source/devices/cpu/caps/max_precise, I think I'll use it!
:-)
Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Pavel Gerasimov <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Vitaly Slobodskoy <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When it is not possible for a non-privilege perf command to monitor at
the kernel level (:k), the fallback code forces a :u. That works if the
event was previously monitoring both levels. But if the event was
already constrained to kernel only, then it does not make sense to
restrict it to user only.
Given the code works by exclusion, a kernel only event would have:
attr->exclude_user = 1
The fallback code would add:
attr->exclude_kernel = 1
In the end the end would not monitor in either the user level or kernel
level. In other words, it would count nothing.
An event programmed to monitor kernel only cannot be switched to user
only without seriously warning the user.
This patch forces an error in this case to make it clear the request
cannot really be satisfied.
Behavior with paranoid 1:
$ sudo bash -c "echo 1 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Performance counter stats for 'sleep 1':
1,520,413 cycles:k
1.002361664 seconds time elapsed
0.002480000 seconds user
0.000000000 seconds sys
Old behavior with paranoid 2:
$ sudo bash -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Performance counter stats for 'sleep 1':
0 cycles:ku
1.002358127 seconds time elapsed
0.002384000 seconds user
0.000000000 seconds sys
New behavior with paranoid 2:
$ sudo bash -c "echo 2 > /proc/sys/kernel/perf_event_paranoid"
$ perf stat -e cycles:k sleep 1
Error:
You may not have permission to collect stats.
Consider tweaking /proc/sys/kernel/perf_event_paranoid,
which controls use of the performance events system by
unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).
The current value is 2:
-1: Allow use of (almost) all events by all users
Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN
Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN
>= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN
>= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN
To make this setting permanent, edit /etc/sysctl.conf too, e.g.:
kernel.perf_event_paranoid = -1
v2 of this patch addresses the review feedback from [email protected].
Signed-off-by: Stephane Eranian <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When AUX area events are used in sampling mode, they must be the group
leader, but the group leader is also used for leader-sampling. However,
it is not desirable to use an AUX area event as the leader for
leader-sampling, because it doesn't have any samples of its own. To support
leader-sampling with AUX area events, use the 2nd event of the group as the
"leader" for the purposes of leader-sampling.
Example:
# perf record --kcore --aux-sample -e '{intel_pt//,cycles,instructions}:S' -c 10000 uname
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.786 MB perf.data ]
# perf report
Samples: 380 of events 'anon group { cycles, instructions }', Event count (approx.): 3026164
Children Self Command Shared Object Symbol
+ 38.76% 42.65% 0.00% 0.00% uname [kernel.kallsyms] [k] __x86_indirect_thunk_rax
+ 35.82% 31.33% 0.00% 0.00% uname ld-2.28.so [.] _dl_start_user
+ 34.29% 29.74% 0.55% 0.47% uname ld-2.28.so [.] _dl_start
+ 33.73% 28.62% 1.60% 0.97% uname ld-2.28.so [.] dl_main
+ 33.19% 29.04% 0.52% 0.32% uname ld-2.28.so [.] _dl_sysdep_start
+ 27.83% 33.74% 0.00% 0.00% uname [kernel.kallsyms] [k] do_syscall_64
+ 26.76% 33.29% 0.00% 0.00% uname [kernel.kallsyms] [k] entry_SYSCALL_64_after_hwframe
+ 23.78% 20.33% 5.97% 5.25% uname [kernel.kallsyms] [k] page_fault
+ 23.18% 24.60% 0.00% 0.00% uname libc-2.28.so [.] __libc_start_main
+ 22.64% 24.37% 0.00% 0.00% uname uname [.] _start
+ 21.04% 23.27% 0.00% 0.00% uname uname [.] main
+ 19.48% 18.08% 3.72% 3.64% uname ld-2.28.so [.] _dl_relocate_object
+ 19.47% 21.81% 0.00% 0.00% uname libc-2.28.so [.] setlocale
+ 19.44% 21.56% 0.52% 0.61% uname libc-2.28.so [.] _nl_find_locale
+ 17.87% 19.66% 0.00% 0.00% uname libc-2.28.so [.] _nl_load_locale_from_archive
+ 15.71% 13.73% 0.53% 0.52% uname [kernel.kallsyms] [k] do_page_fault
+ 15.18% 13.21% 1.03% 0.68% uname [kernel.kallsyms] [k] handle_mm_fault
+ 14.15% 12.53% 1.01% 1.12% uname [kernel.kallsyms] [k] __handle_mm_fault
+ 12.03% 9.67% 0.54% 0.32% uname ld-2.28.so [.] _dl_map_object
+ 10.55% 8.48% 0.00% 0.00% uname ld-2.28.so [.] openaux
+ 10.55% 20.20% 0.52% 0.61% uname libc-2.28.so [.] __run_exit_handlers
Comnmitter notes:
Fixed up this problem:
util/record.c: In function ‘perf_evlist__config’:
util/record.c:256:3: error: too few arguments to function ‘perf_evsel__config_leader_sampling’
256 | perf_evsel__config_leader_sampling(evsel);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/record.c:190:13: note: declared here
190 | static void perf_evsel__config_leader_sampling(struct evsel *evsel,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Tools find the correct evsel, and therefore read format, using the event
ID, so it isn't necessary for all read formats to be the same. In the
case of leader-sampling of AUX area events, dummy tracking events will
have a different read format, so relax the validation to become a debug
message only.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In preparation for adding support for leader sampling with AUX area events.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move leader-sampling configuration in preparation for adding support for
leader sampling with AUX area events.
Committer notes:
It only makes sense when configuring an evsel that is part of an evlist,
so the only case where it is called outside perf_evlist__config(), in
some 'perf test' entry, is safe, and even there we should just use
perf_evlist__config(), but since in that case we have just one evsel in
the evlist, it is equivalent.
Also fixed up this problem:
util/record.c: In function ‘perf_evlist__config’:
util/record.c:223:3: error: too many arguments to function ‘perf_evsel__config_leader_sampling’
223 | perf_evsel__config_leader_sampling(evsel, evlist);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
util/record.c:170:13: note: declared here
170 | static void perf_evsel__config_leader_sampling(struct evsel *evsel)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_evsel__is_aux_event()
Move and globalize 2 functions from the auxtrace specific sources so
that they can be reused.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ Move to pmu.c, as moving to evsel.h breaks the python binding ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, callchains can be synthesized only for synthesized events.
Support also synthesizing callchains for regular events.
Example:
# perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
Linux
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.532 MB perf.data ]
# perf script --itrace=Ge | head -20
uname 4864 2419025.358181: 10000 cycles:
ffffffffbba56965 apparmor_bprm_committing_creds+0x35 ([kernel.kallsyms])
ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
ffffffffbba07422 security_bprm_committing_creds+0x22 ([kernel.kallsyms])
ffffffffbb89805d install_exec_creds+0xd ([kernel.kallsyms])
ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
uname 4864 2419025.358185: 10000 cycles:
ffffffffbba56db0 apparmor_bprm_committed_creds+0x20 ([kernel.kallsyms])
ffffffffbc400cd5 __indirect_thunk_start+0x5 ([kernel.kallsyms])
ffffffffbba07452 security_bprm_committed_creds+0x22 ([kernel.kallsyms])
ffffffffbb89809a install_exec_creds+0x4a ([kernel.kallsyms])
ffffffffbb90d9ac load_elf_binary+0x3ac ([kernel.kallsyms])
uname 4864 2419025.358189: 10000 cycles:
ffffffffbb86fdf6 vma_adjust_trans_huge+0x6 ([kernel.kallsyms])
ffffffffbb821660 __vma_adjust+0x160 ([kernel.kallsyms])
ffffffffbb897be7 shift_arg_pages+0x97 ([kernel.kallsyms])
ffffffffbb897ed9 setup_arg_pages+0x1e9 ([kernel.kallsyms])
ffffffffbb90d9f2 load_elf_binary+0x3f2 ([kernel.kallsyms])
Committer testing:
# perf record --kcore --aux-sample -e '{intel_pt//,cycles}' -c 10000 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.233 MB perf.data ]
#
Then, before this patch:
# perf script --itrace=Ge | head -20
uname 28642 168664.856384: 10000 cycles: ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
uname 28642 168664.856388: 10000 cycles: ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
uname 28642 168664.856392: 10000 cycles: ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
uname 28642 168664.856396: 10000 cycles: ffffffff982fd4ec __mod_memcg_state+0x1c ([kernel.kallsyms])
uname 28642 168664.856400: 10000 cycles: ffffffff9829fddd do_mmap+0xfd ([kernel.kallsyms])
uname 28642 168664.856404: 10000 cycles: ffffffff9829c879 __vma_adjust+0x479 ([kernel.kallsyms])
uname 28642 168664.856408: 10000 cycles: ffffffff98238e94 __perf_addr_filters_adjust+0x34 ([kernel.kallsyms])
uname 28642 168664.856412: 10000 cycles: ffffffff98a38e0b down_write+0x1b ([kernel.kallsyms])
uname 28642 168664.856416: 10000 cycles: ffffffff983006a0 memcg_kmem_get_cache+0x0 ([kernel.kallsyms])
uname 28642 168664.856421: 10000 cycles: ffffffff98396eaf load_elf_binary+0x92f ([kernel.kallsyms])
uname 28642 168664.856425: 10000 cycles: ffffffff982e0222 kfree+0x62 ([kernel.kallsyms])
uname 28642 168664.856428: 10000 cycles: ffffffff9846dfd4 file_has_perm+0x54 ([kernel.kallsyms])
uname 28642 168664.856433: 10000 cycles: ffffffff98288911 vma_interval_tree_insert+0x51 ([kernel.kallsyms])
uname 28642 168664.856437: 10000 cycles: ffffffff9823e577 perf_event_mmap_output+0x27 ([kernel.kallsyms])
uname 28642 168664.856441: 10000 cycles: ffffffff98a26fa0 xas_load+0x40 ([kernel.kallsyms])
uname 28642 168664.856445: 10000 cycles: ffffffff98004f30 arch_setup_additional_pages+0x0 ([kernel.kallsyms])
uname 28642 168664.856448: 10000 cycles: ffffffff98a297c0 copy_user_generic_unrolled+0xa0 ([kernel.kallsyms])
uname 28642 168664.856452: 10000 cycles: ffffffff9853a87a strnlen_user+0x10a ([kernel.kallsyms])
uname 28642 168664.856456: 10000 cycles: ffffffff986638a7 randomize_page+0x27 ([kernel.kallsyms])
uname 28642 168664.856460: 10000 cycles: ffffffff98a3b645 _raw_spin_lock+0x5 ([kernel.kallsyms])
#
And after:
# perf script --itrace=Ge | head -20
uname 28642 168664.856384: 10000 cycles:
ffffffff9810aeaa commit_creds+0x2a ([kernel.kallsyms])
ffffffff9831fe87 install_exec_creds+0x17 ([kernel.kallsyms])
ffffffff983968d9 load_elf_binary+0x359 ([kernel.kallsyms])
ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
uname 28642 168664.856388: 10000 cycles:
ffffffff982a24f1 mprotect_fixup+0x151 ([kernel.kallsyms])
ffffffff9831fa83 setup_arg_pages+0x123 ([kernel.kallsyms])
ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
uname 28642 168664.856392: 10000 cycles:
ffffffff982a385b move_page_tables+0xbcb ([kernel.kallsyms])
ffffffff9831f889 shift_arg_pages+0xa9 ([kernel.kallsyms])
ffffffff9831fb4f setup_arg_pages+0x1ef ([kernel.kallsyms])
ffffffff9839691f load_elf_binary+0x39f ([kernel.kallsyms])
ffffffff98e00c45 __x86_indirect_thunk_rax+0x5 ([kernel.kallsyms])
#
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For reporting purposes, an evsel sample can have a callchain synthesized
from AUX area data. Add support for keeping track of synthesized sample
types. Note, the recorded sample_type cannot be changed because it is
needed to continue to parse events.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using 'type' variable for checking for callchains is equivalent to using
evsel__has_callchain(evsel) and is how the other PERF_SAMPLE_ bits are checked
in this function, so use it to be consistent.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a thread stack function to create a call chain for hardware events
where the sample records get created some time after the event occurred.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, callchains can be synthesized only for synthesized events. Add
an itrace option to synthesize callchains for regular events.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
An AUX area event must be the group leader when recording traces in
sample mode, but that does not produce the expected results from
'perf report' because it expects the leader to provide samples.
Rather than teach 'perf report' about AUX area sampling, un-group the
AUX area event during processing, making the 2nd event the leader.
Example:
$ perf record -e '{intel_pt//u,branch-misses:u}' -c 1 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.080 MB perf.data ]
Before:
$ perf report
Samples: 800 of events 'anon group { intel_pt//u, branch-misses:u }', Event count (approx.): 800
Children Self Command Shared Object Symbol
0.00% 47.50% 0.00% 47.50% uname libc-2.28.so [.] _dl_addr
0.00% 16.38% 0.00% 16.38% uname ld-2.28.so [.] __GI___tunables_init
0.00% 54.75% 0.00% 4.75% uname ld-2.28.so [.] dl_main
0.00% 3.12% 0.00% 3.12% uname ld-2.28.so [.] _dl_map_object_from_fd
0.00% 2.38% 0.00% 2.38% uname ld-2.28.so [.] strcmp
0.00% 2.25% 0.00% 2.25% uname ld-2.28.so [.] _dl_check_map_versions
0.00% 2.00% 0.00% 2.00% uname ld-2.28.so [.] _dl_important_hwcaps
0.00% 2.00% 0.00% 2.00% uname ld-2.28.so [.] _dl_map_object_deps
0.00% 51.50% 0.00% 1.50% uname ld-2.28.so [.] _dl_sysdep_start
0.00% 1.25% 0.00% 1.25% uname ld-2.28.so [.] _dl_load_cache_lookup
0.00% 51.12% 0.00% 1.12% uname ld-2.28.so [.] _dl_start
0.00% 50.88% 0.00% 1.12% uname ld-2.28.so [.] do_lookup_x
0.00% 50.62% 0.00% 1.00% uname ld-2.28.so [.] _dl_lookup_symbol_x
0.00% 1.00% 0.00% 1.00% uname ld-2.28.so [.] _dl_map_object
0.00% 1.00% 0.00% 1.00% uname ld-2.28.so [.] _dl_next_ld_env_entry
0.00% 0.88% 0.00% 0.88% uname ld-2.28.so [.] _dl_cache_libcmp
0.00% 0.88% 0.00% 0.88% uname ld-2.28.so [.] _dl_new_object
0.00% 50.88% 0.00% 0.88% uname ld-2.28.so [.] _dl_relocate_object
0.00% 0.62% 0.00% 0.62% uname ld-2.28.so [.] _dl_init_paths
0.00% 0.62% 0.00% 0.62% uname ld-2.28.so [.] _dl_name_match_p
0.00% 0.50% 0.00% 0.50% uname ld-2.28.so [.] get_common_indeces.constprop.1
0.00% 0.50% 0.00% 0.50% uname ld-2.28.so [.] memmove
0.00% 0.50% 0.00% 0.50% uname ld-2.28.so [.] memset
0.00% 0.50% 0.00% 0.50% uname ld-2.28.so [.] open_verify.constprop.11
0.00% 0.38% 0.00% 0.38% uname ld-2.28.so [.] _dl_check_all_versions
0.00% 0.38% 0.00% 0.38% uname ld-2.28.so [.] _dl_find_dso_for_object
0.00% 0.38% 0.00% 0.38% uname ld-2.28.so [.] init_tls
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] __tunable_get_val
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] _dl_add_to_namespace_list
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] _dl_determine_tlsoffset
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] _dl_discover_osversion
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] calloc@plt
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] malloc
0.00% 0.25% 0.00% 0.25% uname ld-2.28.so [.] malloc@plt
0.00% 0.25% 0.00% 0.25% uname libc-2.28.so [.] _nl_load_locale_from_archive
0.00% 0.25% 0.00% 0.25% uname [unknown] [k] 0xffffffffa3a00010
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] __libc_scratch_buffer_set_array_size
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] _dl_allocate_tls_storage
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] _dl_catch_exception
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] _dl_setup_hash
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] _dl_sort_maps
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] _dl_sysdep_read_whole_file
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] access
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] calloc
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] mmap64
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] openaux
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] rtld_lock_default_lock_recursive
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] rtld_lock_default_unlock_recursive
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] strchr
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] strlen
0.00% 0.12% 0.00% 0.12% uname ld-2.28.so [.] 0x0000000000001080
0.00% 0.12% 0.00% 0.12% uname libc-2.28.so [.] __strchrnul_avx2
0.00% 0.12% 0.00% 0.12% uname libc-2.28.so [.] _nl_normalize_codeset
0.00% 0.12% 0.00% 0.12% uname libc-2.28.so [.] malloc
0.00% 0.12% 0.00% 0.12% uname [unknown] [k] 0xffffffffa3a011f0
0.00% 50.00% 0.00% 0.00% uname ld-2.28.so [.] _dl_start_user
0.00% 50.00% 0.00% 0.00% uname [unknown] [.] 0000000000000000
After:
Samples: 800 of event 'branch-misses:u', Event count (approx.): 800
Children Self Command Shared Object Symbol
54.75% 4.75% uname ld-2.28.so [.] dl_main
51.50% 1.50% uname ld-2.28.so [.] _dl_sysdep_start
51.12% 1.12% uname ld-2.28.so [.] _dl_start
50.88% 0.88% uname ld-2.28.so [.] _dl_relocate_object
50.88% 1.12% uname ld-2.28.so [.] do_lookup_x
50.62% 1.00% uname ld-2.28.so [.] _dl_lookup_symbol_x
50.00% 0.00% uname ld-2.28.so [.] _dl_start_user
50.00% 0.00% uname [unknown] [.] 0000000000000000
47.50% 47.50% uname libc-2.28.so [.] _dl_addr
16.38% 16.38% uname ld-2.28.so [.] __GI___tunables_init
3.12% 3.12% uname ld-2.28.so [.] _dl_map_object_from_fd
2.38% 2.38% uname ld-2.28.so [.] strcmp
2.25% 2.25% uname ld-2.28.so [.] _dl_check_map_versions
2.00% 2.00% uname ld-2.28.so [.] _dl_important_hwcaps
2.00% 2.00% uname ld-2.28.so [.] _dl_map_object_deps
1.25% 1.25% uname ld-2.28.so [.] _dl_load_cache_lookup
1.00% 1.00% uname ld-2.28.so [.] _dl_map_object
1.00% 1.00% uname ld-2.28.so [.] _dl_next_ld_env_entry
0.88% 0.88% uname ld-2.28.so [.] _dl_cache_libcmp
0.88% 0.88% uname ld-2.28.so [.] _dl_new_object
0.62% 0.62% uname ld-2.28.so [.] _dl_init_paths
0.62% 0.62% uname ld-2.28.so [.] _dl_name_match_p
0.50% 0.50% uname ld-2.28.so [.] get_common_indeces.constprop.1
0.50% 0.50% uname ld-2.28.so [.] memmove
0.50% 0.50% uname ld-2.28.so [.] memset
0.50% 0.50% uname ld-2.28.so [.] open_verify.constprop.11
0.38% 0.38% uname ld-2.28.so [.] _dl_check_all_versions
0.38% 0.38% uname ld-2.28.so [.] _dl_find_dso_for_object
0.38% 0.38% uname ld-2.28.so [.] init_tls
0.25% 0.25% uname ld-2.28.so [.] __tunable_get_val
0.25% 0.25% uname ld-2.28.so [.] _dl_add_to_namespace_list
0.25% 0.25% uname ld-2.28.so [.] _dl_determine_tlsoffset
0.25% 0.25% uname ld-2.28.so [.] _dl_discover_osversion
0.25% 0.25% uname ld-2.28.so [.] calloc@plt
0.25% 0.25% uname ld-2.28.so [.] malloc
0.25% 0.25% uname ld-2.28.so [.] malloc@plt
0.25% 0.25% uname libc-2.28.so [.] _nl_load_locale_from_archive
0.25% 0.25% uname [unknown] [k] 0xffffffffa3a00010
0.12% 0.12% uname ld-2.28.so [.] __libc_scratch_buffer_set_array_size
0.12% 0.12% uname ld-2.28.so [.] _dl_allocate_tls_storage
0.12% 0.12% uname ld-2.28.so [.] _dl_catch_exception
0.12% 0.12% uname ld-2.28.so [.] _dl_setup_hash
0.12% 0.12% uname ld-2.28.so [.] _dl_sort_maps
0.12% 0.12% uname ld-2.28.so [.] _dl_sysdep_read_whole_file
0.12% 0.12% uname ld-2.28.so [.] access
0.12% 0.12% uname ld-2.28.so [.] calloc
0.12% 0.12% uname ld-2.28.so [.] mmap64
0.12% 0.12% uname ld-2.28.so [.] openaux
0.12% 0.12% uname ld-2.28.so [.] rtld_lock_default_lock_recursive
0.12% 0.12% uname ld-2.28.so [.] rtld_lock_default_unlock_recursive
0.12% 0.12% uname ld-2.28.so [.] strchr
0.12% 0.12% uname ld-2.28.so [.] strlen
0.12% 0.12% uname ld-2.28.so [.] 0x0000000000001080
0.12% 0.12% uname libc-2.28.so [.] __strchrnul_avx2
0.12% 0.12% uname libc-2.28.so [.] _nl_normalize_codeset
0.12% 0.12% uname libc-2.28.so [.] malloc
0.12% 0.12% uname [unknown] [k] 0xffffffffa3a011f0
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Thomas Richter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Mathieu Poirier <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Leo Yan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kim Phillips <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement ->evsel_is_auxtrace() callback.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add ->evsel_is_auxtrace() callback to identify if a selected event
is an AUX area event.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch refactors metricgroup__add_metric function where some part of
it move to function metricgroup__add_metric_param. No logic change.
Signed-off-by: Kajol Jain <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mamatha Inamdar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add the expr_scanner_ctx object to hold user data for the expr scanner.
Currently it holds only start_token, Kajol Jain will use it to hold 24x7
runtime param.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mamatha Inamdar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Adding expr_ prefix for parse_ctx and parse_id, to straighten out the
expr* namespace.
There's no functional change.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mamatha Inamdar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Reuse an existing char buffer to avoid two PATH_MAX sized char buffers.
Reduces stack frame sizes by 4kb.
perf_event__synthesize_mmap_events before 'sub $0x45b8,%rsp' after
'sub $0x35b8,%rsp'.
perf_event__get_comm_ids before 'sub $0x2028,%rsp' after
'sub $0x1028,%rsp'.
The performance impact of this change is negligible.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrey Zhizhikin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Extend error messages to mention CAP_PERFMON capability as an option to
substitute CAP_SYS_ADMIN capability for secure system performance
monitoring and observability operations. Make
perf_event_paranoid_check() and __cmd_ftrace() to be aware of
CAP_PERFMON capability.
CAP_PERFMON implements the principle of least privilege for performance
monitoring and observability operations (POSIX IEEE 1003.1e 2.2.2.39
principle of least privilege: A security design principle that states
that a process or program be granted only those privileges (e.g.,
capabilities) necessary to accomplish its legitimate function, and only
for the time that such privileges are actually required)
For backward compatibility reasons access to perf_events subsystem remains
open for CAP_SYS_ADMIN privileged processes but CAP_SYS_ADMIN usage for
secure perf_events monitoring is discouraged with respect to CAP_PERFMON
capability.
Committer testing:
Using a libcap with this patch:
diff --git a/libcap/include/uapi/linux/capability.h b/libcap/include/uapi/linux/capability.h
index 78b2fd4c8a95..89b5b0279b60 100644
--- a/libcap/include/uapi/linux/capability.h
+++ b/libcap/include/uapi/linux/capability.h
@@ -366,8 +366,9 @@ struct vfs_ns_cap_data {
#define CAP_AUDIT_READ 37
+#define CAP_PERFMON 38
-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_PERFMON
#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
Note that using '38' in place of 'cap_perfmon' works to some degree with
an old libcap, its only when cap_get_flag() is called that libcap
performs an error check based on the maximum value known for
capabilities that it will fail.
This makes determining the default of perf_event_attr.exclude_kernel to
fail, as it can't determine if CAP_PERFMON is in place.
Using 'perf top -e cycles' avoids the default check and sets
perf_event_attr.exclude_kernel to 1.
As root, with a libcap supporting CAP_PERFMON:
# groupadd perf_users
# adduser perf -g perf_users
# mkdir ~perf/bin
# cp ~acme/bin/perf ~perf/bin/
# chgrp perf_users ~perf/bin/perf
# setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" ~perf/bin/perf
# getcap ~perf/bin/perf
/home/perf/bin/perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
# ls -la ~perf/bin/perf
-rwxr-xr-x. 1 root perf_users 16968552 Apr 9 13:10 /home/perf/bin/perf
As the 'perf' user in the 'perf_users' group:
$ perf top -a --stdio
Error:
Failed to mmap with 1 (Operation not permitted)
$
Either add the cap_ipc_lock capability to the perf binary or reduce the
ring buffer size to some smaller value:
$ perf top -m10 -a --stdio
rounding mmap pages size to 64K (16 pages)
Error:
Failed to mmap with 1 (Operation not permitted)
$ perf top -m4 -a --stdio
Error:
Failed to mmap with 1 (Operation not permitted)
$ perf top -m2 -a --stdio
PerfTop: 762 irqs/sec kernel:49.7% exact: 100.0% lost: 0/0 drop: 0/0 [4000Hz cycles], (all, 4 CPUs)
------------------------------------------------------------------------------------------------------
9.83% perf [.] __symbols__insert
8.58% perf [.] rb_next
5.91% [kernel] [k] module_get_kallsym
5.66% [kernel] [k] kallsyms_expand_symbol.constprop.0
3.98% libc-2.29.so [.] __GI_____strtoull_l_internal
3.66% perf [.] rb_insert_color
2.34% [kernel] [k] vsnprintf
2.30% [kernel] [k] string_nocheck
2.16% libc-2.29.so [.] _IO_getdelim
2.15% [kernel] [k] number
2.13% [kernel] [k] format_decode
1.58% libc-2.29.so [.] _IO_feof
1.52% libc-2.29.so [.] __strcmp_avx2
1.50% perf [.] rb_set_parent_color
1.47% libc-2.29.so [.] __libc_calloc
1.24% [kernel] [k] do_syscall_64
1.17% [kernel] [k] __x86_indirect_thunk_rax
$ perf record -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.552 MB perf.data (74 samples) ]
$ perf evlist
cycles
$ perf evlist -v
cycles: size: 120, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
$ perf report | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 74 of event 'cycles'
# Event count (approx.): 15694834
#
# Overhead Command Shared Object Symbol
# ........ ............... .......................... ......................................
#
19.62% perf [kernel.vmlinux] [k] strnlen_user
13.88% swapper [kernel.vmlinux] [k] intel_idle
13.83% ksoftirqd/0 [kernel.vmlinux] [k] pfifo_fast_dequeue
13.51% swapper [kernel.vmlinux] [k] kmem_cache_free
6.31% gnome-shell [kernel.vmlinux] [k] kmem_cache_free
5.66% kworker/u8:3+ix [kernel.vmlinux] [k] delay_tsc
4.42% perf [kernel.vmlinux] [k] __set_cpus_allowed_ptr
3.45% kworker/2:1-eve [kernel.vmlinux] [k] shmem_truncate_range
2.29% gnome-shell libgobject-2.0.so.0.6000.7 [.] g_closure_ref
$
Signed-off-by: Alexey Budankov <[email protected]>
Reviewed-by: James Morris <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Igor Lubashev <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Serge Hallyn <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add the DSO_BINARY_TYPE__BPF_IMAGE dso binary type to recognize BPF
images that carry trampoline or dispatcher.
Upcoming patches will add support to read the image data, store it
within the BPF feature in perf.data and display it for annotation
purposes.
Currently we only display following message:
# ./perf annotate bpf_trampoline_24456 --stdio
Percent | Source code & Disassembly of . for cycles (504 ...
--------------------------------------------------------------- ...
: to be implemented
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no special load action for ksymbol data on map__load/dso__load
action, where the kernel is getting loaded. It only gets confused with
kernel kallsyms/vmlinux load for bpf object, which fails and could mess
up with the map.
Disabling any further load of the map for ksymbol related dso/map.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Synthesize bpf images (trampolines/dispatchers) on start, as ksymbol
events from /proc/kallsyms. Having this perf can recognize samples from
those images and perf report and top shows them correctly.
The rest of the ksymbol handling is already in place from for the bpf
programs monitoring, so only the initial state was needed.
perf report output:
# Overhead Command Shared Object Symbol
12.37% test_progs [kernel.vmlinux] [k] entry_SYSCALL_64
11.80% test_progs [kernel.vmlinux] [k] syscall_return_via_sysret
9.63% test_progs bpf_prog_bcf7977d3b93787c_prog2 [k] bpf_prog_bcf7977d3b93787c_prog2
6.90% test_progs bpf_trampoline_24456 [k] bpf_trampoline_24456
6.36% test_progs [kernel.vmlinux] [k] memcpy_erms
Committer notes:
Use scnprintf() instead of strncpy() to overcome this on fedora:32,
rawhide and OpenMandriva Cooker:
CC /tmp/build/perf/util/bpf-event.o
In file included from /usr/include/string.h:495,
from /git/linux/tools/lib/bpf/libbpf_common.h:12,
from /git/linux/tools/lib/bpf/bpf.h:31,
from util/bpf-event.c:4:
In function 'strncpy',
inlined from 'process_bpf_image' at util/bpf-event.c:323:2,
inlined from 'kallsyms_process_symbol' at util/bpf-event.c:358:9:
/usr/include/bits/string_fortified.h:106:10: error: '__builtin_strncpy' specified bound 256 equals destination size [-Werror=stringop-truncation]
106 | return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Björn Töpel <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jesper Dangaard Brouer <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]/
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We received a report that was no metric header displayed if --per-socket
and --metric-only were both set.
It's hard for script to parse the perf-stat output. This patch fixes this
issue.
Before:
root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket
^C
Performance counter stats for 'system wide':
S0 8 2.6
2.215270071 seconds time elapsed
root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket -I1000
# time socket cpus
1.000411692 S0 8 2.2
2.001547952 S0 8 3.4
3.002446511 S0 8 3.4
4.003346157 S0 8 4.0
5.004245736 S0 8 0.3
After:
root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket
^C
Performance counter stats for 'system wide':
CPI
S0 8 2.1
1.813579830 seconds time elapsed
root@kbl-ppc:~# perf stat -a -M CPI --metric-only --per-socket -I1000
# time socket cpus CPI
1.000415122 S0 8 3.2
2.001630051 S0 8 2.9
3.002612278 S0 8 4.3
4.003523594 S0 8 3.0
5.004504256 S0 8 3.7
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The set of C compiler options used by distros to build python bindings
may include options that are unknown to clang, we check for a variety of
such options, add -fno-semantic-interposition to that mix:
This fixes the build on, among others, Manjaro Linux:
GEN /tmp/build/perf/python/perf.so
clang-9: error: unknown argument: '-fno-semantic-interposition'
error: command 'clang' failed with exit status 1
make: Leaving directory '/git/perf/tools/perf'
[perfbuilder@602aed1c266d ~]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-pc-linux-gnu/9.3.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /build/gcc/src/gcc/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-pkgversion='Arch Linux 9.3.0-1' --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++,d --enable-shared --enable-threads=posix --with-system-zlib --with-isl --enable-__cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --disable-libstdcxx-pch --disable-libssp --enable-gnu-unique-object --enable-linker-build-id --enable-lto --enable-plugin --enable-install-libiberty --with-linker-hash-style=gnu --enable-gnu-indirect-function --enable-multilib --disable-werror --enable-checking=release --enable-default-pie --enable-default-ssp --enable-cet=auto gdc_include_dir=/usr/include/dlang/gdc
Thread model: posix
gcc version 9.3.0 (Arch Linux 9.3.0-1)
[perfbuilder@602aed1c266d ~]$
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The clang check in the python setup.py file expected $CC to be just the
name of the compiler, not the compiler + options, i.e. all options were
expected to be passed in $CFLAGS, this ends up making it fail in systems
where CC is set to, e.g.:
"aarch64-linaro-linux-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot"
Like this:
$ python3
>>> from subprocess import Popen
>>> a = Popen(["aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot", "-v"])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/lib/python3.6/subprocess.py", line 729, in __init__
restore_signals, start_new_session)
File "/usr/lib/python3.6/subprocess.py", line 1364, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot': 'aarch64-linux-gnu-gcc --sysroot=/oe/build/tmp/work/juno-linaro-linux/perf/1.0-r9/recipe-sysroot'
>>>
Make it more robust, covering this case, by passing cc.split()[0] as the
first arg to popen().
Fixes: a7ffd416d804 ("perf python: Fix clang detection when using CC=clang-version")
Reported-by: Daniel Díaz <[email protected]>
Reported-by: Naresh Kamboju <[email protected]>
Tested-by: Daniel Díaz <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ilie Halip <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When running perf script report with a Python script and a callgraph in
DWARF mode, intr_regs->regs can be 0 and therefore crashing the regs_map
function.
Added a check for this condition (same check as in builtin-script.c:595).
Signed-off-by: Andreas Gerstmayr <[email protected]>
Tested-by: Kim Phillips <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf list expects CPU events to be parseable by name, e.g.
# perf list | grep el-capacity-read
el-capacity-read OR cpu/el-capacity-read/ [Kernel PMU event]
But the event parser does not recognize them that way, e.g.
# perf test -v "Parse event"
<SNIP>
running test 54 'cycles//u'
running test 55 'cycles:k'
running test 0 'cpu/config=10,config1,config2=3,period=1000/u'
running test 1 'cpu/config=1,name=krava/u,cpu/config=2/u'
running test 2 'cpu/config=1,call-graph=fp,time,period=100000/,cpu/config=2,call-graph=no,time=0,period=2000/'
running test 3 'cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp'
-> cpu/event=0,umask=0x11/
-> cpu/event=0,umask=0x13/
-> cpu/event=0x54,umask=0x1/
failed to parse event 'el-capacity-read:u,cpu/event=el-capacity-read/u', err 1, str 'parser error'
event syntax error: 'el-capacity-read:u,cpu/event=el-capacity-read/u'
\___ parser error test child finished with 1
---- end ----
Parse event definition strings: FAILED!
This happens because the parser splits names by '-' in order to deal
with cache events. For example 'L1-dcache' is a token in
parse-events.l which is matched to 'L1-dcache-load-miss' by the
following rule:
PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT opt_event_config
And so there is special handling for 2-part PMU names i.e.
PE_PMU_EVENT_PRE '-' PE_PMU_EVENT_SUF sep_dc
but no handling for 3-part names, which are instead added as tokens e.g.
topdown-[a-z-]+
While it would be possible to add a rule for 3-part names, that would
not work if the first parts were also a valid PMU name e.g.
'el-capacity-read' would be matched to 'el-capacity' before the parser
reached the 3rd part.
The parser would need significant change to rationalize all this, so
instead fix for now by adding missing Intel CPU events with 3-part names
to the event parser as tokens.
Missing events were found by using:
grep -r EVENT_ATTR_STR arch/x86/events/intel/core.c
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch extends the perf script --symbols option to filter on
hexadecimal addresses in addition to symbol names. This makes it easier
to handle cases where symbols are aliased.
With this patch, it is possible to mix and match symbols and hexadecimal
addresses using the --symbols option.
$ perf script --symbols=noploop,0x4007a0
Signed-off-by: Stephane Eranian <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The --all-cgroups option is to enable cgroup profiling support. It
tells kernel to record CGROUP events in the ring buffer so that perf
report can identify task/cgroup association later.
[root@seventh ~]# perf record --all-cgroups --namespaces /wb/cgtest
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.042 MB perf.data (558 samples) ]
[root@seventh ~]# perf report --stdio -s cgroup_id,cgroup,pid
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 558 of event 'cycles'
# Event count (approx.): 458017341
#
# Overhead cgroup id (dev/inode) Cgroup Pid:Command
# ........ ..................... .......... ...............
#
33.15% 4/0xeffffffb /sub 9615:looper0
32.83% 4/0xf00002f5 /sub/cgrp2 9620:looper2
32.79% 4/0xf00002f4 /sub/cgrp1 9619:looper1
0.35% 4/0xf00002f5 /sub/cgrp2 9618:cgtest
0.34% 4/0xf00002f4 /sub/cgrp1 9617:cgtest
0.32% 4/0xeffffffb / 9615:looper0
0.11% 4/0xeffffffb /sub 9617:cgtest
0.10% 4/0xeffffffb /sub 9618:cgtest
#
# (Tip: Sample related events with: perf record -e '{cycles,instructions}:S')
#
[root@seventh ~]#
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
[ Extracted the HAVE_FILE_HANDLE from the followup patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Synthesize cgroup events by iterating cgroup filesystem directories.
The cgroup event only saves the portion of cgroup path after the mount
point and the cgroup id (which actually is a file handle).
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
[ Extracted the HAVE_FILE_HANDLE from the followup patch, added missing __maybe_unused ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The cgroup sort key is to show cgroup membership of each task.
Currently it shows full path in the cgroupfs (not relative to the root
of cgroup namespace) since it'd be more intuitive IMHO. Otherwise root
cgroup in different namespaces will all show same name - "/".
The cgroup sort key should come before cgroup_id otherwise
sort_dimension__add() will match it to cgroup_id as it only matches with
the given substring.
For example it will look like following. Note that record patch adding
--all-cgroups patch will come later.
$ perf record -a --namespace --all-cgroups cgtest
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.208 MB perf.data (4090 samples) ]
$ perf report -s cgroup_id,cgroup,pid
...
# Overhead cgroup id (dev/inode) Cgroup Pid:Command
# ........ ..................... .......... ...............
#
93.96% 0/0x0 / 0:swapper
1.25% 3/0xeffffffb / 278:looper0
0.86% 3/0xf000015f /sub/cgrp1 280:cgtest
0.37% 3/0xf0000160 /sub/cgrp2 281:cgtest
0.34% 3/0xf0000163 /sub/cgrp3 282:cgtest
0.22% 3/0xeffffffb /sub 278:looper0
0.20% 3/0xeffffffb / 280:cgtest
0.15% 3/0xf0000163 /sub/cgrp3 285:looper3
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Each cgroup is kept in the perf_env's cgroup_tree sorted by the cgroup
id. Hist entries have cgroup id can compare it directly and later it
can be used to find a group name using this tree.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement basic functionality to support cgroup tracking. Each cgroup
can be identified by inode number which can be read from userspace too.
The actual cgroup processing will come in the later patch.
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
[ fix perf test failure on sampling parsing ]
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We'll need it for the cgroup patches, and its better to have it in a
separate patch in case we need to later revert the cgroup patches.
I.e. without this we have:
[root@five ~]# perf test -v python
19: 'import perf' in python :
--- start ---
test child forked, pid 148447
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.cpython-37m-x86_64-linux-gnu.so: undefined symbol: down_write
test child finished with -1
---- end ----
'import perf' in python: FAILED!
[root@five ~]#
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Terms may have a NULL config in which case a strcmp will SEGV. This can
be reproduced with:
perf stat -e '*/event=?,nr/' sleep 1
Add a NULL check to avoid this. This was caught by LLVM's libfuzzer.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Perf gets dso details from two different sources. 1st, from builid
headers in perf.data and 2nd from MMAP2 samples. Dso from buildid
header does not have dso_id detail. And dso from MMAP2 samples does
not have buildid information. If detail of the same dso is present
at both the places, filename is common.
Previously, __dsos__findnew_link_by_longname_id() used to compare only
long or short names, but Commit 0e3149f86b99 ("perf dso: Move dso_id
from 'struct map' to 'struct dso'") also added a dso_id comparison.
Because of that, now perf is creating two different dso objects of the
same file, one from buildid header (with dso_id but without buildid)
and second from MMAP2 sample (with buildid but without dso_id).
This is causing issues with archive, buildid-list etc subcommands. Fix
this by comparing dso_id only when it's present. And incase dso is
present in 'dsos' list without dso_id, inject dso_id detail as well.
Before:
$ sudo ./perf buildid-list -H
0000000000000000000000000000000000000000 /usr/bin/ls
0000000000000000000000000000000000000000 /usr/lib64/ld-2.30.so
0000000000000000000000000000000000000000 /usr/lib64/libc-2.30.so
$ ./perf archive
perf archive: no build-ids found
After:
$ ./perf buildid-list -H
b6b1291d0cead046ed0fa5734037fa87a579adee /usr/bin/ls
641f0c90cfa15779352f12c0ec3c7a2b2b6f41e8 /usr/lib64/ld-2.30.so
675ace3ca07a0b863df01f461a7b0984c65c8b37 /usr/lib64/libc-2.30.so
$ ./perf archive
Now please run:
$ tar xvf perf.data.tar.bz2 -C ~/.debug
wherever you need to run 'perf report' on.
Committer notes:
Renamed is_empty_dso_id() to dso_id__empty() and inject_dso_id() to
dso__inject_id() to keep namespacing consistent.
Fixes: 0e3149f86b99 ("perf dso: Move dso_id from 'struct map' to 'struct dso'")
Reported-by: Naveen N. Rao <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Naveen N. Rao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'snprintf' returns the number of characters which would be generated for
the given input.
If the returned value is *greater than* or equal to the buffer size, it
means that the output has been truncated.
Fix the overflow test accordingly.
Fixes: 7780c25bae59f ("perf tools: Allow ability to map cpus to nodes easily")
Fixes: 92a7e1278005b ("perf cpumap: Add cpu__max_present_cpu()")
Signed-off-by: Christophe JAILLET <[email protected]>
Suggested-by: David Laight <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: He Zhe <[email protected]>
Cc: Jan Stancek <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf pmu-events test will want to use pmu_uncore_alias_match(), so
make it public.
Signed-off-by: John Garry <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: James Clark <[email protected]>
Cc: Joakim Zhang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a function to decide whether a PMU is a core PMU.
Signed-off-by: John Garry <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: James Clark <[email protected]>
Cc: Joakim Zhang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Create pmu_add_cpu_aliases_map() from pmu_add_cpu_aliases(), so the caller
can pass the map; the pmu-events test would use this since there would
be no CPUID matching to a mapfile there.
Signed-off-by: John Garry <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: James Clark <[email protected]>
Cc: Joakim Zhang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
events incase of overlapping events
Commit f01642e4912b ("perf metricgroup: Support multiple events for
metricgroup") introduced support for multiple events in a metric group.
But with the current upstream, metric events names are not printed
properly incase we try to run multiple metric groups with overlapping
event.
With current upstream version, incase of overlapping metric events issue
is, we always start our comparision logic from start. So, the events
which already matched with some metric group also take part in
comparision logic. Because of that when we have overlapping events, we
end up matching current metric group event with already matched one.
For example, in skylake machine we have metric event CoreIPC and
Instructions. Both of them need 'inst_retired.any' event value. As
events in Instructions is subset of events in CoreIPC, they endup in
pointing to same 'inst_retired.any' value.
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
1,254,992,790 inst_retired.any # 1254992790.0
Instructions
# 1.3 CoreIPC
977,172,805 cycles
1,254,992,756 inst_retired.any
1.000802596 seconds time elapsed
command:# sudo ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
948,650 uops_retired.retire_slots
866,182 inst_retired.any # 0.7 IPC
866,182 inst_retired.any
1,175,671 cpu_clk_unhalted.thread
Patch fixes the issue by adding a new bool pointer 'evlist_used' to keep
track of events which already matched with some group by setting it
true. So, we skip all used events in list when we start comparision
logic. Patch also make some changes in comparision logic, incase we get
a match miss, we discard the whole match and start again with first
event id in metric event.
With this patch:
In skylake platform:
command:# ./perf stat -M CoreIPC,Instructions -C 0 sleep 1
Performance counter stats for 'CPU(s) 0':
3,348,415 inst_retired.any # 0.3 CoreIPC
11,779,026 cycles
3,348,381 inst_retired.any # 3348381.0
Instructions
1.001649056 seconds time elapsed
command:# ./perf stat -M UPI,IPC sleep 1
Performance counter stats for 'sleep 1':
1,023,148 uops_retired.retire_slots # 1.1 UPI
924,976 inst_retired.any
924,976 inst_retired.any # 0.6 IPC
1,489,414 cpu_clk_unhalted.thread
1.003064672 seconds time elapsed
Signed-off-by: Kajol Jain <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anju T Sudhakar <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There is a slight misalignment in -A -I output.
For example:
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000440863 CPU0 1,068,388 cpu/event=cpu-cycles/
1.000440863 CPU1 875,954 cpu/event=cpu-cycles/
1.000440863 CPU2 3,072,538 cpu/event=cpu-cycles/
1.000440863 CPU3 4,026,870 cpu/event=cpu-cycles/
1.000440863 CPU4 5,919,630 cpu/event=cpu-cycles/
1.000440863 CPU5 2,714,260 cpu/event=cpu-cycles/
1.000440863 CPU6 2,219,240 cpu/event=cpu-cycles/
1.000440863 CPU7 1,299,232 cpu/event=cpu-cycles/
The value of counts is not aligned with the column "counts" and
the event name is not aligned with the column "events".
With this patch, the output is,
# perf stat -e cpu/event=cpu-cycles/ -a -A -I 1000
# time CPU counts unit events
1.000423009 CPU0 997,421 cpu/event=cpu-cycles/
1.000423009 CPU1 1,422,042 cpu/event=cpu-cycles/
1.000423009 CPU2 484,651 cpu/event=cpu-cycles/
1.000423009 CPU3 525,791 cpu/event=cpu-cycles/
1.000423009 CPU4 1,370,100 cpu/event=cpu-cycles/
1.000423009 CPU5 442,072 cpu/event=cpu-cycles/
1.000423009 CPU6 205,643 cpu/event=cpu-cycles/
1.000423009 CPU7 1,302,250 cpu/event=cpu-cycles/
Now output is aligned.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Sometimes we may need to reload the browser to update the output since
some options are changed.
This patch creates a new key K_RELOAD. Once the __cmd_report() returns
K_RELOAD, it would repeat the whole process, such as, read samples from
data file, sort the data and display in the browser.
v5:
---
1. Fix the 'make NO_SLANG=1' error. Define K_RELOAD in util/hist.h.
2. Skip setup_sorting() in repeat path if last key is K_RELOAD.
v4:
---
Need to quit in perf_evsel_menu__run if key is K_RELOAD.
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When performing "perf report --group", it shows the event group
information together. By default, the output is sorted by the first
event in group.
It would be nice for user to select any event for sorting. This patch
introduces a new option "--group-sort-idx" to sort the output by the
event at the index n in event group.
For example,
Before:
# perf report --group --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
1.56% 0.01% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494ce
1.56% 0.00% 0.00% 0.00% mgen [kernel.kallsyms] [k] task_tick_fair
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
0.00% 0.03% 0.00% 0.00% gsd-color libglib-2.0.so.0.5600.4 [.] g_main_context_check
0.00% 0.03% 0.00% 0.00% swapper [kernel.kallsyms] [k] apic_timer_interrupt
...
After:
# perf report --group --stdio --group-sort-idx 3
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12K of events 'cpu/instructions,period=2000003/, cpu/cpu-cycles,period=200003/, BR_MISP_RETIRED.ALL_BRANCHES:pp, cpu/event=0xc0,umask=1,cmask=1,
# Event count (approx.): 6451235635
#
# Overhead Command Shared Object Symbol
# ................................ ......... ....................... ...................................
#
92.19% 98.68% 0.00% 93.30% mgen mgen [.] LOOP1
0.00% 0.13% 0.00% 6.08% swapper [kernel.kallsyms] [k] intel_idle
3.12% 0.29% 0.00% 0.16% gsd-color libglib-2.0.so.0.5600.4 [.] 0x0000000000049515
0.00% 0.00% 0.00% 0.06% swapper [kernel.kallsyms] [k] hrtimer_start_range_ns
1.56% 0.03% 0.00% 0.04% gsd-color libglib-2.0.so.0.5600.4 [.] 0x00000000000494b7
0.00% 0.15% 0.00% 0.04% perf [kernel.kallsyms] [k] smp_call_function_single
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] update_curr
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] apic_timer_interrupt
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] native_apic_msr_eoi_write
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] __update_load_avg_se
0.00% 0.00% 0.00% 0.02% mgen [kernel.kallsyms] [k] scheduler_tick
Now the output is sorted by the fourth event in group.
v7:
---
Rebase to latest perf/core, no other change.
v4:
---
1. Update Documentation/perf-report.txt to mention
'--group-sort-idx' support multiple groups with different
amount of events and it should be used on grouped events.
2. Update __hpp__group_sort_idx(), just return when the
idx is out of limit.
3. Return failure on symbol_conf.group_sort_idx && !session->evlist->nr_groups.
So now we don't need to use together with --group.
v3:
---
Refine the code in __hpp__group_sort_idx().
Before:
for (i = 1; i < nr_members; i++) {
if (i == idx) {
ret = field_cmp(fields_a[i], fields_b[i]);
if (ret)
goto out;
}
}
After:
if (idx >= 1 && idx < nr_members) {
ret = field_cmp(fields_a[idx], fields_b[idx]);
if (ret)
goto out;
}
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ Renamed pair_fields_alloc() to hist_entry__new_pair() and combined decl + assignment of vars ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For perf report on stripped binaries it is currently impossible to do
annotation. The annotation state is all tied to symbols, but there are
either no symbols, or symbols are not covering all the code.
We should support the annotation functionality even without symbols.
This patch fakes a symbol and the symbol name is the string of address.
After that, we just follow current annotation working flow.
For example,
1. perf report
Overhead Command Shared Object Symbol
20.67% div libc-2.27.so [.] __random_r
17.29% div libc-2.27.so [.] __random
10.59% div div [.] 0x0000000000000628
9.25% div div [.] 0x0000000000000612
6.11% div div [.] 0x0000000000000645
2. Select the line of "10.59% div div [.] 0x0000000000000628" and ENTER.
Annotate 0x0000000000000628
Zoom into div thread
Zoom into div DSO (use the 'k' hotkey to zoom directly into the kernel)
Browse map details
Run scripts for samples of symbol [0x0000000000000628]
Run scripts for all samples
Switch to another data file in PWD
Exit
3. Select the "Annotate 0x0000000000000628" and ENTER.
Percent│
│
│
│ Disassembly of section .text:
│
│ 0000000000000628 <.text+0x68>:
│ divsd %xmm4,%xmm0
│ divsd %xmm3,%xmm1
│ movsd (%rsp),%xmm2
│ addsd %xmm1,%xmm0
│ addsd %xmm2,%xmm0
│ movsd %xmm0,(%rsp)
Now we can see the dump of object starting from 0x628.
v5:
---
Remove the hotkey 'a' implementation from this patch. It
will be moved to a separate patch.
v4:
---
1. Support the hotkey 'a'. When we press 'a' on address,
now it supports the annotation.
2. Change the patch title from
"Support interactive annotation of code without symbols" to
"perf report: Support interactive annotation of code without symbols"
v3:
---
Keep just the ANNOTATION_DUMMY_LEN, and remove the
opts->annotate_dummy_len since it's the "maybe in future
we will provide" feature.
v2:
---
Fix a crash issue when annotating an address in "unknown" object.
The steps to reproduce this issue:
perf record -e cycles:u ls
perf report
75.29% ls ld-2.27.so [.] do_lookup_x
23.64% ls ld-2.27.so [.] __GI___tunables_init
1.04% ls [unknown] [k] 0xffffffff85c01210
0.03% ls ld-2.27.so [.] _start
When annotating 0xffffffff85c01210, the crash happens.
v2 adds checking for ms->map in add_annotate_opt(). If the object is
"unknown", ms->map is NULL.
Committer notes:
Renamed new_annotate_sym() to symbol__new_unresolved().
Use PRIx64 to fix this issue in some 32-bit arches:
ui/browsers/hists.c: In function 'symbol__new_unresolved':
ui/browsers/hists.c:2474:38: error: format '%lx' expects argument of type 'long unsigned int', but argument 5 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
snprintf(name, sizeof(name), "%-#.*lx", BITS_PER_LONG / 4, addr);
~~~~~~^ ~~~~
%-#.*llx
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|