Age | Commit message (Collapse) | Author | Files | Lines |
|
Perf script was failed to print the phys_addr for SPE profiling.
One 'dummy' event is added by SPE profiling but it doesn't have PHYS_ADDR
attribute set, perf script then exits with error.
Now referring to 'addr', use evsel__do_check_stype() to check the type.
Before:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.
After:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
4064384/4064384 ffff802f921be0d0 2f921be0d0
4064384/4064384 ffff802f921be0d0 2f921be0d0
Reviewed-by: German Gomez <[email protected]>
Signed-off-by: Yao Jin <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Hanjun Guo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Wei Li <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.
Committer notes:
To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:
tools/perf/util/bpf_counter.c
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/bpf_ftrace.c
Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".
Additionally these needed to be fixed for the ARM builds to complete:
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/pmu.c
Suggested-by: John Garry <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Vineet Singh <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_counts are accessed by the densely packed index.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Vineet Singh <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use perf_cpu_map__for_each_cpu() to help with readability.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Clarke <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Vineet Singh <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
CPU filtering was not being applied to a script's switch events.
Fixes: 5bf83c29a0ad2e78 ("perf script: Add scripting operation process_switch()")
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Enable dwarf_callchain_users on arm64 which will be needed to do a
DWARF unwind in order to get the caller of the leaf frame.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Alexandre Truong <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: German Gomez <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Refactoring script__setup_sample_type() by using callchain_param_setup()
to replace the duplicate code for callchain parameter setting up.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Alexandre Truong <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: German Gomez <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When reading a perf.data file with register values, there is a mismatch
between the names and the values of the registers because the tool is
built using only the register names from the local architecture.
Reading a perf.data file that was recorded on ARM64, gives the following
erroneous output on an X86 machine:
# perf report -i perf_arm64.data -D
[...]
24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
... user regs: mask 0x1ffffffff ABI 64-bit
.... AX 0x0000ffffd1515817
.... BX 0x0000ffffd1515480
.... CX 0x0000aaaadabf6c80
.... DX 0x000000000000002e
.... SI 0x0000000040100401
.... DI 0x0040600200000080
.... BP 0x0000ffffd1510e10
.... SP 0x0000000000000000
.... IP 0x00000000000000dd
.... FLAGS 0x0000ffffd1510cd0
.... CS 0x0000000000000000
.... SS 0x0000000000000030
.... DS 0x0000ffffa569a208
.... ES 0x0000000000000000
.... FS 0x0000000000000000
.... GS 0x0000000000000000
.... R8 0x0000aaaad3de9650
.... R9 0x0000ffffa57397f0
.... R10 0x0000000000000001
.... R11 0x0000ffffa57fd000
.... R12 0x0000ffffd1515817
.... R13 0x0000ffffd1515480
.... R14 0x0000aaaadabf6c80
.... R15 0x0000000000000000
.... unknown 0x0000000000000001
.... unknown 0x0000000000000000
.... unknown 0x0000000000000000
.... unknown 0x0000000000000000
.... unknown 0x0000000000000000
.... unknown 0x0000ffffd1510d90
.... unknown 0x0000ffffa5739b90
.... unknown 0x0000ffffd1510d80
.... XMM0 0x0000ffffa57392c8
... thread: perf-exec:43239
...... dso: [kernel.kallsyms]
As can be seen, the register names correspond to X86 registers, even
though the perf.data file was recorded on an ARM64 system. After this
patch, the output of the command displays the correct register names:
# perf report -i perf_arm64.data -D
[...]
24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
... user regs: mask 0x1ffffffff ABI 64-bit
.... x0 0x0000ffffd1515817
.... x1 0x0000ffffd1515480
.... x2 0x0000aaaadabf6c80
.... x3 0x000000000000002e
.... x4 0x0000000040100401
.... x5 0x0040600200000080
.... x6 0x0000ffffd1510e10
.... x7 0x0000000000000000
.... x8 0x00000000000000dd
.... x9 0x0000ffffd1510cd0
.... x10 0x0000000000000000
.... x11 0x0000000000000030
.... x12 0x0000ffffa569a208
.... x13 0x0000000000000000
.... x14 0x0000000000000000
.... x15 0x0000000000000000
.... x16 0x0000aaaad3de9650
.... x17 0x0000ffffa57397f0
.... x18 0x0000000000000001
.... x19 0x0000ffffa57fd000
.... x20 0x0000ffffd1515817
.... x21 0x0000ffffd1515480
.... x22 0x0000aaaadabf6c80
.... x23 0x0000000000000000
.... x24 0x0000000000000001
.... x25 0x0000000000000000
.... x26 0x0000000000000000
.... x27 0x0000000000000000
.... x28 0x0000000000000000
.... x29 0x0000ffffd1510d90
.... lr 0x0000ffffa5739b90
.... sp 0x0000ffffd1510d80
.... pc 0x0000ffffa57392c8
... thread: perf-exec:43239
...... dso: [kernel.kallsyms]
Tester comments:
Athira reports:
"Looks good to me. Tested this patchset in powerpc by capturing regs in
powerpc and doing perf report to read the data from x86."
Reported-by: Alexandre Truong <[email protected]>
Reviewed-by: Athira Jajeev <[email protected]>
Signed-off-by: German Gomez <[email protected]>
Tested-by: Athira Jajeev <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Only perf report checked the validity of these arguments so apply the
same check to all tools that read them for consistency.
Signed-off-by: James Clark <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Denis Nikitin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
-F weight in perf script is broken.
# ./perf mem record
# ./perf script -F weight
Samples for 'dummy:HG' event do not have WEIGHT attribute set. Cannot
print 'weight' field.
The sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. They share the same space, weight. The
lower 32 bits are exactly the same for both sample type. The higher 32
bits may be different for different architecture. For a new kernel on
x86, the PERF_SAMPLE_WEIGHT_STRUCT is used. For an old kernel or other
ARCHs, the PERF_SAMPLE_WEIGHT is used.
With -F weight, current perf script will only check the input string
"weight" with the PERF_SAMPLE_WEIGHT sample type. Because the commit
ea8d0ed6eae3 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT") didn't
update the PERF_SAMPLE_WEIGHT_STRUCT sample type for perf script. For a
new kernel on x86, the check fails.
Use PERF_SAMPLE_WEIGHT_TYPE, which supports both sample types, to
replace PERF_SAMPLE_WEIGHT
Fixes: ea8d0ed6eae37b01 ("perf tools: Support PERF_SAMPLE_WEIGHT_STRUCT")
Reported-by: Joe Mario <[email protected]>
Reviewed-by: Kajol Jain <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Joe Mario <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Joe Mario <[email protected]>
Cc: Andi Kleen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When perf.data is not written cleanly, we would like to process existing
data as much as possible (please see f_header.data.size == 0 condition
in perf_session__read_header). However, perf.data with partial data may
crash perf. Specifically, we see crash in 'perf script' for NULL
session->header.env.arch.
Fix this by checking session->header.env.arch before using it to determine
native_arch. Also split the if condition so it is easier to read.
Committer notes:
If it is a pipe, we already assume is a native arch, so no need to check
session->header.env.arch.
Signed-off-by: Song Liu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The instruction latency information can be recorded on
some platforms, e.g., the Intel Sapphire Rapids server. With both memory
latency (weight) and the new instruction latency information, users can
easily locate the expensive load instructions, and also understand the time
spent in different stages. The users can optimize their applications in
different pipeline stages.
Add a new field "ins_lat" to filter the instruction latency information,
which is available with sample type PERF_SAMPLE_WEIGHT_STRUCT.
Signed-off-by: Kan Liang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joe Mario <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
set_print_ip_opts() was not being called when type != attr->type
because there is not a one-to-one relationship between output types
and attr->type. That resulted in ip not printing.
The attr_type() function is removed, and the match of attr->type to
output type is corrected.
Example on ADL using taskset to select an atom cpu:
# perf record -e cpu_atom/cpu-cycles/ taskset 0x1000 uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.003 MB perf.data (7 samples) ]
Before:
# perf script | head
taskset 428 [-01] 10394.179041: 1 cpu_atom/cpu-cycles/:
taskset 428 [-01] 10394.179043: 1 cpu_atom/cpu-cycles/:
taskset 428 [-01] 10394.179044: 11 cpu_atom/cpu-cycles/:
taskset 428 [-01] 10394.179045: 407 cpu_atom/cpu-cycles/:
taskset 428 [-01] 10394.179046: 16789 cpu_atom/cpu-cycles/:
taskset 428 [-01] 10394.179052: 676300 cpu_atom/cpu-cycles/:
uname 428 [-01] 10394.179278: 4079859 cpu_atom/cpu-cycles/:
After:
# perf script | head
taskset 428 10394.179041: 1 cpu_atom/cpu-cycles/: ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
taskset 428 10394.179043: 1 cpu_atom/cpu-cycles/: ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
taskset 428 10394.179044: 11 cpu_atom/cpu-cycles/: ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
taskset 428 10394.179045: 407 cpu_atom/cpu-cycles/: ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
taskset 428 10394.179046: 16789 cpu_atom/cpu-cycles/: ffffffff95a0bb97 __intel_pmu_enable_all.constprop.48+0x47 ([kernel.kallsyms])
taskset 428 10394.179052: 676300 cpu_atom/cpu-cycles/: 7f829ef73800 cfree+0x0 (/lib/libc-2.32.so)
uname 428 10394.179278: 4079859 cpu_atom/cpu-cycles/: ffffffff95bae912 vma_interval_tree_remove+0x1f2 ([kernel.kallsyms])
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_events may sometimes throttle an event due to creating too many
samples during a given timer tick.
As of now, the perf tool will not report on throttling, which means this
is a silent error.
Implement a callback for the throttle and unthrottle events within the
Python scripting engine, which can allow scripts to detect and report
when events may have been lost due to throttling.
The simplest script to report throttle events is:
def throttle(*args):
print("throttle" + repr(args))
def unthrottle(*args):
print("unthrottle" + repr(args))
Signed-off-by: Stephen Brennan <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
machine_resolve() may have already been called. Test for that to avoid
calling it again unnecessarily.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https //lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The repipe argument is only used by perf inject and the all others
passes 'false'. Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.
This is a preparation of the change the pipe input/output.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
ASan reports several memory leaks while running:
# perf test "82: Use vfs_getname probe to get syscall args filenames"
Two of these are caused by some refcounts not being decreased on
perf-script exit, namely script.threads and script.cpus.
This patch adds the missing __put calls in a new perf_script__exit
function, which is called at the end of cmd_script.
This patch concludes the fixes of all remaining memory leaks in perf
test "82: Use vfs_getname probe to get syscall args filenames".
Signed-off-by: Riccardo Mancini <[email protected]>
Fixes: cfc8874a48599249 ("perf script: Process cpu/threads maps")
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/5ee73b19791c6fa9d24c4d57f4ac1a23609400d7.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
ASan reports several memory leak while running:
# perf test "82: Use vfs_getname probe to get syscall args filenames"
One of the leaks is caused by zstd data not being released on exit in
perf-script.
This patch adds the missing zstd_fini().
Signed-off-by: Riccardo Mancini <[email protected]>
Fixes: b13b04d9382113f7 ("perf script: Initialize zstd_data")
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/39388e8cc2f85ca219ea18697a17b7bd8f74b693.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.
Also add several evsel helpers to ease up the transition:
struct evsel *evsel__leader(struct evsel *evsel);
- get leader evsel
bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
- true if evsel has leader as leader
bool evsel__is_leader(struct evsel *evsel);
- true if evsel is itw own leader
void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
- set leader for evsel
Committer notes:
Fix this when building with 'make BUILD_BPF_SKEL=1'
tools/perf/util/bpf_counter.c
- if (evsel->leader->core.nr_members > 1) {
+ if (evsel->core.leader->nr_members > 1) {
Signed-off-by: Jiri Olsa <[email protected]>
Requested-by: Shunsuke Nakamura <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Share the addr_location of 'addr' so that it need not be resolved more than
once.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To make it possible to use filtering with scripts, move filtering before
scripting.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Generally, it should be more efficient if filter_cpu() comes before
machine__resolve() because filter_cpu() is much less code than
machine__resolve().
Example:
$ perf record --sample-cpu -- make -C tools/perf >/dev/null
Before:
$ perf stat -- perf script -C 0 >/dev/null
Performance counter stats for 'perf script -C 0':
116.94 msec task-clock # 0.992 CPUs utilized
2 context-switches # 17.103 /sec
0 cpu-migrations # 0.000 /sec
8,187 page-faults # 70.011 K/sec
478,351,812 cycles # 4.091 GHz
564,785,464 instructions # 1.18 insn per cycle
114,341,105 branches # 977.789 M/sec
2,615,495 branch-misses # 2.29% of all branches
0.117840576 seconds time elapsed
0.085040000 seconds user
0.032396000 seconds sys
After:
$ perf stat -- perf script -C 0 >/dev/null
Performance counter stats for 'perf script -C 0':
107.45 msec task-clock # 0.992 CPUs utilized
3 context-switches # 27.919 /sec
0 cpu-migrations # 0.000 /sec
7,964 page-faults # 74.117 K/sec
438,417,260 cycles # 4.080 GHz
522,571,855 instructions # 1.19 insn per cycle
105,187,488 branches # 978.921 M/sec
2,356,261 branch-misses # 2.24% of all branches
0.108282546 seconds time elapsed
0.095935000 seconds user
0.011991000 seconds sys
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor out script_fetch_insn() so it can be reused.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is preparation for allowing a script to set the itrace options
for the session if they have not already been set.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add auxtrace_error to general python scripting.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor out perf_sample__sprintf_flags() so it can be reused.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If sample addr correlates to a symbol, add "addr_dso", "addr_symbol", and
"addr_symoff" to python scripting.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Allow perf script to find a script in the exec path.
Example:
Before:
$ perf record -a -e intel_pt/branch=0/ sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.954 MB perf.data ]
$ perf script intel-pt-events.py 2>&1 | head -3
Error: Couldn't find script `intel-pt-events.py'
See perf script -l for available scripts.
$ perf script -s intel-pt-events.py 2>&1 | head -3
Can't open python script "intel-pt-events.py": No such file or directory
$ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
Error: Couldn't find script `/home/ahunter/libexec/perf-core/scripts/python/intel-pt-events.py'
See perf script -l for available scripts.
$
After:
$ perf script intel-pt-events.py 2>&1 | head -3
Intel PT Power Events and PTWRITE
perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
$ perf script -s intel-pt-events.py 2>&1 | head -3
Intel PT Power Events and PTWRITE
perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
$ perf script ~/libexec/perf-core/scripts/python/intel-pt-events.py 2>&1 | head -3
Intel PT Power Events and PTWRITE
perf 8123/8123 [000] 551.230753986 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
perf 8123/8123 [001] 551.230808216 cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
$
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix ~124 single-word typos and a few spelling errors in the perf tooling code,
accumulated over the years.
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
They all operate on 'struct evsel_script' instances, so should be
prefixed with evsel_script__, not with perf_evsel_script__.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In preparation to support Intel PT decoding of virtual machine traces, add
branch types for VM-Entry and VM-Exit.
Note they are both treated as "calls" because the VM-Exit transfers control
to a different address.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Emitting a PSB+ can cause a CPU a slight delay. When doing timing analysis
of code with Intel PT, it is useful to know if a timing bubble was caused
by Intel PT or not. Add reporting of PSB events via perf script. PSB
events are printed with the existing itrace 'p' option which also prints
power and frequency changes. The PSB event contains the trace offset at
which the PSB occurs, to allow easy reference back to the PSB+ packets.
The PSB event timestamp is always the timestamp from the PSB+ TSC
packet, and the ip is always the address from the PSB+ FUP packet.
The code changes are non-trivial because the decoder must walk to the
PSB+ FUP address before outputting the PSB event.
Example:
$ perf record -e intel_pt/cyc,psb_period=0/u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.046 MB perf.data ]
$ perf script --itrace=p --ns
perf 17981 [006] 25617.510820383: psb: psb offs: 0 0 [unknown] ([unknown])
perf 17981 [006] 25617.510820383: cbr: cbr: 42 freq: 4219 MHz (156%) 0 [unknown] ([unknown])
uname 17981 [006] 25617.510889753: psb: psb offs: 0xb50 7f78c12a212e __GI___tunables_init+0xee (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.510899162: psb: psb offs: 0x12d0 7f78c128af1c dl_main+0x93c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.510939242: psb: psb offs: 0x1a50 7f78c128eefc _dl_map_object_from_fd+0x13c (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.510981274: psb: psb offs: 0x21c8 7f78c1296307 _dl_relocate_object+0x927 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.510993034: psb: psb offs: 0x2948 7f78c12940e4 _dl_lookup_symbol_x+0x14 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.511003871: psb: psb offs: 0x30c8 7f78c12937b3 do_lookup_x+0x2f3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.511019854: psb: psb offs: 0x3850 7f78c1295eed _dl_relocate_object+0x50d (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.511029015: psb: psb offs: 0x4390 7f78c12a855a strcmp+0xf6a (/usr/lib/x86_64-linux-gnu/ld-2.31.so)
uname 17981 [006] 25617.511064876: psb: psb offs: 0x4b10 0 [unknown] ([unknown])
uname 17981 [006] 25617.511080762: psb: psb offs: 0x5290 7f78c11db53d _dl_addr+0x13d (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511086035: psb: psb offs: 0x5a08 7f78c11db538 _dl_addr+0x138 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511091381: psb: psb offs: 0x6190 7f78c11db534 _dl_addr+0x134 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511096681: psb: psb offs: 0x6910 7f78c11db4c3 _dl_addr+0xc3 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511119520: psb: psb offs: 0x7090 7f78c10ada5e _nl_intern_locale_data+0x12e (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511126584: psb: psb offs: 0x7818 7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511132775: psb: psb offs: 0x8358 7f78c10c20c0 getenv+0xa0 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511134598: psb: psb offs: 0x8ad0 7f78c10ada09 _nl_intern_locale_data+0xd9 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511135685: psb: psb offs: 0x9258 7f78c10ada50 _nl_intern_locale_data+0x120 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511138322: psb: psb offs: 0x99d0 7f78c11fffd9 __strncmp_avx2+0x39 (/usr/lib/x86_64-linux-gnu/libc-2.31.so)
uname 17981 [006] 25617.511158907: psb: psb offs: 0xa150 0 [unknown] ([unknown])
Signed-off-by: Adrian Hunter <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix the following coccicheck warning:
./tools/perf/builtin-script.c:2789:36-41: WARNING: conversion to bool
not needed here
./tools/perf/builtin-script.c:3237:48-53: WARNING: conversion to bool
not needed here
Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: John Fastabend <[email protected]>
Cc: KP Singh <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf script' supports '-S' or '--symbol' options to only list the
records for these symbols. A symbol is typically a name or hex address.
If it's hex address, it is the start address of one symbol.
While it would be useful if we can filter trace records by any hex
address (not only the start address of symbol). So now we support
filtering trace records by more conditions, such as:
- symbol name
- start address of symbol
- any hexadecimal address
- address range
The comparison order is defined as:
1. symbol name comparison
2. symbol start address comparison.
3. any hexadecimal address comparison.
4. address range comparison.
The idea is if we can get a valid address from -S list, we add the
address to addr_list for address comparison otherwise we still leave
it to sym_list for symbol comparison.
Some examples:
root@kbl-ppc:~# ./perf script -S ffffffff9a477308
perf 8562 [000] 347303.578858: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [000] 347303.578860: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [000] 347303.578861: 11 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [001] 347303.578903: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [001] 347303.578905: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [001] 347303.578906: 15 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [002] 347303.578952: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
perf 8562 [002] 347303.578953: 1 cycles: ffffffff9a477308 native_write_msr+0x8 ([kernel.kallsyms])
Filter the traced records by hex address ffffffff9a477308.
root@kbl-ppc:~# ./perf script -S ffffffff9a4dd4ce,ffffffff9a4d2de9,ffffffff9a6bf9f4
perf 8562 [001] 347303.578911: 311706 cycles: ffffffff9a6bf9f4 __kmalloc_node+0x204 ([kernel.kallsyms])
perf 8562 [002] 347303.578960: 354477 cycles: ffffffff9a4d2de9 sched_setaffinity+0x49 ([kernel.kallsyms])
perf 8562 [003] 347303.579015: 450958 cycles: ffffffff9a4dd4ce dequeue_task_fair+0x1ae ([kernel.kallsyms])
Filter the traced records by hex address ffffffff9a4dd4ce, ffffffff9a4d2de9, ffffffff9a6bf9f4.
root@kbl-ppc:~# ./perf script -S ffffffff9a477309 --addr-range 16
perf 8562 [000] 347303.578863: 291 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
perf 8562 [001] 347303.578907: 411 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
perf 8562 [002] 347303.578956: 462 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
perf 8562 [003] 347303.579010: 497 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
perf 8562 [004] 347303.579059: 429 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
perf 8562 [005] 347303.579109: 408 cycles: ffffffff9a47730a native_write_msr+0xa ([kernel.kallsyms])
perf 8562 [006] 347303.579159: 460 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
perf 8562 [007] 347303.579213: 436 cycles: ffffffff9a47730f native_write_msr+0xf ([kernel.kallsyms])
Filter the traced records from address range [ffffffff9a477309, ffffffff9a477309 + 15].
root@kbl-ppc:~# ./perf script -S "ffffffff9b163046,rcu_nmi_exit"
perf 8562 [004] 347303.579060: 12013 cycles: ffffffff9b163046 exc_nmi+0x166 ([kernel.kallsyms])
perf 8562 [007] 347303.579214: 12138 cycles: ffffffff9b165944 rcu_nmi_exit+0x34 ([kernel.kallsyms])
Filter by address + symbol
Signed-off-by: Jin Yao <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Other perf tool builtins already supported a DSO filter.
For example:
$ perf report --dsos a,b,c
which only considers symbols in these dsos.
Now the DSO filter is supported in 'perf script':
root@kbl-ppc:~# ./perf script --dsos "[kernel.kallsyms]"
perf 18123 [000] 6142863.075104: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075107: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075108: 10 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [000] 6142863.075109: 273 cycles: ffffffff9ca7730a native_write_msr+0xa ([kernel.kallsyms])
perf 18123 [000] 6142863.075110: 7684 cycles: ffffffff9ca3c9c0 native_sched_clock+0x50 ([kernel.kallsyms])
perf 18123 [000] 6142863.075112: 213017 cycles: ffffffff9d765a92 syscall_exit_to_user_mode+0x32 ([kernel.kallsyms])
perf 18123 [001] 6142863.075156: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075158: 1 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
perf 18123 [001] 6142863.075159: 17 cycles: ffffffff9ca77308 native_write_msr+0x8 ([kernel.kallsyms])
Committer testing:
$ perf script
ls 2364888 29303.010949: 1 cycles:u: ffffffffa4bbc6a9 [unknown] ([unknown])
ls 2364888 29303.010957: 1 cycles:u: ffffffffa429ef48 [unknown] ([unknown])
ls 2364888 29303.010961: 1 cycles:u: ffffffffa4260133 [unknown] ([unknown])
ls 2364888 29303.010964: 5 cycles:u: ffffffffa429efad [unknown] ([unknown])
ls 2364888 29303.010967: 41 cycles:u: ffffffffa42a4586 [unknown] ([unknown])
ls 2364888 29303.010972: 435 cycles:u: ffffffffa429efe0 [unknown] ([unknown])
ls 2364888 29303.010978: 5142 cycles:u: 7f9b95bc2abf __GI___tunables_init+0x11f (/usr/lib64/ld-2.32.so)
ls 2364888 29303.011006: 38551 cycles:u: ffffffffa4290f61 [unknown] ([unknown])
ls 2364888 29303.011486: 238234 cycles:u: 7f9b95bb7741 _dl_relocate_object+0xa71 (/usr/lib64/ld-2.32.so)
ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
$
Before:
$ perf script --dsos /usr/lib64/libc-2.32.so |& head -5
Error: unknown option `dsos'
Usage: perf script [<options>]
or: perf script [<options>] record <script> [<record-options>] <command>
or: perf script [<options>] report <script> [script-args]
$
After:
$ perf script --dsos /usr/lib64/libc-2.32.so
ls 2364888 29303.011937: 415870 cycles:u: 7f9b95a1c80e __strcoll_l+0xe (/usr/lib64/libc-2.32.so)
$
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up fixes.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When unpacking the event which is from dynamic PMU, the array
output[OUTPUT_TYPE_MAX] may be overrun. For example, type number of SKL
uncore_imc is 10, but OUTPUT_TYPE_MAX is 7 now (OUTPUT_TYPE_MAX =
PERF_TYPE_MAX + 1).
/* In builtin-script.c */
process_event()
{
unsigned int type = output_type(attr->type);
if (output[type].fields == 0)
return;
}
output[10] is overrun.
Create a type OUTPUT_TYPE_OTHER for dynamic PMU events, then
output_type(attr->type) will return OUTPUT_TYPE_OTHER here.
Note that if PERF_TYPE_MAX ever changed, then there would be a conflict
between old perf.data files that had a dynamicaliy allocated PMU number
that would then be the same as a fixed PERF_TYPE.
Example:
# perf record --switch-events -C 0 -e "{cpu-clock,uncore_imc/data_reads/,uncore_imc/data_writes/}:SD" -a -- sleep 1
# perf script
Before:
swapper 0 [000] 1479253.987551: 277766 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.987797: 246709 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988127: 329883 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988273: 146393 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988523: 249977 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.988877: 354090 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989023: 145940 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989383: 359856 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1479253.989523: 140082 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
After:
swapper 0 [000] 1397040.402011: 272384 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402011: 5396 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402011: 967 uncore_imc/data_writes/:
swapper 0 [000] 1397040.402259: 249153 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402259: 7231 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402259: 1297 uncore_imc/data_writes/:
swapper 0 [000] 1397040.402508: 249108 cpu-clock: ffffffff9d4ddb6f cpuidle_enter_state+0xdf ([kernel.kallsyms])
swapper 0 [000] 1397040.402508: 5333 uncore_imc/data_reads/:
swapper 0 [000] 1397040.402508: 1008 uncore_imc/data_writes/:
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Display sampled code page sizes when PERF_SAMPLE_CODE_PAGE_SIZE was set.
For example:
# perf script --fields comm,event,ip,code_page_size
dtlb mem-loads:uP: 445777 4K
dtlb mem-loads:uP: 40f724 4K
dtlb mem-loads:uP: 474926 4K
dtlb mem-loads:uP: 401075 4K
dtlb mem-loads:uP: 401095 4K
dtlb mem-loads:uP: 401095 4K
dtlb mem-loads:uP: 4010cc 4K
dtlb mem-loads:uP: 440b6f 4K
#
Signed-off-by: Stephane Eranian <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Display the data page size if it is available and asked by the user:
Can be configured by the user, for example:
perf script --fields comm,event,phys_addr,data_page_size
dtlb mem-loads:uP: 3fec82ea8 4K
dtlb mem-loads:uP: 3fec82e90 4K
dtlb mem-loads:uP: 3e23700a4 4K
dtlb mem-loads:uP: 3fec82f20 4K
dtlb mem-loads:uP: 3e23700a4 4K
dtlb mem-loads:uP: 3b4211bec 4K
dtlb mem-loads:uP: 382205dc0 2M
dtlb mem-loads:uP: 36fa082c0 2M
dtlb mem-loads:uP: 377607340 2M
dtlb mem-loads:uP: 330010180 2M
dtlb mem-loads:uP: 33200fd80 2M
dtlb mem-loads:uP: 31b012b80 2M
Signed-off-by: Kan Liang <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The kernel can release tasks while they are still running. This can
result in a task having no tid, in which case perf records a tid of -1.
Improve the perf script output in that case.
Example:
Before:
# cat ./autoreap.c
#include <sys/types.h>
#include <unistd.h>
#include <sys/wait.h>
#include <signal.h>
struct sigaction act = {
.sa_handler = SIG_IGN,
};
int main()
{
pid_t child;
int status = 0;
sigaction(SIGCHLD, &act, NULL);
child = fork();
if (child == 0)
return 123;
wait(&status);
return 0;
}
# gcc -o autoreap autoreap.c
# ./perf record -a -e dummy --switch-events ./autoreap
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.948 MB perf.data ]
# ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191
autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4294967295/4294967295
swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188
After:
# ./perf script --show-task-events --show-switch-events | grep -C2 'autoreap\|4294967295\|-1'
swapper 0 [004] 18462.673613: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
perf 25189 [004] 18462.673614: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
autoreap 25189 [004] 18462.673800: PERF_RECORD_COMM exec: autoreap:25189/25189
autoreap 25189 [004] 18462.674042: PERF_RECORD_FORK(25191:25191):(25189:25189)
autoreap 25189 [004] 18462.674050: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [004] 18462.674051: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
swapper 0 [005] 18462.674083: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25191/25191
autoreap 25191 [005] 18462.674084: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
swapper 0 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 11/11
rcu_preempt 11 [003] 18462.674121: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
rcu_preempt 11 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [003] 18462.674124: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 11/11
autoreap 25191 [005] 18462.674138: PERF_RECORD_EXIT(25191:25191):(25189:25189)
:-1 -1 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [005] 18462.674149: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: -1/-1
swapper 0 [004] 18462.674182: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25189/25189
autoreap 25189 [004] 18462.674183: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0
autoreap 25189 [004] 18462.674218: PERF_RECORD_EXIT(25189:25189):(25188:25188)
autoreap 25189 [004] 18462.674225: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0
swapper 0 [004] 18462.674226: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 25189/25189
swapper 0 [007] 18462.674257: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 25188/25188
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Yu-cheng Yu <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a 'tod' field to display time of day column with time of date
(wallclock) time.
# perf record -k CLOCK_MONOTONIC kill
kill: not enough arguments
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.033 MB perf.data (8 samples) ]
# perf script
perf 261340 152919.481538: 1 cycles: ffffffff8106d104 ...
perf 261340 152919.481543: 1 cycles: ffffffff8106d104 ...
perf 261340 152919.481545: 7 cycles: ffffffff8106d104 ...
...
# perf script --ns
perf 261340 152919.481538922: 1 cycles: ffffffff8106d ...
perf 261340 152919.481543286: 1 cycles: ffffffff8106d ...
perf 261340 152919.481545397: 7 cycles: ffffffff8106d ...
...
# perf script -F+tod
perf 261340 2020-07-13 18:26:55.620971 152919.481538: ...
perf 261340 2020-07-13 18:26:55.620975 152919.481543: ...
perf 261340 2020-07-13 18:26:55.620978 152919.481545: ...
...
# perf script -F+tod --ns
perf 261340 2020-07-13 18:26:55.620971621 152919.481538922: ...
perf 261340 2020-07-13 18:26:55.620975985 152919.481543286: ...
perf 261340 2020-07-13 18:26:55.620978096 152919.481545397: ...
...
It's available only for recording with clockid specified, because it's
the only case where we can get reference time to wallclock time. It's
can't do that with perf clock yet.
Error is display if you want to use --tod on data without clockid
specified:
# perf script -F+tod
Can't provide 'tod' time, missing clock data. Please record with -k/--clockid option.
Original-patch-by: David Ahern <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Geneviève Bastien <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jeremie Galarneau <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So it's possible to add new values. I did not find any place where the
enum values are passed through some number type, so it's safe to make
this change.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Geneviève Bastien <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jeremie Galarneau <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|