Age | Commit message (Collapse) | Author | Files | Lines |
|
Fix the argv ui browser code to correctly display more entries than fit
on the screen without crashing. The problem was some type confusion with
pointer types in the ->seek function. Do the argv arithmetic correctly
with char ** pointers. Also add some asserts to find overruns and limit
the display function correctly.
Then finally remove a workaround for this in the res sample browser.
Committer testing:
1) Resize the x terminal to have just some 5 lines
2) Use 'perf report --samples 1' to activate the sample browser options
in the menu
3) Press ENTER, this will cause the crash:
# perf report --samples 1
perf: Segmentation fault
-------- backtrace --------
perf[0x5a514a]
/lib64/libc.so.6(+0x385bf)[0x7f27281b55bf]
/lib64/libc.so.6(+0x161a67)[0x7f27282dea67]
/lib64/libslang.so.2(SLsmg_write_wrapped_string+0x82)[0x7f272874a0b2]
perf(ui_browser__argv_refresh+0x77)[0x5939a7]
perf[0x5924cc]
perf(ui_browser__run+0x39)[0x593449]
perf(ui__popup_menu+0x83)[0x5a5263]
perf[0x59f421]
perf(perf_evlist__tui_browse_hists+0x3a0)[0x5a3780]
perf(cmd_report+0x2746)[0x447136]
perf[0x4a95fe]
perf(main+0x61c)[0x42dc6c]
/lib64/libc.so.6(__libc_start_main+0xf2)[0x7f27281a1412]
perf(_start+0x2d)[0x42de9d]
#
After applying this patch no crash takes place in such situation.
Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Don't overflow array when the scripts directory is too large, or the
script file name is too long.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now 'perf report' can show whole time periods with 'perf script', but
the user still has to find individual samples of interest manually.
It would be expensive and complicated to search for the right samples in
the whole perf file. Typically users only need to look at a small number
of samples for useful analysis.
Also the full scripts tend to show samples of all CPUs and all threads
mixed up, which can be very confusing on larger systems.
Add a new --samples option to save a small random number of samples per
hist entry.
Use a reservoir sample technique to select a representatve number of
samples.
Then allow browsing the samples using 'perf script' as part of the hist
entry context menu. This automatically adds the right filters, so only
the thread or cpu of the sample is displayed. Then we use less' search
functionality to directly jump the to the time stamp of the selected
sample.
It uses different menus for assembler and source display. Assembler
needs xed installed and source needs debuginfo.
Currently it only supports as many samples as fit on the screen due to
some limitations in the slang ui code.
Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The scripts menu traditionally only showed custom perf scripts.
Allow to run standard perf script with useful default options too.
- Normal perf script
- perf script with assembler (needs xed installed)
- perf script with source code output (needs debuginfo)
- perf script with custom arguments
Then we automatically select the right options to display the
information in the perf.data file.
For example with -b display branch contexts.
It's not easily possible to check for xed's existence in advance. perf
script usually gives sensible error messages when it's not available.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When using the time sort key, add new context menus to run scripts for
only the currently selected time range. Compute the correct range for
the selection add pass it as the --time option to perf script.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a time sort key to perf report to display samples for different time
quantums separately. This allows easier analysis of workloads that
change over time, and also will allow looking at the context of samples.
% perf record ...
% perf report --sort time,overhead,symbol --time-quantum 1ms --stdio
...
0.67% 277061.87300 [.] _dl_start
0.50% 277061.87300 [.] f1
0.50% 277061.87300 [.] f2
0.33% 277061.87300 [.] main
0.29% 277061.87300 [.] _dl_lookup_symbol_x
0.29% 277061.87300 [.] dl_main
0.29% 277061.87300 [.] do_lookup_x
0.17% 277061.87300 [.] _dl_debug_initialize
0.17% 277061.87300 [.] _dl_init_paths
0.08% 277061.87300 [.] check_match
0.04% 277061.87300 [.] _dl_count_modids
1.33% 277061.87400 [.] f1
1.33% 277061.87400 [.] f2
1.33% 277061.87400 [.] main
1.17% 277061.87500 [.] main
1.08% 277061.87500 [.] f1
1.08% 277061.87500 [.] f2
1.00% 277061.87600 [.] main
0.83% 277061.87600 [.] f1
0.83% 277061.87600 [.] f2
1.00% 277061.87700 [.] main
Committer notes:
Rename 'time' argument to hist_time() to htime to overcome this in older
distros:
cc1: warnings being treated as errors
util/hist.c: In function 'hist_time':
util/hist.c:251: error: declaration of 'time' shadows a global declaration
/usr/include/time.h:186: error: shadowed declaration is here
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The --cpu option only filtered samples. Filter other perf events, such
as COMM, FORK, SWITCH by the CPU too.
Reported-by: Jiri Olsa <[email protected]>
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick the changes in 7948450d4556 ("x86/x32: use time64 versions of
sigtimedwait and recvmmsg"), that doesn't cause any change in behaviour
in tools/perf/ as it deals just with the x32 entries.
This silences this tools/perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce a printdate function to eliminate the repetitive use of
datetime.datetime.today() in the SQL exporting scripts.
Signed-off-by: Tony Jones <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the export-to-sqlite.py script
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the export-to-postgresql.py script.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the exported-sql-viewer.py script.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The UI viewer for scripts output has a lot of limitations: limited size,
no search or save function, slow, and various other issues.
Just use 'less' to display directly on the terminal instead.
This won't work in GTK mode, but GTK doesn't support these context menus
anyways. If that is ever done could use an terminal for the output.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Feng Tang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Adding callback function to reader object so callers can process data in
different ways.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The data files layout is described by HEADER_DIR_FORMAT feature.
Currently it holds only version number (1):
uint64_t version;
The current version holds only version value (1) means that data files:
- Follow the 'data.*' name format.
- Contain raw events data in standard perf format as read from kernel
(and need to be sorted)
Future versions are expected to describe different data files layout
according to special needs.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make perf_data__size() return proper size for directory data, summing up
all the individual file sizes.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add perf_data__update_dir() to update the size for every file within the
perf.data directory.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We can't store the auxtrace index when we store into multiple files,
because we keep only offset for it, not the file.
The auxtrace data will be processed correctly in the 'pipe' mode.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The caller needs to set 'struct perf_data::is_dir flag and the path will
be treated as a directory.
The 'struct perf_data::file' is initialized and open as 'path/header'
file.
Add a check to the direcory interface functions to check the is_dir flag.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Be consistent on how to signal failure, i.e. use -1 and let users check errno ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Thi patch adds PMC events for AMD Family 17 CPUs as defined in [1]. It
covers events described in section: 2.1.13. Regex pattern in mapfile.csv
covers all CPUs of the family.
[1] https://support.amd.com/TechDocs/54945_PPR_Family_17h_Models_00h-0Fh.pdf
Signed-off-by: Martin Liška <[email protected]>
Acked-by: Borislav Petkov <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jon Grimm <[email protected]>
Cc: Martin Jambor <[email protected]>
Cc: William Cohen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Since commit 4d99e4136580 ("perf machine: Workaround missing maps for
x86 PTI entry trampolines"), perf tools has been creating more than one
kernel map, however 'perf probe' assumed there could be only one.
Fix by using machine__kernel_map() to get the main kernel map.
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Joseph Qi <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Jiufei Xue <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Cc: Xu Yu <[email protected]>
Fixes: 4d99e4136580 ("perf machine: Workaround missing maps for x86 PTI entry trampolines")
Fixes: d83212d5dd67 ("kallsyms, x86: Export addresses of PTI entry trampolines")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Many workloads change over time. 'perf report' currently aggregates the
whole time range reported in perf.data.
This patch adds an option for a time quantum to quantisize the perf.data
over time.
This just adds the option, will be used in follow on patches for a time
sort key.
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Use NSEC_PER_[MU]SEC ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a utility function to print nanosecond timestamps.
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Upcoming changes add timestamp output in perf report. Add a --ns
argument similar to perf script to support nanoseconds resolution when
needed.
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf script -F +insn was only working for PT traces because the PT
instruction decoder was filling in the insn/insn_len sample attributes.
Support it for non PT samples too on x86 using the existing x86
instruction decoder.
This adds some extra checking to ensure that we don't try to decode
instructions when using perf.data from a different architecture.
% perf record -a sleep 1
% perf script -F ip,sym,insn --xed
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb50 intel_bts_enable_local retq
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff81048612 native_apic_mem_write movl %esi, -0xa04000(%rdi)
ffffffff810f1f79 generic_exec_single xor %eax, %eax
ffffffff811704c9 remote_function movl %eax, 0x18(%rbx)
ffffffff8100bb34 intel_bts_enable_local movl 0x2000(%rax), %edx
ffffffff81048610 native_apic_mem_write mov %edi, %edi
...
Committer testing:
Before:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068804 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
ffffffffa4068806 native_write_msr addb %al, (%rax)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)"
#
After:
# perf script -F ip,sym,insn --xed | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
# perf script -F ip,sym,insn --xed | grep -v "addb %al, (%rax)" | head -5
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068804 native_write_msr wrmsr
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
ffffffffa4068806 native_write_msr nopl %eax, (%rax,%rax,1)
#
More examples:
# perf script -F ip,sym,insn --xed | grep -v native_write_msr | head
ffffffffa416b90e tick_check_broadcast_expired btq %rax, 0x1a5f42a(%rip)
ffffffffa4956bd0 nmi_cpu_backtrace pushq %r13
ffffffffa415b95e __hrtimer_next_event_base movq 0x18(%rax), %rdx
ffffffffa4956bf3 nmi_cpu_backtrace popq %r12
ffffffffa4171d5c smp_call_function_single pause
ffffffffa4956bdd nmi_cpu_backtrace mov %ebp, %r12d
ffffffffa4797e4d menu_select cmp $0x190, %rax
ffffffffa4171d5c smp_call_function_single pause
ffffffffa405a7d8 nmi_cpu_backtrace_handler callq 0xffffffffa4956bd0
ffffffffa4797f7a menu_select shr $0x3, %rax
#
Which matches the annotate output modulo resolving callqs:
# perf annotate --stdio2 nmi_cpu_backtrace_handler
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 35908, [percent: local period]
nmi_cpu_backtrace_handler() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8105a7d0 <nmi_cpu_backtrace_handler>:
nmi_cpu_backtrace_handler():
nmi_trigger_cpumask_backtrace(mask, exclude_self,
nmi_raise_cpu_backtrace);
}
static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
{
24.45 → callq __fentry__
if (nmi_cpu_backtrace(regs))
mov %rsi,%rdi
75.55 → callq nmi_cpu_backtrace
return NMI_HANDLED;
movzbl %al,%eax
return NMI_DONE;
}
← retq
#
# perf annotate --stdio2 __hrtimer_next_event_base
Samples: 4 of event 'cycles:ppp', 4000 Hz, Event count (approx.): 767977, [percent: local period]
__hrtimer_next_event_base() /lib/modules/5.0.0+/build/vmlinux
Percent
Disassembly of section .text:
ffffffff8115b910 <__hrtimer_next_event_base>:
__hrtimer_next_event_base():
static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
const struct hrtimer *exclude,
unsigned int active,
ktime_t expires_next)
{
→ callq __fentry__
<SNIP>
4a: add $0x1,%r14
77.31 mov 0x18(%rax),%rdx
shl $0x6,%r14
sub 0x38(%rbx,%r14,1),%rdx
if (expires < expires_next) {
cmp %r12,%rdx
↓ jge 68
<SNIP>
Signed-off-by: Andi Kleen <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Converted fetch_exe() to use the name it ended up having when merged: thread__memcpy() ]
[ archinsn.c needs the instruction decoder that is only build when CONFIG_AUXTRACE=y, fix that ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Thomas Gleixner:
"Perf updates and fixes:
Kernel:
- Handle events which have the bpf_event attribute set as side band
events as they carry information about BPF programs.
- Add missing switch-case fall-through comments
Libraries:
- Fix leaks and double frees in error code paths.
- Prevent buffer overflows in libtraceevent
Tools:
- Improvements in handling Intel BT/PTS
- Add BTF ELF markers to perf trace BPF programs to improve output
- Support --time, --cpu, --pid and --tid filters for perf diff
- Calculate the column width in perf annotate as the hardcoded 6
characters for the instruction are not sufficient
- Small fixes all over the place"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
perf/core: Mark expected switch fall-through
perf/x86/intel/uncore: Fix client IMC events return huge result
perf/ring_buffer: Use high order allocations for AUX buffers optimistically
perf data: Force perf_data__open|close zero data->file.path
perf session: Fix double free in perf_data__close
perf evsel: Probe for precise_ip with simple attr
perf tools: Read and store caps/max_precise in perf_pmu
perf hist: Fix memory leak of srcline
perf hist: Add error path into hist_entry__init
perf c2c: Fix c2c report for empty numa node
perf script python: Add Python3 support to intel-pt-events.py
perf script python: Add Python3 support to event_analyzing_sample.py
perf script python: add Python3 support to check-perf-trace.py
perf script python: Add Python3 support to futex-contention.py
perf script python: Remove mixed indentation
perf diff: Support --pid/--tid filter options
perf diff: Support --cpu filter option
perf diff: Support --time filter option
perf thread: Generalize function to copy from thread addr space from intel-bts code
perf annotate: Calculate the max instruction name, align column to that
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Pull perf/core changes from Arnaldo Carvalho de Melo:
perf bpf:
Arnaldo Carvalho de Melo:
- Automatically add BTF ELF markers to 'perf trace' BPF programs, so that
tools such as 'bpftool map dump' can pretty print map keys and values.
perf c2c:
Jiri Olsa:
- Fix report for empty NUMA node.
perf diff:
Jin Yao:
- Support --time, --cpu, --pid and --tid filter options.
perf probe:
Arnaldo Carvalho de Melo:
- Clarify error message about not finding kernel modules debuginfo.
perf record:
Jiri Olsa:
- Fixup probing for max attr.precise_ip.
perf trace:
Arnaldo Carvalho de Melo:
- Add missing %s lost in the 'msg_flags' recvmmsg arg when adding prefix suppression logic.
perf annotate:
Arnaldo Carvalho de Melo:
- Calculate the max instruction name, align column to that, removing the
hardcoded max 6 chars and cope with instructions with names longer than that,
such as vpmovmskb, vpcmpeqb, etc.
kernel:
Song Liu:
- Consider events with attr.bpf_event set as side-band.
Gustavo A. R. Silva:
- Mark expected switch fall-through in perf_event_parse_addr_filter().
Libraries:
Jiri Olsa:
- Fix leaks and double frees on error paths.
libtraceevent:
Tony Jones:
- Fix buffer overflow in arg_eval().
python scripting:
Tony Jones:
- More python3 fixes.
Trivial:
Yang Wei:
- Remove needless extra semicolon in clang C++ glue code.
Intel PT/BTS:
Adrian Hunter:
- Improve auxtrace address filter error message when there is no DSO.
- Fix divide by zero when TSC is not available.
- Further improvements to the export to sqlite/posgresql python scripts
and to the GUI sqlviewer, exporting 'parent_id' so that we have enable
the creation of call trees.
Andi Kleen:
- Generalize function to copy from thread addr space from intel-bts code.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Making sure the data->file.path is zeroed on perf_data__open error path
and in perf_data__close, so we don't double free it in case someone call
it twice.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We can't call perf_data__close and subsequently perf_session__delete,
because it will call perf_data__close again and cause double free for
data->file.path.
$ perf report -i .
incompatible file format (rerun with -v to learn more)
free(): double free detected in tcache 2
Aborted (core dumped)
In fact we don't need to call perf_data__close at all, because at the
time the got out_close is reached, session->data is already initialized,
so the perf_data__close call will be triggered from
perf_session__delete.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Fixes: 2d4f27999b88 ("perf data: Add global path holder")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently we probe for precise_ip with user specified perf_event_attr,
which might fail because of unsupported kernel features, which would get
disabled during the open time anyway.
Switching the probe to take place on simple hw cycles, so the following
record sets proper precise_ip:
# perf record -e cycles:P ls
# perf evlist -v
cycles:P: size: 112, ... precise_ip: 3, ...
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Read the caps/max_precise value and store it in struct perf_pmu to be
used when setting the maximum precise_ip field in following patch.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We can't allocate he->srcline unconditionaly, only when new hist_entry
is created. Moving he->srcline allocation into hist_entry__init
function.
Original-patch-by: Jonas Rabenstein <[email protected]>
Suggested-by: Namhyung Kim <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Nageswara R Sastry <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Adding error path into hist_entry__init to unify error handling, so
every new member does not need to free everything else.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: nageswara r sastry <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Ravi Bangoria reported that we fail with an empty NUMA node with the
following message:
$ lscpu
NUMA node0 CPU(s):
NUMA node1 CPU(s): 0-4
$ sudo ./perf c2c report
node/cpu topology bugFailed setup nodes
Fix this by detecting the empty node and keeping its CPU set empty.
Reported-by: Nageswara R Sastry <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jonas Rabenstein <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the intel-pt-events.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the event_analyzing_sample.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Cc: Feng Tang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python 2 and Python 3 in the check-perf-trace.py script.
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of from __future__ implies the minimum supported version of
Python2 is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support both Python2 and Python3 in the futex-contention.py script
There may be differences in the ordering of output lines due to
differences in dictionary ordering etc. However the format within lines
should be unchanged.
The use of 'from __future__' implies the minimum supported Python2 version
is now v2.6
Signed-off-by: Tony Jones <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Seeteena Thoufeek <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove mixed indentation in Python scripts. Revert to either all tabs
(most common form) or all spaces (4 or 8) depending on what was the
intent of the original commit. This is necessary to complete Python3
support as it will flag an error if it encounters mixed indentation.
Signed-off-by: Tony Jones <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using the existing symbol_conf.pid_list_str and symbol_conf.tid_list_str
logic.
For example:
perf diff --tid 13965
It'll only diff the samples for thread 13965.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To improve 'perf diff', implement a --cpu filter option.
Multiple CPUs can be provided as a comma-separated list with no space:
0,1. Ranges of CPUs are specified with -: 0-2. Default is to report
samples on all CPUs.
For example,
perf diff --cpu 0,1
It only diff the samples for CPU0 and CPU1.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To improve 'perf diff', implement a --time filter option to diff the
samples within given time window.
It supports time percent with multiple time ranges. The time string
format is 'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
For example:
Select the second 10% time slice to diff:
perf diff --time 10%/2
Select from 0% to 10% time slice to diff:
perf diff --time 0%-10%
Select the first and the second 10% time slices to diff:
perf diff --time 10%/1,10%/2
Select from 0% to 10% and 30% to 40% slices to diff:
perf diff --time 0%-10%,30%-40%
It also supports analysing samples within a given time window
<start>,<stop>.
Times have the format seconds.microseconds.
If 'start' is not given (i.e., time string is ',x.y') then analysis starts at
the beginning of the file.
If the stop time is not given (i.e, time string is 'x.y,') then analysis
goes to end of file.
Time string is 'a1.b1,c1.d1:a2.b2,c2.d2'. Use ':' to separate timestamps for
different perf.data files.
For example, we get the timestamp information from perf script.
perf script -i perf.data.old
mgen 13940 [000] 3946.361400: ...
perf script -i perf.data
mgen 13940 [000] 3971.150589 ...
perf diff --time 3946.361400,:3971.150589,
It analyzes the perf.data.old from the timestamp 3946.361400 to the end of
perf.data.old and analyzes the perf.data from the timestamp 3971.150589 to the
end of perf.data.
v4:
---
Update abstime_str_dup(), let it return error if strdup
is failed, and update __cmd_diff() accordingly.
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
intel-bts code
Add a utility function to fetch executable code. Convert one
user over to it. There are more places doing that, but they
do significantly different actions, so they are not
easy to fit into a single library function.
Committer changes:
. No need to cast around, make 'buf' be a void pointer.
. Rename it to thread__memcpy() to reflect the fact it is about copying
a chunk of memory from a thread, i.e. from its address space.
. No need to have it in a separate object file, move it to thread.[ch]
. Check the return of map__load(), the original code didn't do it, but
since we're moving this around, check that as well.
Signed-off-by: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We were hardcoding '6' as the max instruction name, and we have lots
that are longer than that, see the diff from two 'P' printed TUI
annotations for a libc function that uses instructions with long names,
such as 'vpmovmskb' with its 9 chars:
--- __strcmp_avx2.annotation.before 2019-03-06 16:31:39.368020425 -0300
+++ __strcmp_avx2.annotation 2019-03-06 16:32:12.079450508 -0300
@@ -2,284 +2,284 @@
Event: cycles:ppp
Percent endbr64
- 0.10 mov %edi,%eax
+ 0.10 mov %edi,%eax
- xor %edx,%edx
+ xor %edx,%edx
- 3.54 vpxor %ymm7,%ymm7,%ymm7
+ 3.54 vpxor %ymm7,%ymm7,%ymm7
- or %esi,%eax
+ or %esi,%eax
- and $0xfff,%eax
+ and $0xfff,%eax
- cmp $0xf80,%eax
+ cmp $0xf80,%eax
- ↓ jg 370
+ ↓ jg 370
- 27.07 vmovdqu (%rdi),%ymm1
+ 27.07 vmovdqu (%rdi),%ymm1
- 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0
+ 7.97 vpcmpeqb (%rsi),%ymm1,%ymm0
- 2.15 vpminub %ymm1,%ymm0,%ymm0
+ 2.15 vpminub %ymm1,%ymm0,%ymm0
- 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0
+ 4.09 vpcmpeqb %ymm7,%ymm0,%ymm0
- 0.43 vpmovmskb %ymm0,%ecx
+ 0.43 vpmovmskb %ymm0,%ecx
- 1.53 test %ecx,%ecx
+ 1.53 test %ecx,%ecx
- ↓ je b0
+ ↓ je b0
- 5.26 tzcnt %ecx,%edx
+ 5.26 tzcnt %ecx,%edx
- 18.40 movzbl (%rdi,%rdx,1),%eax
+ 18.40 movzbl (%rdi,%rdx,1),%eax
- 7.09 movzbl (%rsi,%rdx,1),%edx
+ 7.09 movzbl (%rsi,%rdx,1),%edx
- 3.34 sub %edx,%eax
+ 3.34 sub %edx,%eax
2.37 vzeroupper
← retq
nop
- 50: tzcnt %ecx,%edx
+ 50: tzcnt %ecx,%edx
- movzbl 0x20(%rdi,%rdx,1),%eax
+ movzbl 0x20(%rdi,%rdx,1),%eax
- movzbl 0x20(%rsi,%rdx,1),%edx
+ movzbl 0x20(%rsi,%rdx,1),%edx
- sub %edx,%eax
+ sub %edx,%eax
vzeroupper
← retq
- data16 nopw %cs:0x0(%rax,%rax,1)
+ data16 nopw %cs:0x0(%rax,%rax,1)
Reported-by: Travis Downs <[email protected]>
LPU-Reference: CAOBGo4z1KfmWeOm6Et0cnX5Z6DWsG2PQbAvRn1MhVPJmXHrc5g@mail.gmail.com
Cc: Adrian Hunter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Merge misc updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <[email protected]>: (159 commits)
tools/testing/selftests/proc/proc-self-syscall.c: remove duplicate include
proc: more robust bulk read test
proc: test /proc/*/maps, smaps, smaps_rollup, statm
proc: use seq_puts() everywhere
proc: read kernel cpu stat pointer once
proc: remove unused argument in proc_pid_lookup()
fs/proc/thread_self.c: code cleanup for proc_setup_thread_self()
fs/proc/self.c: code cleanup for proc_setup_self()
proc: return exit code 4 for skipped tests
mm,mremap: bail out earlier in mremap_to under map pressure
mm/sparse: fix a bad comparison
mm/memory.c: do_fault: avoid usage of stale vm_area_struct
writeback: fix inode cgroup switching comment
mm/huge_memory.c: fix "orig_pud" set but not used
mm/hotplug: fix an imbalance with DEBUG_PAGEALLOC
mm/memcontrol.c: fix bad line in comment
mm/cma.c: cma_declare_contiguous: correct err handling
mm/page_ext.c: fix an imbalance with kmemleak
mm/compaction: pass pgdat to too_many_isolated() instead of zone
mm: remove zone_lru_lock() function, access ->lru_lock directly
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Lots of tooling updates - too many to list, here's a few highlights:
- Various subcommand updates to 'perf trace', 'perf report', 'perf
record', 'perf annotate', 'perf script', 'perf test', etc.
- CPU and NUMA topology and affinity handling improvements,
- HW tracing and HW support updates:
- Intel PT updates
- ARM CoreSight updates
- vendor HW event updates
- BPF updates
- Tons of infrastructure updates, both on the build system and the
library support side
- Documentation updates.
- ... and lots of other changes, see the changelog for details.
Kernel side updates:
- Tighten up kprobes blacklist handling, reduce the number of places
where developers can install a kprobe and hang/crash the system.
- Fix/enhance vma address filter handling.
- Various PMU driver updates, small fixes and additions.
- refcount_t conversions
- BPF updates
- error code propagation enhancements
- misc other changes"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
perf script python: Add Python3 support to syscall-counts-by-pid.py
perf script python: Add Python3 support to syscall-counts.py
perf script python: Add Python3 support to stat-cpi.py
perf script python: Add Python3 support to stackcollapse.py
perf script python: Add Python3 support to sctop.py
perf script python: Add Python3 support to powerpc-hcalls.py
perf script python: Add Python3 support to net_dropmonitor.py
perf script python: Add Python3 support to mem-phys-addr.py
perf script python: Add Python3 support to failed-syscalls-by-pid.py
perf script python: Add Python3 support to netdev-times.py
perf tools: Add perf_exe() helper to find perf binary
perf script: Handle missing fields with -F +..
perf data: Add perf_data__open_dir_data function
perf data: Add perf_data__(create_dir|close_dir) functions
perf data: Fail check_backup in case of error
perf data: Make check_backup work over directories
perf tools: Add rm_rf_perf_data function
perf tools: Add pattern name checking to rm_rf
perf tools: Add depth checking to rm_rf
perf data: Add global path holder
...
|
|
Delete a superfluous semicolon in getBPFObjectFromModule().
Signed-off-by: Yang Wei <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Yang Wei <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
place with the keys and values types of maps so that one can store the
struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
= BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
MAP_ID'.
Since we already have this for defining maps in 'perf trace' BPF events:
bpf_map(name, _type, type_key, type_val, _max_entries)
As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:
--- 8< ---
struct syscall {
bool enabled;
};
bpf_map(syscalls, ARRAY, int, struct syscall, 512);
--- 8< ---
All we need is to get all that already available info, piggyback on the
'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
'perf trace' BPF programs and do that without requiring changes to the
BPF programs already defining maps using 'bpf_map()'.
So this is what we have before this patch:
1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
so that we can use the .o later as a pre-compiled BPF bytecode:
# grep '\[llvm\]' -A2 ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
#
# clang --version
clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/llvm/bin
2) Note the -g there so that we get clang to generate debuginfo, and
since the target is 'bpf' it will generate the BTF info in this
clang version (9.0).
3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
source code:
# perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data ]
# file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
4) Look at the BTF structs encoded in it:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
#
# pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
# pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct syscall {
bool enabled; /* 0 1 */
/* size: 1, cachelines: 1, members: 1 */
/* last cacheline: 1 bytes */
};
#
5) Ok, with just this we don't have the markers expected by the libbpf
loader and when we run with this BPF bytecode, because we have:
# grep '\[trace\]' -A1 ~/.perfconfig
[trace]
add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
#
6) Lets do a 'perf trace' system wide session using this BPF program:
# perf trace -e *mmsg,open*
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12
7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):
# bpftool map list | tail -6
149: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
150: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
151: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
8) Dump the "pids_filtered", map, that will have one entry per PID that
'perf trace' wants filtered, which includes its own, to avoid a
tracing feedback loop (perf trace shows the syscalls it does which
generates more syscalls that it has to show that...), it also
auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
same reason:
# bpftool map dump id 151
key: a5 0c 00 00 value: 01
key: 14 63 00 00 value: 01
Found 2 elements
#
9) Since there is no BTF info available, it does a generic hex dump :-\
10) Now, with this patch applied, we'll do steps 3 to 6 again and look
with pahole if there are extra structs encoded in BTF:
# pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
syscall_enter_args 64 0
augmented_filename 264 0
syscall 1 0
syscall_exit_args 24 0
bpf_map 28 0
____btf_map___augmented_syscalls__ 8 0
____btf_map_syscalls 8 0
____btf_map_pids_filtered 8 0
#
11) Yes, those __btf_map_ + the map names, lets see how they look like:
# pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
struct ____btf_map_syscalls {
int key; /* 0 4 */
struct syscall value; /* 4 1 */
/* size: 8, cachelines: 1, members: 2 */
/* padding: 3 */
/* last cacheline: 8 bytes */
};
#
12) Lets repeat step 7 to get the new map ids:
# bpftool map list | tail -6
155: perf_event_array name __augmented_sys flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
156: array name syscalls flags 0x0
key 4B value 1B max_entries 512 memlock 8192B
157: hash name pids_filtered flags 0x0
key 4B value 1B max_entries 64 memlock 8192B
#
13) And finally lets dump the 'pids_filtered':
# bpftool map dump id 157
[{
"key": 3237,
"value": true
},{
"key": 26435,
"value": true
}
]
#
Looks much better! BTF info was used to interpret the key as an integer
and the value as a struct with just one boolean member, so to make it
more compact, show just the 'true' value where we saw '01'.
Now to make 'perf trace --dump-map' to use BTF!
Cc: Adrian Hunter <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Andrii Nakryiko <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Luis Cláudio Gonçalves <[email protected]>
Cc: Martin KaFai Lau <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Yonghong Song <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This replaces all open encodings in tools with NUMA_NO_NODE. Also
linux/numa.h is now needed for the perf build.
[[email protected]: fix for replace open encodings for NUMA_NO_NODE]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Doug Ledford <[email protected]> [drivers/infiniband]
Cc: Hans Verkuil <[email protected]>
Cc: Jeff Kirsher <[email protected]> [ixgbe]
Cc: Jens Axboe <[email protected]> [mtip32xx]
Cc: Joseph Qi <[email protected]>
Cc: Michael Ellerman <[email protected]> [powerpc]
Cc: Vinod Koul <[email protected]> [dmaengine.c]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|