Age | Commit message (Collapse) | Author | Files | Lines |
|
In the Makefile, targets install, doc and doc-install should be added to
.PHONY. Let's fix this.
Fixes: 71bb428fe2c1 ("tools: bpf: add bpftool")
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Programs and documentation not managed by package manager are generally
installed under /usr/local/, instead of the user's home directory. In
particular, `man` is generally able to find manual pages under
`/usr/local/share/man`.
bpftool generally follows perf's example, and perf installs to home
directory. However bpftool requires root credentials, so it seems
sensible to follow the more common convention of installing files under
/usr/local instead. So, make /usr/local the default prefix for
installing the binary with `make install`, and the documentation with
`make doc-install`. Also, create /usr/local/sbin if it does not exist.
Note that the bash-completion file, however, is still installed under
/usr/share/bash-completion/completions, as the default setup for bash
does not attempt to load completion files under /usr/local/.
Reported-by: David Beckett <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
The end-of-line character inside the string would break JSON compliance.
Remove it, `p_err()` already adds a '\n' character for plain output
anyway.
Fixes: 9a5ab8bf1d6d ("tools: bpftool: turn err() and info() macros into functions")
Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
If `getopt_long()` meets an unknown option, it prints its own error
message to standard error output. While this does not strictly break
JSON output, it is the only case bpftool prints something to standard
error output if JSON output is required. All other errors are printed on
standard output as JSON objects, so that an external program does not
have to parse stderr.
This is changed by setting the global variable `opterr` to 0.
Furthermore, p_err() is used to reproduce the error message in a more
JSON-friendly way, so that users still get to know what the erroneous
option is.
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
The writer is cleaned at the end of the main function, but not if the
program exits sooner in usage(). Let's keep it clean and destroy the
writer before exiting.
Destruction and actual call to exit() are moved to another function so
that clean exit can also be performed without printing usage() hints.
Fixes: d35efba99d92 ("tools: bpftool: introduce --json and --pretty options")
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
If bad or unrecognised parameters are specified after JSON output is
requested, `usage()` will try to output null JSON object before the
writer is created.
To prevent this, create the writer as soon as the `--json` option is
parsed.
Fixes: 004b45c0e51a ("tools: bpftool: provide JSON output for all possible commands")
Reported-by: Jakub Kicinski <[email protected]>
Signed-off-by: Quentin Monnet <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
|
|
Print file names of files that differ. For example, instead of:
Warning: Intel PT: x86 instruction decoder differs from kernel
print:
Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'
Reported-by: Ingo Molnar <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The PERF_RECORD_USER_ events are synthesized by the tool to assist in
processing the PERF_RECORD_ ones generated by the kernel, the printing
of that information doesn't come with a perf_sample structure, so, when
dumping the event fields using 'perf report -D' there were columns that
end up not being printed.
To tidy up a bit this, fake a perf_sample structure with zeroes to have
the missing columns printed and avoid the occasional surprise with that.
Before:
0 0x45b8 [0x68]: PERF_RECORD_MMAP -1/0: [0xffffffffc12ec000(0x4000) @ 0]: x /lib/modules/4.14.0+/kernel/fs/nls/nls_utf8.ko
0x4620 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 27820
0x4648 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x4660 [0x28]: PERF_RECORD_COMM: perf:27820/27820
0x4a58 [0x8]: PERF_RECORD_FINISHED_ROUND
447723433020976 0x4688 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4001): 27820/27820: 0xffffffff8f1b6d7a period: 1 addr: 0
After:
$ perf report -D | grep PERF_RECORD_ | head
0 0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
0 0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 32555
0 0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x148 [0x28]: PERF_RECORD_COMM: perf:32555/32555
0 0x4e8 [0x8]: PERF_RECORD_FINISHED_ROUND
448743409421205 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:32555/32555
448743409431883 0x198 [0x68]: PERF_RECORD_MMAP2 32555/32555: [0x55e11d75a000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
448743409443873 0x200 [0x70]: PERF_RECORD_MMAP2 32555/32555: [0x7f0ced316000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
448743409454790 0x270 [0x60]: PERF_RECORD_MMAP2 32555/32555: [0x7ffe84f6d000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
448743409479500 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 32555/32555: 0xffffffff8f84c7e7 period: 1 addr: 0
$
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Fixes: 9aefcab0de47 ("perf session: Consolidate the dump code")
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a tip for Node.js USDT(User-Level Statically Defined Tracing) probes
in tips.txt
Signed-off-by: Hansuk Hong <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add support for computing 'perf stat' style metrics in 'perf script'.
When using leader sampling we can get metrics for each sampling period
by computing formulas over the values of the different group members.
This allows things like fine grained IPC tracking through sampling, much
more fine grained than with 'perf stat'.
The metric is still averaged over the sampling period, it is not just
for the sampling point.
This patch adds a new metric output field for 'perf script' that uses
the existing 'perf stat' metrics infrastructure to compute any metrics
supported by 'perf stat'.
For example to sample IPC:
$ perf record -e '{ref-cycles,cycles,instructions}:S' -a sleep 1
$ perf script -F metric,ip,sym,time,cpu,comm
...
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: 7fd65937d6cc [unknown]
alsa-sink-ALC32 [000] 42815.856074: metric: 0.13 insn per cycle
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: ffffffff81655df0 __schedule
swapper [000] 42815.857961: metric: 0.23 insn per cycle
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: ffffffff8165ad0e _raw_spin_unlock_irqrestore
qemu-system-x86 [000] 42815.858130: metric: 0.46 insn per cycle
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: ffffffffa080e5f2 vmx_vcpu_run
:4972 [000] 42815.858312: metric: 0.45 insn per cycle
TopDown:
This requires disabling SMT if you have it enabled, because SMT would
require sampling per core, which is not supported.
$ perf record -e '{ref-cycles,topdown-fetch-bubbles,\
topdown-recovery-bubbles,\
topdown-slots-retired,topdown-total-slots,\
topdown-slots-issued}:S' -a sleep 1
$ perf script --header -I -F cpu,ip,sym,event,metric,period
...
[000] 121108 ref-cycles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 190350 topdown-fetch-bubbles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 2055 topdown-recovery-bubbles: ffffffff8165222e copy_user_enhanced_fast_string
[000] 148729 topdown-slots-retired: ffffffff8165222e copy_user_enhanced_fast_string
[000] 144324 topdown-total-slots: ffffffff8165222e copy_user_enhanced_fast_string
[000] 160852 topdown-slots-issued: ffffffff8165222e copy_user_enhanced_fast_string
[000] metric: 33.0% frontend bound
[000] metric: 3.5% bad speculation
[000] metric: 25.8% retiring
[000] metric: 37.7% backend bound
[000] 112112 ref-cycles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 357222 topdown-fetch-bubbles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 3325 topdown-recovery-bubbles: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 323553 topdown-slots-retired: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 270507 topdown-total-slots: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] 341226 topdown-slots-issued: ffffffff8165aec8 _raw_spin_lock_irqsave
[000] metric: 33.0% frontend bound
[000] metric: 2.9% bad speculation
[000] metric: 29.9% retiring
[000] metric: 34.2% backend bound
...
v2:
Use evsel->priv for new fields
Port to new base line, support fp output.
Handle stats in ->stats, not ->priv
Minor cleanups
Extra explanation about the use of the term 'averaging', from Andi in the
thread in the Link: tag below:
<quote Andi>
The current samples contains the sum of event counts for a sampling period.
EventA-1 EventA-2 EventA-3 EventA-4
EventB-1 EventB-2 EventC-3
gap with no events overflow
|-----------------------------------------------------------------|
period-start period-end
^ ^
| |
previous sample current sample
So EventA = 4 and EventB = 3 at the sample point
I generate a metric, let's say EventA / EventB. It applies to the whole period.
But the metric is over a longer time which does not have the same behavior. For
example the gap above doesn't have any events, while they are clustered at the
beginning and end of the sample period.
But we're summing everything together. The metric doesn't know that the gap is
different than the busy period.
That's what I'm trying to express with averaging.
</quote>
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Synthesize the per attr thread maps and cpu maps in 'perf record'.
This allows code from 'perf stat' called from 'perf script' to access
this information.
Committer testing:
Please see the PERF_RECORD_THREAD_MAP and PERF_RECORD_CPU_MAP records,
added by this patch:
$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf report -D | grep PERF_RECORD_ | head
0xe8 [0x20]: PERF_RECORD_TIME_CONV: unhandled!
0x108 [0x28]: PERF_RECORD_THREAD_MAP nr: 1 thread: 23568
0x130 [0x18]: PERF_RECORD_CPU_MAP: 0-3
0 0x148 [0x28]: PERF_RECORD_COMM: perf:23568/23568
0x570 [0x8]: PERF_RECORD_FINISHED_ROUND
445342677837144 0x170 [0x28]: PERF_RECORD_COMM exec: sleep:23568/23568
445342677847339 0x198 [0x68]: PERF_RECORD_MMAP2 23568/23568: [0x564c943a4000(0x208000) @ 0 fd:00 3147174 2566255743]: r-xp /usr/bin/sleep
445342677862450 0x200 [0x70]: PERF_RECORD_MMAP2 23568/23568: [0x7f25968a8000(0x229000) @ 0 fd:00 3151761 2566238119]: r-xp /usr/lib64/ld-2.25.so
445342677873174 0x270 [0x60]: PERF_RECORD_MMAP2 23568/23568: [0x7ffc98176000(0x2000) @ 0 00:00 0 0]: r-xp [vdso]
445342677891928 0x2d0 [0x28]: PERF_RECORD_SAMPLE(IP, 0x4002): 23568/23568: 0xffffffff8f84c7e7 period: 1 addr: 0
$
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move the code to synthesize event updates for scale/unit/cpus to a
common utility file, and use it both from stat and record.
This allows to access scale and other extra qualifiers from perf script.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The s390x CPU sampling and measurement facilities do not support perf
events of type PERF_TYPE_BREAKPOINT. The test cases are executed and
fail with -ENOENT due to missing hardware support.
Disable the execution of both test cases based on a
platform check. This is the same approach as done for
PowerPC.
Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
kernel
Remove this from check-headers.sh:
opts="--ignore-blank-lines --ignore-space-change"
as the easiest policy is to just follow the upstream UAPI header version 100%.
Pure space-only changes are comparatively rare.
Signed-off-by: Ingo Molnar <[email protected]>
Cc: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
These can run on certain kernel configs. This will allow
us later to enable these tests under the right kernel
configurations.
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
These cannot run on all kernel builds. This will help us later
skip this test on kernel configs where non-applicable.
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
This cannot run on all kernel builds. This will help us later
skip this test on kernel configs where non-applicable.
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
The file /sys/module/firmware_class/parameters/path can be used
to set a custom firmware path. The fw_filesystem.sh script creates
a temporary directory to add a test firmware file to be used during
testing, in order for this to work it uses the custom path syfs file
and it was supposed to reset back the file on execution exit. The
script failed to do this due to a typo, it was using OLD_PATH instead
of OLD_FWPATH, since its inception since v3.17.
Its not as easy to just keep the old setting, it turns out that
resetting an empty setting won't actually do what we want, we need
to check if it was empty and set an empty space.
Without this we end up having the temporary path always set after
we run these tests.
Fixes: 0a8adf58475 ("test: add firmware_class loader test")
Signed-off-by: Luis R. Rodriguez <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Because %p prints "(null)" and %pK prints "0000000000000000" or (on
32-bit systems) "00000000", this commit adjusts torture-test scripting
accordingly.
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
To add support for the MAP_SYNC flag introduced in:
b6fb293f2497 ("mm: Define MAP_SYNC and VM_SYNC flags")
Update tools/perf/trace/beauty/mmap.c to support that flag.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman.h' differs from latest version at 'include/uapi/asm-generic/mman.h'
Cc: Adrian Hunter <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up changes from:
2d2123bc7c7f ("arm64/sve: Add prctl controls for userspace vector length management")
7582e22038a2 ("arm64/sve: Backend logic for setting the vector length")
That showed a limitation of the regexp used in tools/perf/trace/beauty/prctl_option.sh,
that matches only PR_{SET,GET}_, but should match a few more, like
PR_MPX_*, PR_CAP_* and the one added by the above commit, PR_SVE_SET_*.
This silences this warning when building tools/perf:
Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
Support for those extra prctl options should be left for the next merge
window tho.
Cc: Adrian Hunter <[email protected]>
Cc: Dave Martin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up changes from these csets:
da9a1446d248 ("KVM: s390: provide a capability for AIS state migration")
5c5196da4e96 ("KVM: arm/arm64: Support EL1 phys timer register access in set/get reg")
None of which affects buildint tools/perf/.
Cc: Adrian Hunter <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christoffer Dall <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up the changes from these csets:
bf64e0b00e1f ("drm/i915: Expand I915_PARAM_HAS_SCHEDULER into a capability bitmask")
ac14fbd460d0 ("drm/i915/scheduler: Support user-defined priorities")
822a4b673284 ("drm/i915: Don't use BIT() in UAPI section")
3fd3a6ffe279 ("drm/i915: Simplify i915_reg_read_ioctl")
None of them affects how the tools are built, this os done just to
silence this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
Cc: Adrian Hunter <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To pick up the new ioctls added in these csets:
3064abfa932b ("drm: Add CRTC_GET_SEQUENCE and CRTC_QUEUE_SEQUENCE ioctls [v3]")
62884cd386b8 ("drm: Add four ioctls for managing drm mode object leases [v7]")
That will be automatically decoded (the ioctl cmd parameter, the structs
will be supported when we start using eBPF for that, which is in the
works).
This silences this warning when building tools/perf:
Warning: Kernel ABI header at 'tools/include/uapi/drm/drm.h' differs from latest version at 'include/uapi/drm/drm.h'
Cc: Adrian Hunter <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Keith Packard <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To get the changes in the 085b30625e39 ("perf/core: Add
PERF_AUX_FLAG_COLLISION to report colliding samples") commit, that will
be eventually used by perf to handle the ARM SPE architecture.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Two more, that were just in perf/core and thus weren't covered by Ingo's
latest headers synch, kcmp.h and prctl.h, silencing this:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kcmp.h' differs from latest version at 'include/uapi/linux/kcmp.h'
Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Two x86 headers got modified in this merge window:
arch/x86/include/asm/cpufeatures.h
arch/x86/include/asm/disabled-features.h
To support x86 UMIP feature, to add new AVX instructions, plus cleanups.
None of those changes have an effect on tooling, so do a plain copy.
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
There are just a few new defines which do not affect perf tools.
Signed-off-by: Adrian Hunter <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Test case 21 (Number of exit events of a simple workload) fails on
s390x. The reason is the invalid sample frequency supplied for this
test. On s390x the minimum sample frequency is much higher (see output
of /proc/service_levels).
Supply a save sample frequency value for s390x to fix this. The value
will be adjusted by the s390x CPUMF frequency convertion function to a
value well below the sysctl kernel.perf_event_max_sample_rate value.
Signed-off-by: Thomas Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Certain systems are designed to have sparse/discontiguous nodes. On
such systems, 'perf bench numa' hangs, shows wrong number of nodes and
shows values for non-existent nodes. Handle this by only taking nodes
that are exposed by kernel to userspace.
Signed-off-by: Satheesh Rajendran <[email protected]>
Reviewed-by: Srikar Dronamraju <[email protected]>
Acked-by: Naveen N. Rao <[email protected]>
Link: http://lkml.kernel.org/r/1edbcd353c009e109e93d78f2f46381930c340fe.1511368645.git.sathnaga@linux.vnet.ibm.com
Signed-off-by: Balamuruhan S <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no need for SA_SIGINFO data in SIGWINCH handler, switching it to
register the handler via signal interface as we do for the rest of the
signals in perf top.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The stdio perf top crashes when we change the terminal
window size. The reason is that we assumed we get the
perf_top pointer as a signal handler argument which is
not the case.
Changing the SIGWINCH handler logic to change global
resize variable, which is checked in the main thread
loop.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Ravi Bangoria <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If all events have attr.exclude_kernel set, no need to look at
kptr_restrict.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If we're not sampling the kernel, we shouldn't care about kptr_restrict
neither synthesize anything for assisting in resolving kernel samples,
like the reference relocation symbol or kernel modules information.
Before:
$ cat /proc/sys/kernel/kptr_restrict /proc/sys/kernel/perf_event_paranoid
2
2
$ perf record sleep 1
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,
check /proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux
file is not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved
even with a suitable vmlinux or kallsyms file.
Couldn't record kernel reference relocation symbol
Symbol resolution may be skewed if relocation was used (e.g. kexec).
Check /proc/kallsyms permission or run as root.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (8 samples) ]
$ perf evlist -v
cycles:uppp: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|PERIOD, disabled: 1, inherit: 1, exclude_kernel: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1
$
After:
$ perf record sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.001 MB perf.data (10 samples) ]
$
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If none of the evsels has attr.exclude_kernel set to zero, no kernel
samples, so no point in warning the user about problems in processing
kernel samples, as there will be none.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The warning about kptr_restrict needs to be emitted only when it is set
and we ask for kernel space samples, so add a helper to help with that.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The 'perf test' case "probe libc's inet_pton & backtrace it with ping"
fails on s390x. The reason is the 'realpath /lib64/ld*.so.* | uniq' line
which returns 2 libraries:
root@s35lp76 shell]# realpath /lib64/ld*.so.* | uniq
/usr/lib64/ld-2.26.so
/usr/lib64/ld_pre_smc.so.1.0.1
[root@s35lp76 shell]
This output makes the "perf probe" command lines invalid.
Use ldd tool to find out the libraries required by "bash" and check if
symbol "inet_pton" is part of the "libc" library. Some distros do not
have a /lib64 directory.
I have also added a check for the existence of an IPv6 network interface
before it is being used.
Committer changes:
We can't really use ldd for libc, as in some systems, such as x86_64, it
has hardlinks and then ldd sees one and the kernel the other, so grep
for libc in /proc/self/maps to get the one we'll receive from
PERF_RECORD_MMAP.
Thomas checked this change and acked it.
Signed-off-by: Thomas-Mich Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Suggested-by: Hendrik Brückner <[email protected]>
Reviewed-by: Hendrik Brückner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This 'perf test' case fails on s390x. The 'touch' command on s390x uses
the 'openat' system call to open the file named on the command line:
[root@s35lp76 perf]# perf probe -l
probe:vfs_getname (on getname_flags:72@fs/namei.c with pathname)
[root@s35lp76 perf]# perf trace -e open touch /tmp/abc
0.400 ( 0.015 ms): touch/27542 open(filename:
/usr/lib/locale/locale-archive, flags: CLOEXEC) = 3
[root@s35lp76 perf]#
There is no 'open' system call for file '/tmp/abc'. Instead the 'openat'
system call is used:
[root@s35lp76 perf]# strace touch /tmp/abc
execve("/usr/bin/touch", ["touch", "/tmp/abc"], 0x3ffd547ec98
/* 30 vars */) = 0
[...]
openat(AT_FDCWD, "/tmp/abc", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = 3
[...]
On s390x the 'egrep' command does not find a matching pattern and
returns an error.
Fix this for s390x create a platform dependent command line to enable
the 'perf probe' call to listen to the 'openat' system call and get the
expected output.
Signed-off-by: Thomas-Mich Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Thomas-Mich Richter <[email protected]>
LPU-Reference: [email protected]
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are many instructions, esp on PowerPC, whose mnemonics are longer
than 6 characters. Using precision limit causes truncation of such
mnemonics.
Fix this by removing precision limit. Note that, 'width' is still 6, so
alignment won't get affected for length <= 6.
Before:
li r11,-1
xscvdp vs1,vs1
add. r10,r10,r11
After:
li r11,-1
xscvdpsxds vs1,vs1
add. r10,r10,r11
Reported-by: Donald Stence <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Taeung Song <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The commit 8e99b6d4533c changed prefixcmp() to strstart() but missed to
change the return value in some place. It makes perf help print
annoying output even for sane config items like below:
$ perf help
'.root': unsupported man viewer sub key.
...
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Taeung Song <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Sihyeon Jang <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20171114001542.GA16464@sejong
Fixes: 8e99b6d4533c ("tools include: Adopt strstarts() from the kernel")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
A recent fix for 'perf trace' introduced a bug where
machine__exit(trace->host) could be called while trace->host was still
NULL, so make this more robust by guarding against NULL, just like
free() does.
The problem happens, for instance, when !root users try to run 'perf
trace':
[acme@jouet linux]$ trace
Error: No permissions to read /sys/kernel/debug/tracing/events/raw_syscalls/sys_(enter|exit)
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
perf: Segmentation fault
Obtained 7 stack frames.
[0x4f1b2e]
/lib64/libc.so.6(+0x3671f) [0x7f43a1dd971f]
[0x4f3fec]
[0x47468b]
[0x42a2db]
/lib64/libc.so.6(__libc_start_main+0xe9) [0x7f43a1dc3509]
[0x42a6c9]
Segmentation fault (core dumped)
[acme@jouet linux]$
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vasily Averin <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: 33974a414ce2 ("perf trace: Call machine__exit() at exit")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When processing PERF_RECORD_AUXTRACE_INFO several perf_evsel entries
will be synthesized and inserted into session->evlist, eventually ending
in perf_script.tool.sample(), which ends up calling builtin-script.c's
process_event(), that expects evsel->priv to be a perf_evsel_script
object with a valid FILE pointer in fp.
So we need to intercept the processing of PERF_RECORD_AUXTRACE_INFO and
then setup evsel->priv for these newly created perf_evsel instances, do
it to fix the segfault in process_event() trying to use a NULL for that
FILE pointer.
Reported-by: Alexander Shishkin <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: yuzhoujian <[email protected]>
Fixes: a14390fde64e ("perf script: Allow creating per-event dump files")
Link: http://lkml.kernel.org/n/[email protected]
[ Merge fix by Ravi Bangoria before pushing upstream to preserv bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I forgot one conversion, which got noticed by Thomas when running:
$ perf stat -e '{cpu-clock,instructions}' kill
kill: not enough arguments
Segmentation fault (core dumped)
$
Fix it, those stats are in evsel->stats, not anymore in evsel->priv.
Reported-by: Thomas-Mich Richter <[email protected]>
Tested-by: Thomas-Mich Richter <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: e669e833da8d ("perf evsel: Restore evsel->priv as a tool private area")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently if trace_event__register_resolver() fails, we return -errno,
but we can't be sure that errno isn't zero in this case.
Signed-off-by: Andrei Vagin <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vasily Averin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The Intel PMU event aliases have a implicit period= specifier to set the
default period.
Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.
Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.
I only did it for -c/-F for now, I think that's the only case that's
broken currently.
Before:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 2000003
After:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 1000
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When we use an initial delay, e.g.: 'perf record --delay 1000', we do not
enable the events until that delay has passed after we started the workload,
including the tracking event, i.e. the one for which we have attr.mmap, etc,
enabled to ask the kernel to generate the PERF_RECORD_{MMAP,COMM,EXEC} metadata
events that will then allow us to resolve addresses in samples to the map, dso
and symbol. There will be a shadow that even synthesizing samples won't cover,
i.e. the workload that we start and other processes forking while we
wait for the initial delay to expire.
So use a dummy event to be the tracking one and make it be enabled on exec.
Before:
# perf record --delay 1000 stress --cpu 1 --timeout 5
stress: info: [9029] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [9029] successful run completed in 5s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.624 MB perf.data (15908 samples) ]
# perf script | head
:9031 9031 32001.826888: 1 cycles:ppp: ffffffff831aa30d event_function (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826893: 1 cycles:ppp: ffffffff8300d1a0 intel_bts_enable_local (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826895: 7 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826897: 103 cycles:ppp: ffffffff8300c331 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826899: 1615 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826902: 26724 cycles:ppp: ffffffff8384c6a7 native_irq_return_iret (/lib/modules/4.14.0-rc6+/build/vmlinux)
:9031 9031 32001.826913: 329739 cycles:ppp: 7fb2a5410932 [unknown] ([unknown])
:9031 9031 32001.827033: 1225451 cycles:ppp: 7fb2a5410930 [unknown] ([unknown])
:9031 9031 32001.827474: 1391725 cycles:ppp: 7fb2a5410930 [unknown] ([unknown])
:9031 9031 32001.827978: 1233697 cycles:ppp: 7fb2a5410928 [unknown] ([unknown])
#
After:
# perf record --delay 1000 stress --cpu 1 --timeout 5
stress: info: [9741] dispatching hogs: 1 cpu, 0 io, 0 vm, 0 hdd
stress: info: [9741] successful run completed in 5s
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.751 MB perf.data (15976 samples) ]
# perf script | head
stress 9742 32110.959106: 1 cycles:ppp: ffffffff831b26f6 __perf_event_task_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959110: 1 cycles:ppp: ffffffff8300c2e9 intel_pmu_handle_irq (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959112: 7 cycles:ppp: ffffffff830231e0 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959115: 101 cycles:ppp: ffffffff83023870 sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959117: 1533 cycles:ppp: ffffffff830231f8 native_sched_clock (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959119: 23992 cycles:ppp: ffffffff831b0900 ctx_sched_in (/lib/modules/4.14.0-rc6+/build/vmlinux)
stress 9742 32110.959129: 329406 cycles:ppp: 7f4b1b661930 __random_r (/usr/lib64/libc-2.25.so)
stress 9742 32110.959249: 1288322 cycles:ppp: 5566e1e7cbc9 hogcpu (/usr/bin/stress)
stress 9742 32110.959712: 1464046 cycles:ppp: 7f4b1b66179e __random (/usr/lib64/libc-2.25.so)
stress 9742 32110.960241: 1266918 cycles:ppp: 7f4b1b66195b __random_r (/usr/lib64/libc-2.25.so)
#
Reported-by: Bram Stolk <[email protected]>
Tested-by: Bram Stolk <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: 6619a53ef757 ("perf record: Add --initial-delay option")
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The evsel->idx field is used mainly to access the right bucket in
per-event arrays such as the annotation ones, but also to set
evsel->tracking, that in turn will decide what of the events will ask
for PERF_RECORD_{MMAP,COMM,EXEC} to be generated, i.e. which
perf_event_attr will have its mmap, etc fields set.
When we were adding the "dummy" event using perf_evlist__add_dummy() we
were not setting it correctly, which could result in multiple tracking
events.
Now that I'll try using a dummy event to be the tracking one when using
'perf record --delay', i.e. when we process the --delay
setting we may already have the evlist set up, like with:
perf record -e cycles,instructions --delay 1000 ./workload
We will need to add a "dummy" event, then reset evsel->tracking for the
first event, "cycles", and set it instead to the dummy one, and also
setting its attr.enable_on_exec, so that we get the PERF_RECORD_MMAP,
etc metadata events while waiting to enable the explicitely requested
events, so lets get this straight and set the right evsel->idx.
Cc: Adrian Hunter <[email protected]>
Cc: Bram Stolk <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
While reading in more than one block (50) of KVP records, the allocation
goes per block, but the reads used the total number of allocated records
(without resetting the pointer/stream). This causes the records buffer to
overrun when the refresh reads more than one block over the previous
capacity (e.g. reading more than 100 KVP records whereas the in-memory
database was empty before).
Fix this by reading the correct number of KVP records from file each time.
Signed-off-by: Paul Meyer <[email protected]>
Signed-off-by: Long Li <[email protected]>
Cc: [email protected]
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Makefiles usually come with 'install' target included so each distro
doesn't need to implement the procedure from scratch. Add it to tools/hv.
Signed-off-by: Vitaly Kuznetsov <[email protected]>
Signed-off-by: K. Y. Srinivasan <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|