|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Thomas Gleixner:
"This update contains:
- a fix for the bpf tools to use the new EM_BPF code
- a fix for the module parser of perf to retrieve the
proper text start address
- add str_error_c to libapi to avoid linking against
tools/lib/str_error_r.o"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
tools lib api: Add str_error_c to libapi
perf s390: Fix 'start' address of module's map
tools lib bpf: Use official ELF e_machine value
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
- Replace pcommit with ADR / directed-flushing.
The pcommit instruction, which has not shipped on any product, is
deprecated. Instead, the requirement is that platforms implement
either ADR, or provide one or more flush addresses per nvdimm.
ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers
to the memory controller on a power-fail event.
Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware
Interface Table (NFIT) sub-structure: "Flush Hint Address Structure".
A flush hint is an mmio address that when written and fenced assures
that all previous posted writes targeting a given dimm have been
flushed to media.
- On-demand ARS (address range scrub).
Linux uses the results of the ACPI ARS commands to track bad blocks
in pmem devices. When latent errors are detected we re-scrub the
media to refresh the bad block list; userspace can also request a
re-scrub at any time.
- Support for the Microsoft DSM (device specific method) command
format.
- Support for EDK2/OVMF virtual disk device memory ranges.
- Various fixes and cleanups across the subsystem.
* tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits)
libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register"
nfit: do an ARS scrub on hitting a latent media error
nfit: move to nfit/ sub-directory
nfit, libnvdimm: allow an ARS scrub to be triggered on demand
libnvdimm: register nvdimm_bus devices with an nd_bus driver
pmem: clarify a debug print in pmem_clear_poison
x86/insn: remove pcommit
Revert "KVM: x86: add pcommit support"
nfit, tools/testing/nvdimm/: unify shutdown paths
libnvdimm: move ->module to struct nvdimm_bus_descriptor
nfit: cleanup acpi_nfit_init calling convention
nfit: fix _FIT evaluation memory leak + use after free
tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
tools/testing/nvdimm: add virtual ramdisk range
acpi, nfit: treat virtual ramdisk SPA as pmem region
pmem: kill __pmem address space
pmem: kill wmb_pmem()
libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes
fs/dax: remove wmb_pmem()
libnvdimm, pmem: flush posted-write queues on shutdown
...
|
|
After the previous patch, we can distinguish costly allocations that
should be really lightweight, such as THP page faults, with
__GFP_NORETRY. This means we don't need to recognize khugepaged
allocations via PF_KTHREAD anymore. We can also change THP page faults
in areas where madvise(MADV_HUGEPAGE) was used to try as hard as
khugepaged, as the process has indicated that it benefits from THP's and
is willing to pay some initial latency costs.
We can also make the flags handling less cryptic by distinguishing
GFP_TRANSHUGE_LIGHT (no reclaim at all, default mode in page fault) from
GFP_TRANSHUGE (only direct reclaim, khugepaged default). Adding
__GFP_NORETRY or __GFP_KSWAPD_RECLAIM is done where needed.
The patch effectively changes the current GFP_TRANSHUGE users as
follows:
* get_huge_zero_page() - the zero page lifetime should be relatively
long and it's shared by multiple users, so it's worth spending some
effort on it. We use GFP_TRANSHUGE, and __GFP_NORETRY is not added.
This also restores direct reclaim to this allocation, which was
unintentionally removed by commit e4a49efe4e7e ("mm: thp: set THP defrag
by default to madvise and add a stall-free defrag option")
* alloc_hugepage_khugepaged_gfpmask() - this is khugepaged, so latency
is not an issue. So if khugepaged "defrag" is enabled (the default), do
reclaim via GFP_TRANSHUGE without __GFP_NORETRY. We can remove the
PF_KTHREAD check from page alloc.
As a side-effect, khugepaged will now no longer check if the initial
compaction was deferred or contended. This is OK, as khugepaged sleep
times between collapse attempts are long enough to prevent noticeable
disruption, so we should allow it to spend some effort.
* migrate_misplaced_transhuge_page() - already was masking out
__GFP_RECLAIM, so just convert to GFP_TRANSHUGE_LIGHT which is
equivalent.
* alloc_hugepage_direct_gfpmask() - vma's with VM_HUGEPAGE (via madvise)
are now allocating without __GFP_NORETRY. Other vma's keep using
__GFP_NORETRY if direct reclaim/compaction is at all allowed (by default
it's allowed only for madvised vma's). The rest is conversion to
GFP_TRANSHUGE(_LIGHT).
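The flag selection described in the list above can be sketched as a small model. This is only an illustration, not the kernel code: the `Gfp` enum, its bit values, and the function names `direct_gfpmask` and `khugepaged_gfpmask` are hypothetical, the kswapd-only mode is left out, and the behaviour when khugepaged "defrag" is disabled is an assumption.

```python
from enum import Flag, auto


class Gfp(Flag):
    # Symbolic stand-ins for kernel gfp bits; the values are arbitrary.
    BASE = auto()            # all non-reclaim bits of the THP mask (assumed)
    DIRECT_RECLAIM = auto()  # models __GFP_DIRECT_RECLAIM
    NORETRY = auto()         # models __GFP_NORETRY


# GFP_TRANSHUGE_LIGHT: no reclaim at all.
GFP_TRANSHUGE_LIGHT = Gfp.BASE
# GFP_TRANSHUGE: the light mask plus direct reclaim only.
GFP_TRANSHUGE = GFP_TRANSHUGE_LIGHT | Gfp.DIRECT_RECLAIM


def direct_gfpmask(vma_madvised, direct_reclaim_allowed):
    """Model of the page fault path (alloc_hugepage_direct_gfpmask)."""
    if not direct_reclaim_allowed:
        return GFP_TRANSHUGE_LIGHT
    if vma_madvised:
        # madvise(MADV_HUGEPAGE) areas try as hard as khugepaged.
        return GFP_TRANSHUGE
    # Other vmas stay lightweight by adding __GFP_NORETRY.
    return GFP_TRANSHUGE | Gfp.NORETRY


def khugepaged_gfpmask(defrag_enabled):
    """Model of alloc_hugepage_khugepaged_gfpmask; the disabled case is assumed."""
    return GFP_TRANSHUGE if defrag_enabled else GFP_TRANSHUGE_LIGHT
```

In this model only non-madvised page faults ever carry the NORETRY bit, which matches the intent that such faults stay cheap while madvised areas and khugepaged may spend more effort.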
[[email protected]: suggested GFP_TRANSHUGE_LIGHT]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vlastimil Babka <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Because libapi uses str_error_r(), every tool using libapi would otherwise
need to link against tools/lib/str_error_r.o.
This fixes the build of tools/vm/, which links with libapi.
Reported-by: Arjan van de Ven <[email protected]>
Reported-by: Randy Dunlap <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: b31e3e3316a7 ("tools lib api fs: Use str_error_r()")
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Pull networking updates from David Miller:
1) Unified UDP encapsulation offload methods for drivers, from
Alexander Duyck.
2) Make DSA binding more sane, from Andrew Lunn.
3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.
4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.
5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
packets as soon as the device sees them, with the option to mirror
the packet on TX via the same interface. From Brenden Blanco and
others.
6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.
7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.
8) Simplify netlink conntrack entry layout, from Florian Westphal.
9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
Schimmel, Yotam Gigi, and Jiri Pirko.
10) Add SKB array infrastructure and convert tun and macvtap over to it.
From Michael S Tsirkin and Jason Wang.
11) Support qdisc packet injection in pktgen, from John Fastabend.
12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.
13) Add NV congestion control support to TCP, from Lawrence Brakmo.
14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.
15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.
16) Support MPLS over IPV4, from Simon Horman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
xgene: Fix build warning with ACPI disabled.
be2net: perform temperature query in adapter regardless of its interface state
l2tp: Correctly return -EBADF from pppol2tp_getname.
net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
net: ipmr/ip6mr: update lastuse on entry change
macsec: ensure rx_sa is set when validation is disabled
tipc: dump monitor attributes
tipc: add a function to get the bearer name
tipc: get monitor threshold for the cluster
tipc: make cluster size threshold for monitoring configurable
tipc: introduce constants for tipc address validation
net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
MAINTAINERS: xgene: Add driver and documentation path
Documentation: dtb: xgene: Add MDIO node
dtb: xgene: Add MDIO node
drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
drivers: net: xgene: Use exported functions
drivers: net: xgene: Enable MDIO driver
drivers: net: xgene: Add backward compatibility
drivers: net: phy: xgene: Add MDIO driver
...
|
|
At present, when creating a module's map, perf gets the 'start' address by
parsing '/proc/modules', but that is the module base address, not the
start address of the '.text' section.
On most arches this is OK. But s390 places the 'GOT' and 'PLT'
relocations before the '.text' section, so there is an offset between the
module base address and the '.text' section, which causes wrong symbol
resolution for modules.
Fix this bug by getting the 'start' address of the module's map from
'/sys/module/[module name]/sections/.text', not from '/proc/modules'.
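As a rough illustration of the lookup described above, the sketch below reads the '.text' start address from sysfs. It is not perf's C implementation: the function name `module_text_start` and the `sysfs_root` parameter are hypothetical, and it assumes the file holds a single hexadecimal address (reading it on a real system typically needs root).

```python
import os


def module_text_start(name, sysfs_root="/sys/module"):
    """Return the '.text' start address of a loaded module as an int."""
    path = os.path.join(sysfs_root, name, "sections", ".text")
    with open(path) as f:
        # Assumed content: one hex address such as "0xffffffffa0000000\n".
        return int(f.read().strip(), 16)
```

The `sysfs_root` parameter exists only so the function can be pointed at a fake directory tree for testing.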
Signed-off-by: Song Shan Gong <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: David Ahern <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This reverts commit e083a21fcac9311ca425e600a15332f4792c56cc.
Not needed at all: tools/perf/util/perf_regs.h, included via:
#include "perf_regs.h"
should already have a definition for PERF_REGS_MAX. Since this depends
on HAVE_PERF_REGS_SUPPORT, the revert fixes the build on powerpc, noticed when trying
to cross compile this from Ubuntu 16.04 with a locally built libz &
elfutils pair, since those are not available in multilib packages.
Cc: Jiri Olsa <[email protected]>
Cc: Naveen N. Rao <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The pcommit instruction is being deprecated in favor of either ADR
(asynchronous DRAM refresh: flush-on-power-fail) at the platform level, or
posted-write-queue flush addresses as defined by the ACPI 6.x NFIT (NVDIMM
Firmware Interface Table).
Cc: Thomas Gleixner <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: [email protected]
Cc: Josh Poimboeuf <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Xiao Guangrong <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ross Zwisler <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
|
|
Cross building it on Ubuntu 16.04 to ARM ends up showing we get
the free() prototype by luck in other environments, fix it.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Previous patches added support for Intel's AVX-512 instructions to the
kernel and perf tools instruction decoders.
AVX-512 instructions are documented in Intel Architecture Instruction
Set Extensions Programming Reference (February 2016).
Add a representative set of instructions to perf's "new instructions"
test. For example:
perf test "new instructions"
Or to view a particular instruction:
perf test -v "new instructions" 2>&1 | grep vbroadcasti64x4
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: X86 ML <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add support for Intel's AVX-512 instructions to perf tools instruction
decoder used by Intel PT. The kernel's instruction decoder was updated in
a previous patch.
AVX-512 instructions are documented in Intel Architecture Instruction Set
Extensions Programming Reference (February 2016).
AVX-512 instructions are identified by an EVEX prefix which, for the purpose
of instruction decoding, can be treated as though it were a 4-byte VEX
prefix.
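To make the "4-byte VEX prefix" comparison concrete, here is a minimal sketch that classifies the lead byte of a VEX or EVEX prefix by total prefix length. It is not the perf decoder: the function name `vex_prefix_length` is hypothetical, and the sketch assumes 64-bit mode, where the bytes 0xC4, 0xC5 and 0x62 are not ambiguous with legacy instructions.

```python
def vex_prefix_length(first_byte):
    """Return the prefix length in bytes for a VEX/EVEX lead byte, else 0."""
    if first_byte == 0xC5:
        return 2  # 2-byte VEX
    if first_byte == 0xC4:
        return 3  # 3-byte VEX
    if first_byte == 0x62:
        return 4  # EVEX: lead byte plus three payload bytes
    return 0      # not a VEX/EVEX prefix
```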
Existing instructions which can now accept an EVEX prefix need not be
further annotated in the op code map (x86-opcode-map.txt). In the case of
new instructions, the op code map is updated accordingly.
Also add associated Mask Instructions that are used to manipulate mask
registers used in AVX-512 instructions.
A representative set of instructions is added to the perf tools new
instructions test in a subsequent patch.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: X86 ML <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
vcvtph2ps does not have an immediate operand, so remove the erroneous
'Ib' from its opcode map entry. Add vcvtph2ps to the perf tools new
instructions test to verify it.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: H. Peter Anvin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: X86 ML <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add automated test for is_printable_array function.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Pirko <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's used from 2 objects in perf, so it's better to keep just one copy.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Pirko <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Jirka reported that the python code returns all arrays as strings. This
makes it impossible to get all items of a byte array tracepoint field
that contains a 0x00 value item.
Fix this by scanning the full length of the array and returning it as a
PyByteArray object if a non-printable byte is found.
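The decision described above can be modelled in a few lines. This is a sketch of the idea rather than the perf scripting engine: `field_to_python` is a hypothetical name, and treating an array as printable only when it ends in a NUL byte and otherwise contains printable ASCII or whitespace is an assumption about what is_printable_array checks.

```python
import string

# Printable ASCII plus whitespace, as byte values.
_PRINTABLE = set(string.printable.encode("ascii"))


def is_printable_array(data: bytes) -> bool:
    """True if data is a NUL-terminated run of printable bytes (assumed rule)."""
    if not data or data[-1] != 0:
        return False
    return all(b in _PRINTABLE for b in data[:-1])


def field_to_python(data: bytes):
    """Return a str for printable arrays, else the full array as a bytearray."""
    if is_printable_array(data):
        return data[:-1].decode("ascii")
    # Scanning the full length keeps items that follow an embedded 0x00.
    return bytearray(data)
```

With this rule a field such as `b"\x01\x00\x02\x00"` comes back as a four-item bytearray instead of being cut off at the first zero byte.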
Signed-off-by: Jiri Olsa <[email protected]>
Reported-and-Tested-by: Jiri Pirko <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Warn about an unmatched function filter correctly instead of warning about a
"symbol-loading error", since the failure can be a filter issue.
From the technical point of view, this adds a filter check in map__load:
if there is a filter, it returns -2 (filter-out) instead of -1
(error), and perf-probe checks that value and changes the message.
E.g. without this fix:
# perf probe -F rt_sp*
no symbols found in [kernel.kallsyms], maybe install a debug package?
Failed to load symbols in kernel
With this fix:
# perf probe -F rt_sp*
no symbols passed the given filter.
Failed to find symbols matched to "rt_sp*"
Error: Failed to show functions.
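The two return codes and the resulting messages can be illustrated with a toy model. Everything here is hypothetical apart from the -1/-2 values and the message texts quoted above: `map_load` and `describe` are illustrative names, and glob matching via `fnmatch` is an assumption about how a filter such as "rt_sp*" is applied.

```python
from fnmatch import fnmatch

LOAD_OK = 0
LOAD_ERROR = -1         # generic symbol-loading error
LOAD_FILTERED_OUT = -2  # a filter was given and nothing passed it


def map_load(symbols, pattern=None):
    """Model of loading symbols with an optional glob filter."""
    kept = [s for s in symbols if pattern is None or fnmatch(s, pattern)]
    if kept:
        return LOAD_OK
    return LOAD_FILTERED_OUT if pattern is not None else LOAD_ERROR


def describe(ret, pattern):
    """Pick the user-facing message based on the return code."""
    if ret == LOAD_FILTERED_OUT:
        return 'Failed to find symbols matched to "%s"' % pattern
    if ret == LOAD_ERROR:
        return "Failed to load symbols in kernel"
    return "ok"
```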
Reported-and-Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146885835596.16106.2293540792775552481.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In some cases it's necessary to figure out the map-local index of a given
Linux logical CPU ID. Add a new helper, cpu_map__idx, to acquire this.
As the logic is largely the same as the existing cpu_map__has, this is
rewritten in terms of the new helper.
At the same time, add the inverse operation, cpu_map__cpu, which yields
the logical CPU id for a map-local index. While this can be performed
manually, wrapping this in a helper can make code more legible.
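A minimal model of the two helpers and of cpu_map__has rewritten on top of the index lookup might look as follows. The cpu map is represented as a plain list of logical CPU IDs, which is an assumption made for illustration; the Python function names simply mirror the C helpers and the -1 "not found" convention is assumed.

```python
def cpu_map_idx(cpus, cpu):
    """Return the map-local index of a logical CPU ID, or -1 if absent."""
    for i, c in enumerate(cpus):
        if c == cpu:
            return i
    return -1


def cpu_map_has(cpus, cpu):
    """Membership test expressed in terms of the index helper."""
    return cpu_map_idx(cpus, cpu) != -1


def cpu_map_cpu(cpus, idx):
    """Inverse operation: logical CPU ID for a map-local index."""
    return cpus[idx]
```

For a map holding CPUs 4 to 7, `cpu_map_idx(cpus, 6)` yields index 2 and `cpu_map_cpu(cpus, 2)` yields CPU 6 again.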
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In create_perf_stat_counter, when a target CPU has not been provided, we
call __perf_evsel__open with empty_cpu_map, and open a single FD per
thread. However, in read_counter we assume that we opened events for the
product of threads and CPUs described in the evsel's cpu_map.
Thus, if an evsel has a cpu_map with more than one entry, we will
attempt to access FDs that we didn't open. This could result in a number
of problems (e.g. blocking while reading from STDIN if the fd memory
happened to be initialised to zero).
This is problematic for systems where a logical CPU PMU covers some
arbitrary subset of CPUs. The cpu_map of any evsel for that PMU will be
initialised based on the cpumask exposed through sysfs, even if the user
requests per-thread events.
Signed-off-by: Mark Rutland <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We were also using this directly from the kernel sources in the last two
remaining cases; fix it.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It hasn't been used since we made tools/ self-sufficient wrt list.h.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Josh Poimboeuf <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: d1b39d41ebec ("tools: Make list.h self-sufficient")
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Copy some more kernel files accessed from tools/, and check for drift.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
No need to copy it to a detached tarball as they aren't used anymore
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Not used anymore, remove one more file referencing kernel sources, i.e.
outside of tools/
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Not used anymore. This also stops including linux/swab.h directly
from the kernel sources; remove that reference from the MANIFEST.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It uses the likely/unlikely macros, so it needs to include
<linux/compiler.h>.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The 'info.e_machine' struct member is a uint16_t so 'm' is never less
than zero. It looks like this was maybe left over code from earlier
versions so I've just removed it.
Signed-off-by: Dan Carpenter <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20160715210836.GB19522@mwanda
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It doesn't change the runtime behavior, but my static checker complains
that curly braces were intended.
Signed-off-by: Dan Carpenter <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20160715210712.GA19522@mwanda
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When working with an overwritable ring buffer there's an inconvenient
problem: if perf dumps data a long time after it starts,
non-sample events may be lost, which leaves the following 'perf report' unable
to identify the process name and mmap layout. For example:
# perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output \
dd if=/dev/zero of=/dev/null
Send SIGUSR2 after dd has run long enough. The resulting perf.data has lost
the correct comm and mmap events:
# perf script -i perf.data.2016061522374354
perf 24478 [004] 2581325.601789: raw_syscalls:sys_exit: NR 0 = 512
^^^^
Should be 'dd'
27b2e8 syscall_slow_exit_work+0xfe2000e3 (/lib/modules/4.6.0-rc3+/build/vmlinux)
203cc7 do_syscall_64+0xfe200117 (/lib/modules/4.6.0-rc3+/build/vmlinux)
b18d83 return_from_SYSCALL_64+0xfe200000 (/lib/modules/4.6.0-rc3+/build/vmlinux)
7f47c417edf0 [unknown] ([unknown])
^^^^^^^^^^^^
Fail to unwind
This patch provides a '--tail-synthesize' option, which allows perf to collect
system status when finalizing the output file. In the resulting output file, the
non-sample events reflect the system status at the time the data is dumped.
After this patch:
# perf record -m 4 -e raw_syscalls:* -g --overwrite --switch-output --tail-synthesize \
dd if=/dev/zero of=/dev/null
# perf script -i perf.data.2016061600544998
dd 27364 [004] 2583244.994464: raw_syscalls:sys_enter: NR 1 (1, ...
^^
Correct comm
203a18 syscall_trace_enter_phase2+0xfe2001a8 ([kernel.kallsyms])
203aa5 syscall_trace_enter+0xfe200055 ([kernel.kallsyms])
203caa do_syscall_64+0xfe2000fa ([kernel.kallsyms])
b18d83 return_from_SYSCALL_64+0xfe200000 ([kernel.kallsyms])
d8e50 __GI___libc_write+0xffff01d9639f4010 (/tmp/oxygen_root-w00229757/lib64/libc-2.18.so)
^^^^^
Correct unwind
This option doesn't aim to solve this problem completely. If a process
terminates before SIGUSR2, we still lose its COMM and MMAP events. For
example, we can't unwind correctly from the final perf.data we get from
the previous example, because when perf collects the final output file
(when we press C-c), 'dd' has been terminated so its '/proc/<pid>/maps'
becomes empty.
However, this is a cheaper choice. To completely solve this problem we
need to continuously output non-sample events. To satisfy the
requirement of daemonization, we need to merge them periodically. It is
possible but requires much more code and cycles.
Automatically select --tail-synthesize when --overwrite is provided.
Signed-off-by: Wang Nan <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the write_backward attribute is set, records are written into the kernel
ring buffer from end to beginning, but read from beginning to end.
To avoid the 'XX out of order events recorded' warning message (timestamps
of records are in reverse order when using write_backward), suppress the
warning message if write_backward is selected by at least one event.
Result:
Before this patch:
# perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
300+0 records in
300+0 records out
153600 bytes (154 kB) copied, 0.000601617 s, 255 MB/s
[ perf record: Woken up 5 times to write data ]
Warning:
40 out of order events recorded.
[ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]
After this patch:
# perf record -m 1 -e raw_syscalls:sys_exit/overwrite/ \
-e raw_syscalls:sys_enter \
dd if=/dev/zero of=/dev/null count=300
300+0 records in
300+0 records out
153600 bytes (154 kB) copied, 0.000644873 s, 238 MB/s
[ perf record: Woken up 5 times to write data ]
[ perf record: Captured and wrote 0.096 MB perf.data (696 samples) ]
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: He Kuang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch allows the following config terms and option:
Globally set events to overwrite:
# perf record --overwrite ...
Set specific events to overwrite or no-overwrite:
# perf record --event cycles/overwrite/ ...
# perf record --event cycles/no-overwrite/ ...
Add missing config terms and update the config term array size because
the longest string length has changed.
For overwritable events, it automatically selects attr.write_backward
since perf requires it to be backward for reading.
Test result:
# perf record --overwrite -e syscalls:*enter_nanosleep* usleep 1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]
# perf evlist -v
syscalls:sys_enter_nanosleep: type: 2, size: 112, config: 0x134, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|RAW, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, write_backward: 1
# Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events
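How the global option and the per-event terms might combine can be sketched as below. This is a toy model only: the function name `resolve_event_attr` and the dictionary result are hypothetical, and letting a per-event term take precedence over the global --overwrite option is an assumption, not something the commit message states. The one rule taken from the text is that an overwritable event also selects write_backward.

```python
def resolve_event_attr(global_overwrite, term=None):
    """Resolve overwrite/write_backward for one event.

    term is None, "overwrite" or "no-overwrite" (the per-event config term).
    """
    if term == "overwrite":
        overwrite = True
    elif term == "no-overwrite":
        overwrite = False
    elif term is None:
        overwrite = global_overwrite  # fall back to the global option
    else:
        raise ValueError("unknown config term: %r" % (term,))
    # Overwritable events are read backward, so write_backward follows.
    return {"overwrite": overwrite, "write_backward": overwrite}
```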
Signed-off-by: Wang Nan <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: He Kuang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There's no user of these two functions outside evlist.c. Remove them from
the public namespace.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Drive the evlist->bkw_mmap_state state machine during draining and when
SIGUSR2 is received. Read the backward ring buffer in record__mmap_read_all.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: He Kuang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce a bkw_mmap_state state machine to evlist:
                     .________________(forbid)_____________.
                     |                                     V
 NOTREADY --(0)--> RUNNING --(1)--> DATA_PENDING --(2)--> EMPTY
                     ^  ^              |   ^               |
                     |  |__(forbid)____/   |___(forbid)___/|
                     |                                     |
                      \_________________(3)_______________/
NOTREADY : Backward ring buffers are not ready
RUNNING : Backward ring buffers are recording
DATA_PENDING : We are required to collect data from backward ring buffers
EMPTY : We have collected data from backward ring buffers.
(0): Setup backward ring buffer
(1): Pause ring buffers for reading
(2): Read from ring buffers
(3): Resume ring buffers for recording
We can't avoid this complexity. Since we deliberately drop records from
the overwritable ring buffer, there's no way for us to check what remains in
the ring buffer itself (by checking head and old pointers). Therefore, we
need the DATA_PENDING and EMPTY states to help us record what we have done
to the ring buffer.
In record__mmap_read_evlist(), drive this state machine from DATA_PENDING
to EMPTY.
In perf_evlist__mmap_per_evsel(), drive this state machine from NOTREADY
to RUNNING when creating backward mmap.
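The state machine above can be modelled as a transition table. This is a sketch under assumptions, not the evlist code: the class and attribute names are hypothetical, step (3) is read as leading from EMPTY back to RUNNING, and any transition not listed among steps (0) to (3) is treated as forbidden.

```python
from enum import Enum


class BkwMmapState(Enum):
    NOTREADY = "notready"
    RUNNING = "running"
    DATA_PENDING = "data_pending"
    EMPTY = "empty"


S = BkwMmapState

# Allowed transitions keyed by (old, new); the value names the numbered step.
ALLOWED = {
    (S.NOTREADY, S.RUNNING): "(0) set up backward ring buffer",
    (S.RUNNING, S.DATA_PENDING): "(1) pause ring buffers for reading",
    (S.DATA_PENDING, S.EMPTY): "(2) read from ring buffers",
    (S.EMPTY, S.RUNNING): "(3) resume ring buffers for recording",
}


class BackwardMmapModel:
    """Tracks the state and rejects transitions marked as forbidden."""

    def __init__(self):
        self.state = S.NOTREADY

    def toggle(self, new_state):
        step = ALLOWED.get((self.state, new_state))
        if step is None:
            raise ValueError(
                "forbidden transition: %s -> %s"
                % (self.state.name, new_state.name)
            )
        self.state = new_state
        return step
```

Keeping DATA_PENDING and EMPTY as separate states is what lets the model remember whether the paused buffer has already been read, which the head and old pointers alone cannot tell for an overwritable buffer.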
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now there's no real user of evlist->backward. Drop it. We are going to
use evlist->backward_mmap as a container for backward ring buffer.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In perf_evlist__mmap_per_evsel(), select backward_mmap for backward
events. Utilize new perf_mmap APIs. Dynamically alloc backward_mmap.
Remove useless functions.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add backward_mmap to evlist, free it together with normal mmap.
Improve perf_evlist__pick_pc(), search backward_mmap if evlist->mmap is
not available.
This patch doesn't alloc this array. It will be allocated conditionally
in the following commits.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In perf_evlist__mmap_per_cpu() and perf_evlist__mmap_per_thread(), in
case of mmap failure, successfully created maps should be cleared.
Current code uses two loops calling __perf_evlist__munmap() for each
function.
This patch extracts the common code into perf_evlist__munmap_nofree() and
uses the previously introduced decoupled API perf_mmap__munmap(). Now
__perf_evlist__munmap() can be removed because it has no users.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Instead of saving an index into the fdarray entry's private field, save
the corresponding 'struct perf_mmap' pointer, and release it directly
using perf_mmap__put().
Following commits introduce multiple mmap arrays to evlist. Without this
patch, perf_evlist__munmap_filtered() is unable to retrieve the correct
'struct perf_mmap' pointer.
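A minimal sketch of why a pointer is more robust than an index once there is more than one array. All names here (demo_map, demo_priv, demo_fd_entry and the two helpers) are hypothetical stand-ins, not the perf structures themselves.

```c
#include <stddef.h>

/* Toy refcounted map; stands in for 'struct perf_mmap'. */
struct demo_map {
	int refcnt;
};

/* Per-fd private slot: an index only makes sense for one array,
 * a pointer identifies the map regardless of which array owns it. */
union demo_priv {
	int idx;
	void *ptr;
};

struct demo_fd_entry {
	int fd;
	union demo_priv priv;
};

/* Remember the map in the fd entry and take a reference on it. */
void demo_entry__attach(struct demo_fd_entry *entry, struct demo_map *map)
{
	map->refcnt++;
	entry->priv.ptr = map;
}

/* Drop the reference through the stored pointer; no array lookup needed. */
void demo_entry__release(struct demo_fd_entry *entry)
{
	struct demo_map *map = entry->priv.ptr;

	if (map != NULL) {
		map->refcnt--;
		entry->priv.ptr = NULL;
	}
}
```

With an index, the release path would have to know whether it refers to the forward or the backward array; the pointer carries that information implicitly.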
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Perf evlist will have multiple mmap arrays. Update record__mmap_read():
it should read from 'struct perf_mmap' directly.
Also, make record__mmap_read() ready to read from backward ring buffer.
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently, the evlist mmap related helpers and APIs accept evlist and
idx, and dereference 'struct perf_mmap' by evlist->mmap[idx]. This is
unnecessary, and forces each evlist to contain only one mmap array.
Following commits are going to introduce multiple mmap arrays to an
evlist. This patch refactors these APIs and helpers, and introduces
functions that accept a perf_mmap pointer directly. The new helpers and
APIs are decoupled from perf_evlist and become perf_mmap functions (so
they have the perf_mmap prefix).
Old functions are reimplemented with the new functions. Some of them will
be removed in following commits.
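The shape of this refactoring can be sketched as below. The toy_* types and functions are illustrative stand-ins with made-up fields, not the actual perf definitions; the point is only that the old (evlist, idx) entry point becomes a thin wrapper around a helper that takes the map pointer.

```c
#include <stddef.h>

/* Toy stand-ins for the perf structures; field names are illustrative. */
struct toy_mmap {
	unsigned long prev;   /* last consumed position */
	unsigned long head;   /* latest position written by the producer */
};

struct toy_evlist {
	struct toy_mmap *mmap;   /* the single array the old API assumed */
	size_t nr_mmaps;
};

/* New-style helper: works on any map, whichever array owns it. */
void toy_mmap__consume(struct toy_mmap *map)
{
	map->prev = map->head;
}

/* Old-style entry point, reimplemented on top of the new helper. */
void toy_evlist__mmap_consume(struct toy_evlist *evlist, size_t idx)
{
	if (idx < evlist->nr_mmaps)
		toy_mmap__consume(&evlist->mmap[idx]);
}
```

Callers that already hold a map pointer can use the new helper directly, which is what makes a second array of maps possible later.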
Signed-off-by: Wang Nan <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: He Kuang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The evsel->overwrite indicator means an event should be put into an
overwritable ring buffer. In the current implementation, it is equal to
evsel->attr.write_backward. To reduce complexity, remove
evsel->overwrite and use evsel->attr.write_backward instead.
In addition, in __perf_evsel__open(), if the kernel doesn't support
write_backward and the user explicitly set it in evsel, don't fall back
as for other missing features, since it is meaningless to fall back to
a forward ring buffer in this case: we are unable to stably read
from a forward overwritable ring buffer.
Cc: He Kuang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nilay Vaish <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Wang Nan <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are cases where further work would be needed to overcome the fact
that neither sysconf(_SC_LEVEL1_DCACHE_LINESIZE) nor
/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size are
available in some systems (Android, for instance), so bail out when such
a situation takes place.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So far the cacheline_size is only useful for the "dcacheline" --sort
order, i.e. if that is not used, which is the norm, then the user
shouldn't care that he is running this, say, on an Android system where
sysconf(_SC_LEVEL1_DCACHE_LINESIZE) and the
/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size sysfs file
aren't available.
An upcoming patch will emit a warning only for "--sort ...,dcacheline,...".
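A minimal sketch of probing the two sources named above, falling back from one to the other and reporting 0 when neither is available. The function name probe_cacheline_size() is hypothetical, and returning 0 for "unknown" is an assumption of this sketch.

```c
#include <stdio.h>
#include <unistd.h>

#define CACHELINE_SYSFS \
	"/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size"

/*
 * Return the L1 data cache line size in bytes, or 0 if neither source
 * is available (the caller decides whether that matters).
 */
int probe_cacheline_size(void)
{
	int size = 0;
	FILE *fp;

#ifdef _SC_LEVEL1_DCACHE_LINESIZE
	/* Not every libc provides this sysconf name, hence the #ifdef. */
	long ret = sysconf(_SC_LEVEL1_DCACHE_LINESIZE);

	if (ret > 0)
		return (int)ret;
#endif
	/* Fall back to the sysfs file, which may also be missing. */
	fp = fopen(CACHELINE_SYSFS, "r");
	if (fp == NULL)
		return 0;
	if (fscanf(fp, "%d", &size) != 1 || size < 0)
		size = 0;
	fclose(fp);
	return size;
}
```

Deferring the decision to the caller matches the idea that only a sort order which actually needs the value should complain when it is 0.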
Cc: Adrian Hunter <[email protected]>
Cc: Chris Phlipot <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The Bionic libc has this definition, so don't duplicate it.
Cc: Adrian Hunter <[email protected]>
Cc: Chris Phlipot <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a basic test case for SDT event support. This test scans an SDT
event in perftools and checks whether the SDT event is correctly stored
in the buildid cache.
Here is an example:
----
$ perf test sdt -v
47: Test SDT event probing :
--- start ---
test child forked, pid 20732
Found 72 SDTs in /home/mhiramat/ksrc/linux/tools/perf/perf
Writing cache: %sdt_perf:test_target=test_target
Cache committed: 0
symbol:test_target file:(null) line:0 offset:0 return:0 lazy:(null)
test child finished with 0
---- end ----
Test SDT event probing: Ok
----
Signed-off-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146831796546.17065.1502584370844087537.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This checks whether sys/sdt.h is available or not, which is required for
DTRACE_PROBE().
We can disable this feature by passing NO_SDT=1 when building.
This flag will be used for SDT test case and further SDT events in
perftools.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146831795615.17065.17513820540591053933.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support a special SDT probe format which can omit the '%' prefix only if
the SDT group name starts with "sdt_". So, for example, both
"%sdt_libc:setjmp" and "sdt_libc:setjmp" are acceptable for perf probe
--add.
E.g. without this:
# perf probe -a sdt_libc:setjmp
Semantic error :There is non-digit char in line number.
...
With this:
# perf probe -a sdt_libc:setjmp
Added new event:
sdt_libc:setjmp (on %setjmp in /usr/lib64/libc-2.20.so)
You can now use it in all perf tools, such as:
perf record -e sdt_libc:setjmp -aR sleep 1
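The rule above (accept an explicit '%' prefix, or a group name starting with "sdt_") could be checked along these lines. The helper name looks_like_sdt_event() is hypothetical, and requiring a ':' separator for the prefix-less form is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if 'arg' should be treated as an SDT event name:
 * either it carries the explicit '%' prefix, or it has the form
 * GROUP:EVENT with a GROUP that starts with "sdt_".
 */
bool looks_like_sdt_event(const char *arg)
{
	if (arg == NULL)
		return false;
	if (arg[0] == '%')
		return true;
	/* Assumption: without '%', a GROUP:EVENT separator is required. */
	return strncmp(arg, "sdt_", 4) == 0 && strchr(arg, ':') != NULL;
}
```

Anything that fails this check would continue down the regular probe-definition parsing path.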
Suggested-by: Brendan Gregg <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146831794674.17065.13359473252168740430.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support @BUILDID or @FILE suffix for SDT events. This allows perf to add
probes on SDTs/pre-cached events on the given FILE or on the file which
has the given BUILDID (a partial BUILDID is also completed).
For example, both gcc and libstdc++ have the same SDTs, as shown below.
If you would like to add a probe on sdt_libstdcxx:catch in gcc, you can
do as below.
----
# perf list sdt | tail -n 6
sdt_libstdcxx:catch@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:catch@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
sdt_libstdcxx:rethrow@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:rethrow@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
sdt_libstdcxx:throw@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:throw@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
# perf probe -a %sdt_libstdcxx:catch@0cc
Added new event:
sdt_libstdcxx:catch (on %catch in /usr/bin/gcc)
You can now use it in all perf tools, such as:
perf record -e sdt_libstdcxx:catch -aR sleep 1
----
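A rough sketch of splitting off the '@' suffix and guessing whether it is a build-id prefix or a file. The names are hypothetical, and the "all hex digits means build-id" heuristic is an assumption made for illustration; a lookup in the build-id cache would be more reliable, since a relative file name made only of hex digits would be misclassified here.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum sdt_suffix_kind {
	SDT_SUFFIX_NONE,     /* no '@' suffix given */
	SDT_SUFFIX_FILE,     /* suffix looks like a file path */
	SDT_SUFFIX_BUILDID,  /* suffix looks like a (partial) build-id */
};

/*
 * Classify what follows the first '@' in 'spec' and store a pointer
 * to it in '*suffix' (NULL when there is no suffix).
 * Heuristic (assumption): only hex digits means build-id prefix.
 */
enum sdt_suffix_kind sdt_classify_suffix(const char *spec, const char **suffix)
{
	const char *at = strchr(spec, '@');
	const char *p;

	*suffix = NULL;
	if (at == NULL || at[1] == '\0')
		return SDT_SUFFIX_NONE;

	*suffix = at + 1;
	for (p = at + 1; *p != '\0'; p++) {
		if (!isxdigit((unsigned char)*p))
			return SDT_SUFFIX_FILE;
	}
	return SDT_SUFFIX_BUILDID;
}
```

For the example above, "%sdt_libstdcxx:catch@0cc" would be classified as a build-id prefix and "%sdt_libstdcxx:catch@/usr/bin/gcc" as a file.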
Committer note:
Doing the full sequence of steps to get the results above:
With a clean build-id cache:
[root@jouet ~]# rm -rf ~/.debug/
[root@jouet ~]# perf list sdt
List of pre-defined events (to be used in -e):
[root@jouet ~]#
No events whatsoever, then, we can add all events in gcc to the build-id
cache, doing a --add + --dry-run:
[root@jouet ~]# perf probe --dry-run --cache -x /usr/bin/gcc --add %sdt_libstdcxx:\*
Added new events:
sdt_libstdcxx:throw (on %* in /usr/bin/gcc)
sdt_libstdcxx:rethrow (on %* in /usr/bin/gcc)
sdt_libstdcxx:catch (on %* in /usr/bin/gcc)
You can now use it in all perf tools, such as:
perf record -e sdt_libstdcxx:catch -aR sleep 1
[root@jouet ~]#
It really didn't add any events, it just cached them:
[root@jouet ~]# perf probe -l
[root@jouet ~]#
We can see that it was cached as:
[root@jouet ~]# ls -la ~/.debug/usr/bin/gcc/9a0730e2bcc6d2a2003d21ac46807e8ee6bcb7c2/
total 976
drwxr-xr-x. 2 root root 4096 Jul 13 21:47 .
drwxr-xr-x. 3 root root 4096 Jul 13 21:47 ..
-rwxr-xr-x. 4 root root 985912 Jun 22 18:52 elf
-rw-r--r--. 1 root root 303 Jul 13 21:47 probes
[root@jouet ~]# file ~/.debug/usr/bin/gcc/9a0730e2bcc6d2a2003d21ac46807e8ee6bcb7c2/elf
/root/.debug/usr/bin/gcc/9a0730e2bcc6d2a2003d21ac46807e8ee6bcb7c2/elf: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=9a0730e2bcc6d2a2003d21ac46807e8ee6bcb7c2, stripped
[root@jouet ~]# cat ~/.debug/usr/bin/gcc/9a0730e2bcc6d2a2003d21ac46807e8ee6bcb7c2/probes
%sdt_libstdcxx:throw=throw
p:sdt_libstdcxx/throw /usr/bin/gcc:0x71ffd
%sdt_libstdcxx:rethrow=rethrow
p:sdt_libstdcxx/rethrow /usr/bin/gcc:0x720b8
%sdt_libstdcxx:catch=catch
p:sdt_libstdcxx/catch /usr/bin/gcc:0x7307f
%sdt_libgcc:unwind=unwind
p:sdt_libgcc/unwind /usr/bin/gcc:0x7eec0
#sdt_libstdcxx:*=%*
[root@jouet ~]#
Ok, now we can use 'perf probe' to refer to those cached entries as:
Humm, nope, doing as above we end up with:
[root@jouet ~]# perf probe -a %sdt_libstdcxx:catch
Semantic error :* is bad for event name -it must follow C symbol-naming rule.
Error: Failed to add events.
[root@jouet ~]#
But it worked at some point, lets try not using --dry-run:
Resetting everything:
# rm -rf ~/.debug/
# perf probe -d *:*
# perf probe -l
# perf list sdt
List of pre-defined events (to be used in -e):
#
Ok, now it cached everything, even things we haven't asked it to
(sdt_libgcc:unwind):
[root@jouet ~]# perf probe -x /usr/bin/gcc --add %sdt_libstdcxx:\*
Added new events:
sdt_libstdcxx:throw (on %* in /usr/bin/gcc)
sdt_libstdcxx:rethrow (on %* in /usr/bin/gcc)
sdt_libstdcxx:catch (on %* in /usr/bin/gcc)
You can now use it in all perf tools, such as:
perf record -e sdt_libstdcxx:catch -aR sleep 1
[root@jouet ~]# perf list sdt
List of pre-defined events (to be used in -e):
sdt_libgcc:unwind [SDT event]
sdt_libstdcxx:catch [SDT event]
sdt_libstdcxx:rethrow [SDT event]
sdt_libstdcxx:throw [SDT event]
[root@jouet ~]#
And we have the events in place:
[root@jouet ~]# perf probe -l
sdt_libstdcxx:catch (on execute_cfa_program+1551@../../../libgcc/unwind-dw2.c in /usr/bin/gcc)
sdt_libstdcxx:rethrow (on d_print_subexpr+280@libsupc++/cp-demangle.c in /usr/bin/gcc)
sdt_libstdcxx:throw (on d_print_subexpr+93@libsupc++/cp-demangle.c in /usr/bin/gcc)
[root@jouet ~]#
And trying to use them at least has 'perf trace --event sdt*:*' working.
Then, if we try to add the ones in libstdc++:
[root@jouet ~]# perf probe -x /usr/lib64/libstdc++.so.6 -a %sdt_libstdcxx:\*
Error: event "catch" already exists.
Hint: Remove existing event by 'perf probe -d'
or force duplicates by 'perf probe -f'
or set 'force=yes' in BPF source.
Error: Failed to add events.
[root@jouet ~]#
Doesn't work, dups, but at least this served to, unbeknownst to the user, add
the SDT probes in /usr/lib64/libstdc++.so.6!
[root@jouet ~]# perf list sdt
List of pre-defined events (to be used in -e):
sdt_libgcc:unwind [SDT event]
sdt_libstdcxx:catch@/usr/bin/gcc(9a0730e2bcc6) [SDT event]
sdt_libstdcxx:catch@/usr/lib64/libstdc++.so.6.0.22(ef2b7066559a) [SDT event]
sdt_libstdcxx:rethrow@/usr/bin/gcc(9a0730e2bcc6) [SDT event]
sdt_libstdcxx:rethrow@/usr/lib64/libstdc++.so.6.0.22(ef2b7066559a) [SDT event]
sdt_libstdcxx:throw@/usr/bin/gcc(9a0730e2bcc6) [SDT event]
sdt_libstdcxx:throw@/usr/lib64/libstdc++.so.6.0.22(ef2b7066559a) [SDT event]
[root@jouet ~]#
Now we should be able to get to the original cset comment, if we remove all
SDT events in place, not from the cache, but from the kernel, where it was set up as:
[root@jouet ~]# ls -la /sys/kernel/debug/tracing/events/sdt_libstdcxx/
total 0
drwxr-xr-x. 5 root root 0 Jul 13 22:00 .
drwxr-xr-x. 80 root root 0 Jul 13 21:56 ..
drwxr-xr-x. 2 root root 0 Jul 13 22:00 catch
-rw-r--r--. 1 root root 0 Jul 13 22:00 enable
-rw-r--r--. 1 root root 0 Jul 13 22:00 filter
drwxr-xr-x. 2 root root 0 Jul 13 22:00 rethrow
drwxr-xr-x. 2 root root 0 Jul 13 22:00 throw
[root@jouet ~]#
[root@jouet ~]# head -2 /sys/kernel/debug/tracing/events/sdt_libstdcxx/throw/format
name: throw
ID: 2059
[root@jouet ~]#
Now to remove it:
[root@jouet ~]# perf probe -d sdt_libstdc*:*
Removed event: sdt_libstdcxx:catch
Removed event: sdt_libstdcxx:rethrow
Removed event: sdt_libstdcxx:throw
[root@jouet ~]#
Which caused:
[root@jouet ~]# ls -la /sys/kernel/debug/tracing/events/sdt_libstdcxx/
ls: cannot access '/sys/kernel/debug/tracing/events/sdt_libstdcxx/': No such file or directory
[root@jouet ~]#
Ok, now we can do:
[root@jouet ~]# perf list sdt_libstdcxx:catch
List of pre-defined events (to be used in -e):
sdt_libstdcxx:catch@/usr/bin/gcc(9a0730e2bcc6) [SDT event]
sdt_libstdcxx:catch@/usr/lib64/libstdc++.so.6.0.22(ef2b7066559a) [SDT event]
[root@jouet ~]#
So, these are not really 'pre-defined events', i.e. we can't use them with
'perf record --event':
[root@jouet ~]# perf record --event sdt_libstdcxx:catch*
event syntax error: 'sdt_libstdcxx:catch*'
\___ unknown tracepoint
Error: File /sys/kernel/debug/tracing/events/sdt_libstdcxx/catch* not found.
Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
<SNIP>
[root@jouet ~]#
To have it really pre-defined we must use perf probe to get its definition from
the cache and set it up in the kernel, creating the tracepoint to _then_ use it
with 'perf record --event':
[root@jouet ~]# perf probe -a sdt_libstdcxx:catch
Semantic error :There is non-digit char in line number.
<SNIP>
Oops, there is another gotcha here, we need that pesky '%' character:
[root@jouet ~]# perf probe -a %sdt_libstdcxx:catch
Added new events:
sdt_libstdcxx:catch (on %catch in /usr/bin/gcc)
sdt_libstdcxx:catch_1 (on %catch in /usr/lib64/libstdc++.so.6.0.22)
You can now use it in all perf tools, such as:
perf record -e sdt_libstdcxx:catch_1 -aR sleep 1
[root@jouet ~]#
But then we added _two_ events, one with the name we expected, the other one
with a _1 suffix added; when doing the analysis we need to pay attention to
which one maps to which binary.
And here is where we get to the point of this patch, which is to be able to
disambiguate those definitions for 'catch' in the build-id cache, but first we need
to remove those events we just added:
[root@jouet ~]# perf probe -d %sdt_libstdcxx:catch
Oops, that didn't remove anything, we need to _remove_ that % char in this case:
[root@jouet ~]# perf probe -d sdt_libstdcxx:catch
Removed event: sdt_libstdcxx:catch
And we need to remove the other event added, i.e. I forgot to add a * at the end:
[root@jouet ~]# perf probe -d sdt_libstdcxx:catch*
Removed event: sdt_libstdcxx:catch_1
[root@jouet ~]#
Ok, disambiguating it using what is in this patch:
[root@jouet ~]# perf list sdt_libstdcxx:catch
List of pre-defined events (to be used in -e):
sdt_libstdcxx:catch@/usr/bin/gcc(9a0730e2bcc6) [SDT event]
sdt_libstdcxx:catch@/usr/lib64/libstdc++.so.6.0.22(ef2b7066559a) [SDT event]
[root@jouet ~]#
[root@jouet ~]# perf probe -a %sdt_libstdcxx:catch@9a07
Added new event:
sdt_libstdcxx:catch (on %catch in /usr/bin/gcc)
You can now use it in all perf tools, such as:
perf record -e sdt_libstdcxx:catch -aR sleep 1
[root@jouet ~]# perf probe -l
sdt_libstdcxx:catch (on execute_cfa_program+1551@../../../libgcc/unwind-dw2.c in /usr/bin/gcc)
[root@jouet ~]#
Yeah, it works! But we need to try and simplify this :-)
Update: Some aspects of this simplification take place in the following
patches.
Signed-off-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146831793746.17065.13065062753978236612.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Show SDT and pre-cached events by perf-list with "sdt". This also shows
the binary and build-id where the events are placed only when there are
same name events on different binaries.
e.g.:
# perf list sdt
List of pre-defined events (to be used in -e):
sdt_libc:lll_futex_wake [SDT event]
sdt_libc:lll_lock_wait_private [SDT event]
sdt_libc:longjmp [SDT event]
sdt_libc:longjmp_target [SDT event]
...
sdt_libstdcxx:rethrow@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:rethrow@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
sdt_libstdcxx:throw@/usr/bin/gcc(0cc207fc4b27) [SDT event]
sdt_libstdcxx:throw@/usr/lib64/libstdc++.so.6.0.20(91c7a88fdf49)
The binary path and build-id are shown in the format below:
<GROUP>:<EVENT>@<PATH>(<BUILD-ID>)
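The format above can be produced with a single snprintf() call, as in this sketch. The function name is hypothetical, and truncating the build-id to 12 characters is an assumption inferred from the sample output rather than something stated in the text.

```c
#include <stdio.h>

/* Number of build-id characters shown; 12 matches the sample output. */
#define SHORT_BUILDID_LEN 12

/*
 * Write "<GROUP>:<EVENT>@<PATH>(<BUILD-ID>)" into 'buf'.
 * Returns what snprintf() returns: the length the full string needs.
 */
int format_sdt_listing(char *buf, size_t len, const char *group,
		       const char *event, const char *path,
		       const char *build_id)
{
	return snprintf(buf, len, "%s:%s@%s(%.*s)", group, event, path,
			SHORT_BUILDID_LEN, build_id);
}
```

The "%.*s" conversion prints at most SHORT_BUILDID_LEN characters, so a full 40-character SHA-1 build-id and an already shortened one are both handled.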
Signed-off-by: Masami Hiramatsu <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20160624090646.25421.44225.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Search for SDT/cached events in all probe caches if the user doesn't pass
a binary. With this, we don't have to specify the target binary for SDT and
named cached events (which start with %).
E.g. without this, a target binary must be passed with -x.
# perf probe -x /usr/lib64/libc-2.20.so -a %sdt_libc:\*
With this change, we don't need it anymore.
# perf probe -a %sdt_libc:\*
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Ananth N Mavinakayanahalli <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/146831792812.17065.2353705982669445313.stgit@devbox
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|