Age | Commit message (Collapse) | Author | Files | Lines |
|
The Open CoreSight Decoding Library (openCSD) is a free and open library
to decode traces collected by the CoreSight hardware infrastructure.
This patch adds the required mechanic to recognise the presence of the
openCSD library on a system and set up miscellaneous flags to be used in
the compilation of the trace decoding feature.
Signed-off-by: Mathieu Poirier <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kim Phillips <[email protected]>
Cc: [email protected]
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
[ Merged missing test-libopencsd.c file, provided later by Mathieu ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Andi Kleen <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Change the Makefile and build process to no longer require audit-libs
interfaces when the architecture provides system call tables.
Committer notes:
Its not enough to hook into the NO_LIBAUDIT makefile block, we need to
define a CONFIG_TRACE that gets selected by both architectures
generating the syscall tables from the kernel headers and from detecting
the availability of libaudit.
With that in place we will not link against libaudit even if the
necessary files are available for that, in fact we will not even try to
detect its availability, speeding up a bit the feature detection phase.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: [email protected]
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Replace the errno_to_name() from the audit-libs with the newly
introduced arch_syscalls__strerrno() function.
With this change:
1. With replacing errno_to_name() from audit-libs, perf trace
does no longer require audit-lib interfaces.
2. In addition to 1, the audit-libs dependency can be removed
for architectures that support syscall tables in perf.
This is achieved in a follow-up commit.
3. With the architecture specific errno number/name mapping,
perf trace reports can work across architectures.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: [email protected]
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce a script that generates a mapping of errno numbers to their
names for each architecture that is supported by perf (i.e. has a
subdirectory in tools/perf/arch/).
The errno mapping is generated as part of the trace beautifiers and can
be used by including the trace/beauty/arch_errno_names.c file. Then,
use arch_syscalls__strerrno() to look up an errno value to obtain the
errno name (e.g. ENOENT) for a particular architecture.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: [email protected]
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
[ Make x86 be the first arch, most common, add newline to last line, fixing build on centos:5 ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is a pre-req to generate an architecture specific mapping of errno
numbers to their names. This errno mapping can be used by perf trace to
support cross-architecture trace reports and to get rid of the
audit-libs dependency.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: [email protected]
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For each arch in tools/perf/arch, grab a copy of errno.h.
This is a pre-req to generate an architecture specific mapping of errno
numbers to their names. This errno mapping can be used by perf trace to
support cross-architecture trace reports and to get rid of the
audit-libs dependency.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-by: Thomas Richter <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: [email protected]
LPU-Reference: [email protected]
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Display the state of the rest of the features (FEATURE_TESTS_EXTRA) on a
'make VF=1' build. These features are detected manually by perf's
Makefile.config so they can't be displayed with the main list, but only
after we're done in Makefile.config.
$ make VF=1
BUILD: Doing 'make -j4' parallel build
Auto-detecting system features:
... dwarf: [ on ]
... dwarf_getlocations: [ on ]
... glibc: [ on ]
... gtk2: [ on ]
SNIP
... timerfd: [ on ]
... sched_getcpu: [ on ]
... sdt: [ on ]
... setns: [ on ]
extra features:
... bionic: [ OFF ]
... compile-32: [ on ]
... compile-x32: [ OFF ]
... cplus-demangle: [ on ]
... hello: [ OFF ]
... libbabeltrace: [ on ]
... liberty: [ on ]
... liberty-z: [ on ]
... libunwind-debug-frame: [ OFF ]
... libunwind-debug-frame-arm: [ OFF ]
... libunwind-debug-frame-aarch64: [ OFF ]
SNIP
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20180109092646.GB11520@krava
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit (93d10af26bb7 perf tools: Optimize sample parsing for ordered
events) breaks intelPT trace decoding by invariably returning an error
if the event type isn't a PERF_SAMPLE_TIME.
With this patch the timestamp is initialised and processing is allowed
to continue if the error returned by function
perf_evlist__parse_sample_timestamp() is not a fault.
Signed-off-by: Mathieu Poirier <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: 93d10af26bb7 ("perf tools: Optimize sample parsing for ordered events")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I've meet a strange behavior with these commands on my gentoo box:
1: perf kmem record
2: CTRL-C to stop 1
3: perf report
4: "Enter", "Enter", "Run scripts for all samples",
"event_analyzing_sample".
Then 'perf report' says:
"
No kallsyms or vmlinux with build-id xxxx was found
/lib/modules/4.10.0+/build/vmlinux with build id xxxx not found,
continuing without symbols
".
It is strange because I am sure /lib/modules/4.10.0+/build/vmlinux is
right for perf.data.
After digging, I found out the reason is that "perf report" generates
many open fds, then "script_browse" uses popen to run "perf script"
which run out of open files.
The gentoo box has a small default value for "max open files", 1024.
Yes, "ulimit -n " with a bigger number could fix it, but I think that
using O_CLOEXEC in do_open is a better way.
Signed-off-by: Wang YanQing <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20180115050448.GA20759@udknight
[ Make sure O_CLOEXEC is available in old systems by adding a patch
just before this one, to keep this bisectable in such systems ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To be more generally available and get the build on centos:5 to
work after we use O_CLOEXEC in the next patch, in the util/dso.c file.
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Wang YanQing <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When clang is not linked with 'perf' we should just add a debug message
about that before doing the fallback to calling the external compiler.
I.e. just the "-95" warning below gets turned into a debug message:
# cat sys_enter_open.c
#include "bpf.h"
SEC("syscalls:sys_enter_open")
int func(void *ctx)
{
struct {
char *ptr;
char path[256];
} filename = {
.ptr = *((char **)(ctx + 16)),
};
int len = bpf_probe_read_str(filename.path, sizeof(filename.path), filename.ptr);
if (len > 0) {
if (len == 1)
perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
else if (len < 256)
perf_event_output(ctx, &__bpf_stdout__, BPF_F_CURRENT_CPU, &filename, len + sizeof(filename.ptr));
}
return 0;
}
# trace -e open,sys_enter_open.c
bpf: builtin compilation failed: -95, try external compiler
0.000 ( ): __bpf_stdout__:@......./proc/self/task/11160/comm..)
0.014 ( 0.116 ms): qemu-system-x8/6721 open(filename: /proc/self/task/11160/comm, flags: RDWR) = 91
2335.411 ( ): __bpf_stdout__:FB..~.../etc/resolv.conf....)
2335.421 ( 0.030 ms): chronyd/883 open(filename: /etc/resolv.conf, flags: CLOEXEC) = 5
^C#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we can get it working for TUI, where using just pr_err() would
end up making the message emitted to stderr to be erased by the TUI exit
routine restoring the terminal to its previous state.
Now we can see that trying to use a tracepoint field as one of the
--field entries isn't working:
# perf top --stdio --no-children -e syscalls:sys_enter_write --fields pid,sym,count
Error:
Unknown --fields key: `count'
Usage: perf top [<options>]
--fields <key[,keys...]>
output field(s): overhead, period, sample plus all of sort keys
#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_event__synthesize_sample()
There is never a need to synthesize a 'swapped' sample, so all callers
to perf_event__synthesize_sample() pass 'false' as the value to
'swapped'. So get rid of the unused 'swapped' parameter.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_event__synthesize_sample()
PERF_SAMPLE_CPU contains the cpu number in the first 4 bytes and the
second 4 bytes are reserved. Ensure the reserved bytes are zero in
perf_event__synthesize_sample().
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Both 'perf inject' and internal tools consume cpu endian samples, so
there is never a need to do any swapping when synthesizing samples.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In x86 architecture dependend part function get_cpuid_str() mallocs a
128 byte buffer, but does not check if the memory allocation succeeded
or not.
When the memory allocation fails, function __get_cpuid() is called with
first parameter being a NULL pointer. However this function references
its first parameter and operates on a NULL pointer which might cause
core dumps.
Signed-off-by: Thomas Richter <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Previously it was only allowed to use at most 10 time slices in 'perf
script --time'.
This patch removes this limitation.
For example, following command line is OK (12 time slices)
perf script --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12
Signed-off-by: Jin Yao <[email protected]>
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Previously it was only allowed to use at most 10 time slices in 'perf
report --time'.
This patch removes this limitation.
For example, following command line is OK (12 time slices)
perf report --stdio --time 1%/1,1%/2,1%/3,1%/4,1%/5,1%/6,1%/7,1%/8,1%/9,1%/10,1%/11,1%/12
Signed-off-by: Jin Yao <[email protected]>
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ No need to check for NULL to call free, use zfree ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Previously we use a magic number 10 to limit the number of time slices.
It's not very good.
This patch creates a new function perf_time__range_alloc() to allocate
time slices buffer. The number of buffer entries is determined by the
number of comma in string but at least it will allocate one entry even
if no comma is found.
Signed-off-by: Jin Yao <[email protected]>
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a time slices indication to the perf report header.
For example,
# perf report --stdio --time 10%
# Total Lost Samples: 0
#
# Samples: 9K of event 'cycles:ppp' (time slices: 10%)
# Event count (approx.): 8951288803
Signed-off-by: Jin Yao <[email protected]>
Suggested--by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Previously, the time percent slice needs an index to specify which one
the user wants.
It may be easier to use if the index can be omitted. So with this
patch, for example,
perf report --stdio --time 10%/1 should be equivalent to
perf report --stdio --time 10%
Signed-off-by: Jin Yao <[email protected]>
Suggested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The command line like 'perf report --stdio --time 1abc%/1' could be
accepted by perf. It looks not very good.
This patch uses strtod() to replace original atof() and check the entire
string. Now for the same command line, it would return error message
"Invalid time string".
root@skl:/tmp# perf report --stdio --time 1abc%/1
Invalid time string
Signed-off-by: Jin Yao <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The following message will be returned to user when executing 'perf
script --time' if perf data file doesn't contain the first/last sample
time.
"HINT: no first/last sample time found in perf data.
Please use latest perf binary to execute 'perf record'
(if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."
Signed-off-by: Jin Yao <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The following message will be returned to user when executing
'perf report --time' if perf data file doesn't contain the
first/last sample time.
"HINT: no first/last sample time found in perf data.
Please use latest perf binary to execute 'perf record'
(if '--buildid-all' is enabled, needs to set '--timestamp-boundary')."
Signed-off-by: Jin Yao <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When we use a global DWARF setting as in:
perf record --call-graph dwarf
According to 5c0cf22477ea ("perf record: Store data mmaps for dwarf unwind") we need
to set up some extra perf_event_attr bits.
But when we instead do a per event dwarf setting:
perf record -e cycles/call-graph=dwarf/
This was not being done, make them equivalent.
This didn't produce any output changes in my tests while fixing up loose
ends in the per-event settings, I found it just by comparing the
perf_event_attr fields trying to find an explanation for those problems.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Noel Grandin <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The per-event max-stack setting wasn't overriding the global --max-stack
setting:
# perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
0.000 probe_libc:inet_pton:(7feb7a998350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa39b6108f3f] (/usr/bin/ping)
#
Fix it:
# perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf,max-stack=2/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.073 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.073/0.073/0.073/0.000 ms
0.000 probe_libc:inet_pton:(7f1083221350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
is used
If we use:
perf trace --max-stack=4
then the syscall events will use DWARF callchains, when available
(libunwind enabled in the build) and the printing will stop at 4 levels.
When we introduced support for tracepoint events this ended up not
applying for them, fix it.
Before:
# perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms
0.000 probe_libc:inet_pton:(7fc6c2a16350))
#
After:
# perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.087 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.087/0.087/0.087/0.000 ms
0.000 probe_libc:inet_pton:(7fbf9a041350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa947cb67f3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa947cb68379] (/usr/bin/ping)
#
Reported-by: Thomas Richter <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When setting up DWARF callchains on specific events, without using
'record' or 'trace' --call-graph, but instead doing it like:
perf trace -e cycles/call-graph=dwarf/
The unwind__prepare_access() call in thread__insert_map() when we
process PERF_RECORD_MMAP(2) metadata events were not being performed,
precluding us from using per-event DWARF callchains, handling them just
when we asked for all events to be DWARF, using "--call-graph dwarf".
We do it in the PERF_RECORD_MMAP because we have to look at one of the
executable maps to figure out the executable type (64-bit, 32-bit) of
the DSO laid out in that mmap. Also to look at the architecture where
the perf.data file was recorded.
All this probably should be deferred to when we process a sample for
some thread that has callchains, so that we do this processing only for
the threads with samples, not for all of them.
For now, fix using DWARF on specific events.
Before:
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms
0.000 probe_libc:inet_pton:(7fe9597bb350))
Problem processing probe_libc:inet_pton callchain, skipping...
#
After:
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms
0.000 probe_libc:inet_pton:(7fd4aa930350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa804e51af3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa804e51b379] (/usr/bin/ping)
#
# perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms
0.000 probe_libc:inet_pton:(7f9363b9e350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffa9e8a14e0f3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffa9e8a14e1379] (/usr/bin/ping)
#
# perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms
0.000 probe_libc:inet_pton:(7f4947e1c350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa716d88ef3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa716d88f379] (/usr/bin/ping)
#
# perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms
0.000 probe_libc:inet_pton:(7fa157696350))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
getaddrinfo (/usr/lib64/libc-2.26.so)
[0xffffa9ba39c74f40] (/usr/bin/ping)
#
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When setting the "dwarf" unwinder for a specific event and not
specifying the max-stack, the attr.sample_max_stack ended up using an
uninitialized callchain_param.max_stack, fix it by using designated
initializers for that callchain_param variable, zeroing all non
explicitely initialized struct members.
Here is what happened:
# perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
perf_event_attr:
type 2
size 112
config 0x730
{ sample_period, sample_freq } 1
sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user 1
{ wakeup_events, wakeup_watermark } 1
sample_regs_user 0xff0fff
sample_stack_user 8192
sample_max_stack 50656
sys_perf_event_open failed, error -75
Value too large for defined data type
# perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
callchain: type DWARF
callchain: stack dump size 8192
perf_event_attr:
type 2
size 112
config 0x730
sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC
exclude_callchain_user 1
sample_regs_user 0xff0fff
sample_stack_user 8192
sample_max_stack 30448
sys_perf_event_open failed, error -75
Value too large for defined data type
#
Now the attr.sample_max_stack is set to zero and the above works as
expected:
# perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms
0.000 probe_libc:inet_pton:(7feb7a998350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa39b6108f3f] (/usr/bin/ping)
#
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf record' and 'perf report --dump-raw-trace' supported in this
release.
Example usage:
# perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000
# perf report --dump-raw-trace
Note that the perf.data file is portable, so the report can be run on
another architecture host if necessary.
Output will contain raw SPE data and its textual representation, such
as:
0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1
.
. ... ARM SPE data: size 2097152 bytes
. 00000000: 49 00 LD
. 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0
. 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1
. 00000014: 9a 00 00 LAT 0 XLAT
. 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1
. 00000022: 98 00 00 LAT 0 TOT
. 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000002e: 49 00 LD
. 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80
. 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1
. 00000042: 9a 00 00 LAT 0 XLAT
. 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS
. 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1
. 00000050: 98 00 00 LAT 0 TOT
. 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558
. 0000005c: 48 00 INSN-OTHER
. 0000005e: 42 02 EV RETIRED
. 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1
. 00000069: 98 00 00 LAT 0 TOT
. 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481
...
Other release notes:
- applies to acme's perf/{core,urgent} branches, likely elsewhere
- Report is self-contained within the tool.
Record requires enabling the kernel SPE driver by
setting CONFIG_ARM_SPE_PMU.
- The intel-bts implementation was used as a starting point; its
min/default/max buffer sizes and power of 2 pages granularity need to be
revisited for ARM SPE
- Recording across multiple SPE clusters/domains not supported
- Snapshot support (record -S), and conversion to native perf events
(e.g., via 'perf inject --itrace'), are also not supported
- Technically both cs-etm and spe can be used simultaneously, however
disabled for simplicity in this release
Signed-off-by: Kim Phillips <[email protected]>
Reviewed-by: Dongjiu Geng <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rob Herring <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The raw_syscalls:sys_{enter,exit} were first supported in 'perf trace',
together with minor and major page faults, then we supported
--call-graph, then --max-stack, but when the other tracepoints got
supported, and bpf, etc, I forgot to make those global call-graph
settings apply to them.
Fix it by realizing that the global --max-stack and --call-graph
settings are done via:
OPT_CALLBACK(0, "call-graph", &trace.opts,
"record_mode[,record_size]", record_callchain_help,
&record_parse_callchain_opt),
And then, when we go to parse the events in -e via:
OPT_CALLBACK('e', "event", &trace, "event",
"event/syscall selector. use 'perf list' to list available events",
trace__parse_events_option),
And trace__parse_sevents_option() calls:
struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event",
"event selector. use 'perf list' to list available events",
parse_events_option);
err = parse_events_option(&o, lists[0], 0);
parse_events_option() will override the global --call-graph and
--max-stack if the "call-graph" and/or "max-stack" terms are in the
event definition, such as in the probe_libc:inet_pton event in one of the
examples below (-e probe_libc:inet_pton/max-stack=2).
Before:
# perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
1.525 ( ): probe_libc:inet_pton:(7f77f3ac9350))
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.071 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.071/0.071/0.071/0.000 ms
1.677 ( 0.081 ms): ping/31296 sendto(fd: 3, buff: 0x55681b652720, len: 64, addr: 0x55681b650640, addr_len: 28) = 64
__libc_sendto (/usr/lib64/libc-2.26.so)
[0xffffaa97e4bc9cef] (/usr/bin/ping)
[0xffffaa97e4bc656d] (/usr/bin/ping)
[0xffffaa97e4bc7d0a] (/usr/bin/ping)
[0xffffaa97e4bca447] (/usr/bin/ping)
[0xffffaa97e4bc2f91] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa97e4bc3379] (/usr/bin/ping)
#
After:
# perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.089 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.089/0.089/0.089/0.000 ms
1.955 ( ): probe_libc:inet_pton:(7f383a311350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
[0xffffaa5d91444f3f] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa5d91445379] (/usr/bin/ping)
2.140 ( 0.101 ms): ping/32047 sendto(fd: 3, buff: 0x55a26edd0720, len: 64, addr: 0x55a26edce640, addr_len: 28) = 64
__libc_sendto (/usr/lib64/libc-2.26.so)
[0xffffaa5d9144bcef] (/usr/bin/ping)
[0xffffaa5d9144856d] (/usr/bin/ping)
[0xffffaa5d91449d0a] (/usr/bin/ping)
[0xffffaa5d9144c447] (/usr/bin/ping)
[0xffffaa5d91444f91] (/usr/bin/ping)
__libc_start_main (/usr/lib64/libc-2.26.so)
[0xffffaa5d91445379] (/usr/bin/ping)
#
Same thing for --max-stack, the global one:
# perf trace --max-stack 3 -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.097 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.097/0.097/0.097/0.000 ms
1.577 ( ): probe_libc:inet_pton:(7f32f3957350))
__inet_pton (inlined)
gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so)
__GI_getaddrinfo (inlined)
1.738 ( 0.108 ms): ping/32103 sendto(fd: 3, buff: 0x55c3132d7720, len: 64, addr: 0x55c3132d5640, addr_len: 28) = 64
__libc_sendto (/usr/lib64/libc-2.26.so)
[0xffffaa3cecf44cef] (/usr/bin/ping)
[0xffffaa3cecf4156d] (/usr/bin/ping)
#
And then setting up a global setting (dwarf, max-stack=4), that will
affect the raw_syscall:sys_enter for the 'sendto' syscall and that will
be overriden in the probe_libc:inet_pton call to just one entry.
# perf trace --max-stack=4 --call-graph dwarf -e sendto -e probe_libc:inet_pton/max-stack=1/ ping -6 -c 1 ::1
PING ::1(::1) 56 data bytes
64 bytes from ::1: icmp_seq=1 ttl=64 time=0.090 ms
--- ::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.090/0.090/0.090/0.000 ms
2.140 ( ): probe_libc:inet_pton:(7f9fe9337350))
__GI___inet_pton (/usr/lib64/libc-2.26.so)
2.283 ( 0.103 ms): ping/31804 sendto(fd: 3, buff: 0x55c7f3e19720, len: 64, addr: 0x55c7f3e17640, addr_len: 28) = 64
__libc_sendto (/usr/lib64/libc-2.26.so)
[0xffffaa380c402cef] (/usr/bin/ping)
[0xffffaa380c3ff56d] (/usr/bin/ping)
[0xffffaa380c400d0a] (/usr/bin/ping)
#
Install iputils-debuginfo to get those /usr/bin/ping addresses resolved,
those routines are not on its .dymsym nor .symtab :-)
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The construct:
if (callchain_param)
perf_evsel__config_callchain(evsel, opts, &callchain_param);
happens in several places, so make perf_evsel__config_callchain() work
just like free(NULL), do nothing if param->enabled is not set.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We need to increase output offset in each iteration, not decrease it as
we currently do.
I guess we were lucky to finish in most cases in first iteration, so the
bug never showed. However it shows a lot when working with big (~4GB)
size data.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: 9c9f5a2f1944 ("perf tools: Introduce copyfile_offset() function")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Since 75562573bab3 ("perf tools: Add support for
PERF_SAMPLE_IDENTIFIER") we don't need explicitely set
PERF_SAMPLE_IDENTIFIER, as perf_evlist__config() will do this for us,
i.e. when there are more than one evsel in an evlist, it will check if
some evsel has a sample_type different than the one on the first evsel
in the list, setting PERF_SAMPLE_IDENTIFIER in that case.
So, to simplify 'perf trace' codebase, ditch that check.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrick Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Wang Nan <[email protected]>
Link: https://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There could be different types of memory in the system. E.g normal
System Memory, Persistent Memory. To understand how the workload maps to
those memories, it's important to know the I/O statistics of them. Perf
can collect physical addresses, but those are raw data. It still needs
extra work to resolve the physical addresses. Provide a script to
facilitate the physical addresses resolving and I/O statistics.
Profile with MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS
event if any of them is available.
Look up the /proc/iomem and resolve the physical address. Provide
memory type summary.
Here is an example output:
# perf script report mem-phys-addr
Event: mem_inst_retired.all_loads:P
Memory type count percentage
---------------------------------------- ----------- -----------
System RAM 74 53.2%
Persistent Memory 55 39.6%
N/A
---
Changes since V2:
- Apply the new license rules.
- Add comments for globals
Changes since V1:
- Do not mix DLA and Load Latency. Do not compare the loads and stores.
Only profile the loads.
- Use event name to replace the RAW event
Signed-off-by: Kan Liang <[email protected]>
Reviewed-by: Andi Kleen <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Philippe Ombredanne <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The trailing semicolon is an empty statement that does no operation.
Removing it since it doesn't do anything.
Signed-off-by: Luis de Bethencourt <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joe Perches <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Commit ("d0565132605f perf evsel: Enable type checking for
perf_evsel_config_term types") assumes PERF_EVSEL__CONFIG_TERM_DRV_CFG
isn't used and as such adds a BUG_ON().
Since the enumeration type is used in macro ADD_CONFIG_TERM() the change
break CoreSight trace acquisition.
This patch restores the original code.
Signed-off-by: Mathieu Poirier <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Fixes: d0565132605f ("perf evsel: Enable type checking for perf_evsel_config_term types")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|