Age | Commit message (Collapse) | Author | Files | Lines |
|
When introducing the PERF_RECORD_MMAP2 in:
5c5e854bc760 perf tools: Add attr->mmap2 support
A check for the number of entries parsed by sscanf was introduced that
assumed all of the 8 fields needed to be correctly parsed so that
particular /proc/pid/maps line would be considered synthesizable.
That broke anon records synthesizing, as it doesn't have the 'execname'
field.
Fix it by keeping the sscanf return check, changing it to not require
that the 'execname' variable be parsed, so that the preexisting logic
can kick in and set it to '//anon'.
This should get things like JIT profiling working again.
Signed-off-by: Don Zickus <[email protected]>
Cc: Bill Gray <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Richard Fowles <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/n/[email protected]
[ commit log message is mine, dzickus reported the problem with a patch ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When perf_event_attr.mmap_data is set the kernel will generate
PERF_RECORD_MMAP events when non-exec (data, SysV mem) mmaps are
created, so we need to synthesize from /proc/pid/maps for existing
threads, as we do for exec mmaps.
Right now just 'perf record' does it, but any other tool that uses
perf_event__synthesize_thread(s|map) can request it.
Reported-by: Don Zickus <[email protected]>
Tested-by: Don Zickus <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Bill Gray <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Don Zickus <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Joe Mario <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Richard Fowles <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
They convey no information, perhaps I was bitten by some snake at some
point, complete the detox by naming the last of those arguments more
sensibly.
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This way we can later delimit a lifecycle for the COMM and map a hist to
a precise COMM:timeslice couple.
PERF_RECORD_COMM and PERF_RECORD_FORK events that don't have
PERF_SAMPLE_TIME samples can only send 0 value as a timestamp and thus
should overwrite any previous COMM on a given thread because there is no
sensible way to keep track of all the comms lifecycles in a thread
without time informations.
Signed-off-by: Frederic Weisbecker <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
[ Made it cope with PERF_RECORD_MMAP2 ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As the thread comm is going to be implemented by way of a more
complicated data structure than just a pointer to a string from the
thread struct, convert the readers of comm to use an accessor instead of
accessing it directly.
The accessor will be later overriden to support an enhanced comm
implementation.
Signed-off-by: Frederic Weisbecker <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
[ Rename thread__comm_curr() to thread__comm_str() ]
Signed-off-by: Namhyung Kim <[email protected]>
[ Fixed up some minor const pointer issues ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When introducing support for MMAP2 we considered more parts of each map
representation in /proc/PID/maps, and when disabling it we forgot to
reduce the number of expected parsed/assigned entries in the sscanf
call, fix it to expect the right number of desired fields, 5.
Reported-by: Markus Trippelsdorf <[email protected]>
Based-on-a-patch-by: Markus Trippelsdorf <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For now, we disable the extended MMAP record support (MMAP2).
We have identified cases where it would not report the correct mapping
information, clone(VM_CLONE) but with separate pids. We will revisit
the support once we find a solution for this case.
The patch changes the kernel to return EINVAL if attr->mmap2 is set. The
patch also modifies the perf tool to use regular PERF_RECORD_MMAP for
synthetic events and it also prevents the tool from requesting
attr->mmap2 mode because the kernel would reject it.
The support will be revisited once the kenrel interface is updated.
In V2, we reduce the patch to the strict minimum.
In V3, we avoid calling perf_event_open() with mmap2 set because we know
it will fail and require fallback retry.
Signed-off-by: Stephane Eranian <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20131017173215.GA8820@quad
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch adds support for the new PERF_RECORD_MMAP2 record type
exposed by the kernel. This is an extended PERF_RECORD_MMAP record.
It adds for each file-backed mapping the device major, minor number and
the inode number and generation.
This triplet uniquely identifies the source of a file-backed mapping. It
can be used to detect identical virtual mappings between processes, for
instance.
The patch will prefer MMAP2 over MMAP.
Signed-off-by: Stephane Eranian <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Cope with 314add6 "Change machine__findnew_thread() to set thread pid",
fix 'perf test' regression test entry affected,
use perf_missing_features.mmap2 to fallback to not using .mmap2 in older kernels,
so that new tools can work with kernels where this feature is not present ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The ip_event struct assumes fixed positions for ip, pid and tid. That
is no longer true with the addition of PERF_SAMPLE_IDENTIFIER. The
information is anyway in struct sample, so use that instead.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a new parameter for 'pid' to machine__findnew_thread().
Change callers to pass 'pid' when it is known.
Note that callers sometimes want to find the main thread
which has the memory maps. The main thread has tid == pid
so the usage in that case is:
machine__findnew_thread(machine, pid, pid)
whereas the usage to find the specific thread is:
machine__findnew_thread(machine, pid, tid)
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that the symbol filter is recorded on the machine there is no need
to pass it to thread__find_addr_map(). So remove it.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that the symbol filter is recorded on the machine there is no need
to pass it to thread__find_addr_location(). So remove it.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that the symbol filter is recorded on the machine there is no need
to pass it to perf_event__preprocess_sample(). So remove it.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In order to use kernel maps to read object code, those maps must be
adjusted to map to the dso file offset. Because lazy-initialization is
used, that is not done until symbols are loaded. However the maps are
first used by thread__find_addr_map() before symbols are loaded. So
this patch changes thread__find_addr() to "load" kernel maps before
using them.
Signed-off-by: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As evident from 'machine__process_fork_event()' and
'machine__process_exit_event()' the 'pid' member of struct thread is
actually the tid.
Rename 'pid' to 'tid' in struct thread accordingly.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
cppcheck reported:
[util/event.c:480]: (error) Memory leak: event
Signed-off-by: Thomas Jarosch <[email protected]>
Link: http://lkml.kernel.org/r/2717013.8dV0naNhAV@storm
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When reading those files to synthesize MMAP events. It makes the code
shorter and cleaner.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf_event__synthesize_threads routine synthesizes all the existing
threads in the system, because we don't have any kernel facilities to
ask for PERF_RECORD_{FORK,MMAP,COMM} for existing threads.
It was returning an error as soon as one thread couldn't be synthesized,
which is a bit extreme when, for instance, a forkish workload is
running, like a kernel compile.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf_tool vtable expects methods that receive perf_tool and
perf_sample entries, but for tools not interested in doing any special
processing on non PERF_RECORD_SAMPLE events, like 'perf top', and for
those not using perf_session, like 'perf trace', they were using
perf_event__process passing tool and sample paramenters that were just
not used.
Provide 'machine' methods for this purpose and make the perf_event
ones use them.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When we were processing a PERF_RECORD_EXIT event we first used
machine__findnew_thread for both the thread exiting and for its parent,
only to use just the thread struct associated with the one exiting, and
to just delete it.
If it existed, i.e. not created at this very moment in
machine__findnew_thread, it will be moved to the machine->dead_threads
linked list, because we may have hist_entries pointing to it, but if it
was created just do be deleted, it will just sit there with no
references at all.
Use the new machine__find_thread() method so that if it is not there, we
don't create it.
As a bonus the parent thread will also not be created at this point.
Create process_fork() and process_exit() helpers to use this and make
the builtins use it instead of the generic process_task(), ditched by
this patch.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
is used
Report/top commands support to only handle specific symbols with
"--symbols" option, but current code will keep those samples whose
symbol can't be resolved, which should actually be filtered.
If we run following commands:
$perf record -a tree
$perf report --symbols intel_idle -n
the output will be:
Without the patch:
==================
46.27% 156 sshd [unknown]
26.05% 48 swapper [kernel.kallsyms]
17.26% 38 tree libc-2.12.1.so
7.69% 17 tree tree
2.73% 6 tree ld-2.12.1.so
With the patch:
===============
100.00% 48 swapper [kernel.kallsyms]
Signed-off-by: Feng Tang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf defines both __used and __unused variables to use for marking
unused variables. The variable __used is defined to
__attribute__((__unused__)), which contradicts the kernel definition to
__attribute__((__used__)) for new gcc versions. On Android, __used is
also defined in system headers and this leads to warnings like: warning:
'__used__' attribute ignored
__unused is not defined in the kernel and is not a standard definition.
If __unused is included everywhere instead of __used, this leads to
conflicts with glibc headers, since glibc has a variables with this name
in its headers.
The best approach is to use __maybe_unused, the definition used in the
kernel for __attribute__((unused)). In this way there is only one
definition in perf sources (instead of 2 definitions that point to the
same thing: __used and __unused) and it works on both Linux and Android.
This patch simply replaces all instances of __used and __unused with
__maybe_unused.
Signed-off-by: Irina Tirdea <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ committer note: fixed up conflict with a116e05 in builtin-sched.c ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On some systems (e.g. Android), ALIGN is defined in system headers as
ALIGN(p). The definition of ALIGN used in perf takes 2 parameters:
ALIGN(x,a). This leads to redefinition conflicts.
Redefinition error on Android:
In file included from util/include/linux/list.h:1:0,
from util/callchain.h:5,
from util/hist.h:6,
from util/session.h:4,
from util/build-id.h:4,
from util/annotate.c:11:
util/include/linux/kernel.h:11:0: error: "ALIGN" redefined [-Werror]
bionic/libc/include/sys/param.h:38:0: note: this is the location of
the previous definition
Conflics with system defined ALIGN in Android:
util/event.c: In function 'perf_event__synthesize_comm':
util/event.c:115:32: error: macro "ALIGN" passed 2 arguments, but takes just 1
util/event.c:115:9: error: 'ALIGN' undeclared (first use in this function)
util/event.c:115:9: note: each undeclared identifier is reported only once for
each function it appears in
In order to avoid this redefinition, ALIGN is renamed to PERF_ALIGN.
Signed-off-by: Irina Tirdea <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Irina Tirdea <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Handle error from process callback and propagate back to caller.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
kallsyms__parse() takes a callback that is called on every discovered
symbol. As /proc/kallsyms does not supply symbol sizes, the callback was
simply called with end=start, faking the symbol size to 1.
All of the callbacks (there are 2) used in calls to kallsyms__parse()
are _only_ used as callbacks for kallsyms__parse().
Given that kallsyms__parse() lacks real information about what
end/length should be, don't make up a length in kallsyms__parse().
Instead have the callbacks handle guessing the length.
Also relocate a comment regarding symbol creation to the callback which
does symbol creation (kallsyms__parse() is not in general used to create
symbols).
Signed-off-by: Cody P Schafer <[email protected]>
Cc: David Hansen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Matt Hellsley <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Sukadev Bhattiprolu <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If threads in a multi-threaded process have names shorter than the main
thread the comm for the named threads is not properly terminated.
E.g., for the process 'namedthreads' where each thread is named noploop%d
where %d is the thread number:
Before:
perf script -f comm,tid,ip,sym,dso
noploop:4ads 21616 400a49 noploop (/tmp/namedthreads)
The 'ads' in the thread comm bleeds over from the process name.
After:
perf script -f comm,tid,ip,sym,dso
noploop:4 21616 400a49 noploop (/tmp/namedthreads)
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In some perf ancient versions we used '[kernel.kallsyms._text]' as the
name for the kernel map.
This got changed with commit:
perf: 'perf kvm' tool for monitoring guest performance from host
commit a1645ce12adb6c9cc9e19d7695466204e3f017fe
Author: Zhang, Yanmin <[email protected]>
and we started to use following name '[kernel.kallsyms]_text'.
This name change is important for the report code dealing with ancient
perf data. When processing the kernel map event, we need to recognize
the old naming (dont match the last ']') and initialize the kernel map
correctly.
The subsequent call to maps__set_kallsyms_ref_reloc_sym deals with the
superfluous ']' to get correct symbol name.
Cc: Corey Ashford <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This handles multithreaded processes with named threads when doing
system wide profiling: the comm for each thread is looked up allowing
them to be different from the thread group leader.
v2:
- fixed sizeof arg to perf_event__get_comm_tgid
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf does not properly handle monitoring of processes with named threads.
For example:
$ ps -C myapp -L
PID LWP TTY TIME CMD
25118 25118 ? 00:00:00 myapp
25118 25119 ? 00:00:00 myapp:worker
perf record -e cs -c 1 -fo /tmp/perf.data -p 25118 -- sleep 10
perf report --stdio -i /tmp/perf.data
100.00% myapp:worker [kernel.kallsyms] [k] perf_event_task_sched_out
The process name is set to the name of the last thread it finds for the
process.
The Problem:
perf-top and perf-record both create a thread_map of threads to be
monitored. That map is used in perf_event__synthesize_thread_map which
loops over the entries in thread_map and calls __event__synthesize_thread
to generate COMM and MMAP events.
__event__synthesize_thread calls perf_event__synthesize_comm which opens
/proc/pid/status, reads the name of the task and its thread group id.
That's all fine. The problem is that it then reads /proc/pid/task and
generates COMM events for each task it finds - but using the name found
in /proc/pid/status where pid is the thread of interest.
The end result (looping over thread_map + synthesizing comm events for
each thread each time) means the name of the last thread processed sets
the name for all threads in the process - which is not good for
multithreaded processes with named threads.
The Fix:
perf_event__synthesize_comm has an input argument (full) that decides
whether to process task entries for each pid it is passed. It currently
never set to 0 (perf_event__synthesize_comm has a single caller and it
always passes the value 1). Let's fix that.
Add the full input argument to __event__synthesize_thread which passes
it to perf_event__synthesize_comm. For thread/process monitoring set full
to 0 which means COMM and MMAP events are only generated for the pid
passed to it. For system wide monitoring set full to 1 so that COMM events
are generated for all threads in a process.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use local variable 'dso' to reduce typing a bit and rearrange the if
condition. Also NULL check of al->map in the condition is not necessary.
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that tools like 'perf test' can print the events when in verbose
mode, for instance.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To better reflect that it became the base class for all tools, that must
be in each tool struct and where common stuff will be put.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Reducing the exposure of perf_session further, so that we can use the
classes in cases where no perf.data file is created.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we don't need to have that many globals.
Next steps will remove the 'session' pointer, that in most cases is
not needed.
Then we can rename perf_event_ops to 'perf_tool' that better describes
this class hierarchy.
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_event__synthesize_mmap_events does not create anonymous mmap events
even though the kernel does. As a result an already running application
with dynamically created code will not get profiled - all samples end up
in the unknown bucket.
This patch skips any entries with '[' in the name to avoid adding events
for special regions (eg the vsyscall page). All other executable mmaps
are assumed to be anonymous and an event is synthesized.
Acked-by: Pekka Enberg <[email protected]>
Cc: Eric B Munson <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pekka Enberg <[email protected]>
Link: http://lkml.kernel.org/r/20110830091506.60b51fe8@kryten
Signed-off-by: Anton Blanchard <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway, not good to just stop the
world because some particular perf.data file is invalid, just propagate
the error to the caller.
. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.o
One of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Perf uses /proc/modules to figure out where kernel modules are loaded.
With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.
So check if kptr_restrict is non zero and don't generate the syntethic
PERF_RECORD_MMAP events for them.
Warn the user about it in perf record and in perf report.
In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.
Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.
Restricted /proc/kallsyms don't go to the buildid cache anymore.
Example:
[acme@emilia ~]$ perf record -F 100000 sleep 1
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
/proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux file is
not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved even
with a suitable vmlinux or kallsyms file.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
[acme@emilia ~]$
[acme@emilia ~]$ perf report --stdio
Kernel address maps (/proc/{kallsyms,modules}) were restricted,
check /proc/sys/kernel/kptr_restrict before running 'perf record'.
If some relocation was applied (e.g. kexec) symbols may be misresolved.
Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. .....................
#
20.24% sleep [kernel.kallsyms] [k] page_fault
20.04% sleep [kernel.kallsyms] [k] filemap_fault
19.78% sleep [kernel.kallsyms] [k] __lru_cache_add
19.69% sleep ld-2.12.so [.] memcpy
14.71% sleep [kernel.kallsyms] [k] dput
4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers
0.73% sleep [kernel.kallsyms] [k] perf_event_comm
0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$
This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).
If we remove that file from the vmlinux path:
[root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
/lib/modules/2.6.39-rc7+/build/vmlinux.OFF
[acme@emilia ~]$ perf report --stdio
[kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
not found, continuing without symbols
Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
/proc/sys/kernel/kptr_restrict before running 'perf record'.
As no suitable kallsyms nor vmlinux was found, kernel samples can't be
resolved.
Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ......
#
80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a
19.69% sleep ld-2.12.so [.] memcpy
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$
Reported-by: Stephane Eranian <[email protected]>
Suggested-by: David Miller <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Miller <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.
Fix the constant used for the shift.
This fixes a 32-bit perf top failure reported by Eric Dumazet:
Can't parse sample, err = -14
Can't parse sample, err = -14
...
Reported-and-tested-by: Eric Dumazet <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
What we want is to count the number of bits in the mask,
not some other random operation written in the middle
of the night.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
|
|
Perf can't currently trace into the vsyscall page. It looks like it was
meant to work.
Tested on 2.6.38 and today's -git.
The bug is easy to reproduce. Compile this:
int main()
{
int i;
struct timespec t;
for(i = 0; i < 10000000; i++)
clock_gettime(CLOCK_MONOTONIC, &t);
return 0;
}
and run it through perf record; perf report. The top entry shows
"[unknown]" and you can't zoom in.
It looks like there are two issues. The first is a that a test for user
mode executing in kernel space is backwards. (That's the first hunk
below). The second (I think) is that something's wrong with the code
that generates lots of little struct dso objects for different sections
-- when it runs on vmlinux it results in bogus long_name values which
cause objdump to fail.
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LPU-Reference: <[email protected]>
Signed-off-by: Andy Lutomirski <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To support multiple events we need to do these calcs per 'struct hists'
instance, and it turns out we already do that at:
__hists__add_entry
hists__inc_nr_entries
hists__calc_col_len
for all the unfiltered hist_entry instances we stash in the rb tree, so
trow away the dead code.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fixups due to rename of event_t routines from event__ to perf_event__
done in perf/core.
Conflicts:
tools/perf/builtin-record.c
tools/perf/builtin-top.c
tools/perf/util/event.c
tools/perf/util/event.h
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Jeff Moyer reported these messages:
Warning: ... trying to fall back to cpu-clock-ticks
couldn't open /proc/-1/status
couldn't open /proc/-1/maps
[ls output]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]
That lead me and David Ahern to see that something was fishy on the thread
synthesizing routines, at least for the case where the workload is started
from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
PERF_RECORD_COMM events for the thread -1, a bug.
So I investigated this and noticed that when we introduced support for
recording a process and its threads using --pid some bugs were introduced and
that the way to fix it was to instead of passing the target_tid to the event
synthesizing routines we should better pass the thread_map that has the list of
threads for a --pid or just the single thread for a --tid.
Checked in the following ways:
On a 8-way machine run cyclictest:
[root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798
T: 0 (28791) P:99 I:100 C: 25072 Min: 4 Act: 5 Avg: 6 Max: 122
T: 1 (28792) P:98 I:150 C: 16715 Min: 4 Act: 6 Avg: 5 Max: 27
T: 2 (28793) P:97 I:200 C: 12534 Min: 4 Act: 5 Avg: 4 Max: 8
T: 3 (28794) P:96 I:250 C: 10028 Min: 4 Act: 5 Avg: 5 Max: 96
T: 4 (28795) P:95 I:300 C: 8357 Min: 5 Act: 6 Avg: 5 Max: 12
T: 5 (28796) P:94 I:350 C: 7163 Min: 5 Act: 6 Avg: 5 Max: 12
T: 6 (28797) P:93 I:400 C: 6267 Min: 4 Act: 5 Avg: 5 Max: 9
T: 7 (28798) P:92 I:450 C: 5571 Min: 4 Act: 5 Avg: 5 Max: 9
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]
[root@emilia ~]#
This will create one extra thread per CPU:
[root@emilia ~]# tuna -t cyclictest -CP
thread ctxt_switches
pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
28825 OTHER 0 0xff 2169 671 cyclictest
28832 FIFO 93 6 52338 1 cyclictest
28833 FIFO 92 7 46524 1 cyclictest
28826 FIFO 99 0 209360 1 cyclictest
28827 FIFO 98 1 139577 1 cyclictest
28828 FIFO 97 2 104686 0 cyclictest
28829 FIFO 96 3 83751 1 cyclictest
28830 FIFO 95 4 69794 1 cyclictest
28831 FIFO 94 5 59825 1 cyclictest
[root@emilia ~]#
So we should expect only samples for the above 9 threads when using the
--dump-raw-trace|-D perf report switch to look at the column with the tid:
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
629 28825
110 28826
491 28827
308 28828
198 28829
621 28830
225 28831
203 28832
89 28833
[root@emilia ~]#
So for workloads started by 'perf record' seems to work, now for existing workloads,
just run cyclictest first, without 'perf record':
[root@emilia ~]# tuna -t cyclictest -CP
thread ctxt_switches
pid SCHED_ rtpri affinity voluntary nonvoluntary cmd
28859 OTHER 0 0xff 594 200 cyclictest
28864 FIFO 95 4 16587 1 cyclictest
28865 FIFO 94 5 14219 1 cyclictest
28866 FIFO 93 6 12443 0 cyclictest
28867 FIFO 92 7 11062 1 cyclictest
28860 FIFO 99 0 49779 1 cyclictest
28861 FIFO 98 1 33190 1 cyclictest
28862 FIFO 97 2 24895 1 cyclictest
28863 FIFO 96 3 19918 1 cyclictest
[root@emilia ~]#
and then later did:
[root@emilia ~]# perf record --pid 28859 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
[root@emilia ~]#
To collect 3 seconds worth of samples for pid 28859 and its children:
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
15 28859
33 28860
19 28861
13 28862
13 28863
10 28864
11 28865
9 28866
255 28867
[root@emilia ~]#
Works, last thing is to check if looking at just one of those threads also works:
[root@emilia ~]# perf record --tid 28866 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
3 28866
[root@emilia ~]#
Works too.
Reported-by: Jeff Moyer <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jeff Moyer <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
And move the event_t methods to the perf_event__ too.
No code changes, just namespace consistency.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Making the namespace more uniform.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Right now this function is only used by perf top, that uses PROT_READ
only, i.e. overwrite mode, so no PERF_RECORD_LOST events are generated,
but don't forget those events.
The patch that moved this out of perf top was made so that this routine
could be used by 'perf probe' in the uprobes patchset, so perhaps there
they need to check for LOST events and warn the user, as will be done in
the following patches that will switch 'perf top' to non overwrite mode
(mmap with PROT_READ|PROT_WRITE).
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To avoid linking more stuff in the python binding I'm working on, future
csets will make the sample type be taken from the evsel itself, but for
that we need to first have one file per cpu and per sample_type, not a
single perf.data file.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using %L[uxd] has issues in some architectures, like on ppc64. Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.
Reported by Denis Kirjanov that provided a patch for one case, I went
and changed all cases.
Reported-by: Denis Kirjanov <[email protected]>
Tested-by: Denis Kirjanov <[email protected]>
LKML-Reference: <[email protected]>
Cc: Denis Kirjanov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Pingtian Han <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For kallsyms we don't have the symbol address end, so we do an extra pass and
set the symbol end addr as being the start of the next minus one.
But this was being done just after we filtered the symbols of a
particular type (functions, variables), so the symbol end was sometimes
after what it really is.
Fixing up symbol end also was falling apart when we have symbol aliases,
then the end address of all but the last alias was being set to be
before its start.
Fix it up by checking for symbol aliases and making the kallsyms__parse
routine use the next symbol, whatever its type, as the limit for the
previous symbol, passing that end address to the callback.
This was detected by the 'perf test' synthetic paranoid regression
tests, fix it up so that even that case doesn't mislead us.
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|