Age | Commit message (Collapse) | Author | Files | Lines |
|
ipchain__fprintf_graph() casts the number of hits in a branch as an
int, which means we lose its highests bits.
This results in meaningless number of callchain hits in perf.data
that have a high number of hits recorded, typically those that have
callchain branches hits appearing more than INT_MAX. This happens
easily as those are pondered by the event period.
Reported-by: Nick Piggin <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Paul Mackerras <[email protected]>
|
|
After adding probes, perf-probe(1) reports the probes locations which include
filenames for certain cases.
But for short file names (whose length < 32), perf-probe didn't display the
name correctly. It actually skipped the first character.
Here's an example where 'icmp.c' was screwed:
$ perf probe -n -a "icmp.c;sk=*"
Add new events:
probe:icmp_push_reply (on @cmp.c)
probe:icmp_reply (on @cmp.c)
probe:icmp_reply_1 (on @cmp.c)
probe:icmp_send (on @cmp.c)
probe:icmp_send_1 (on @cmp.c)
probe:icmp_error (on @cmp.c)
probe:icmp_error_1 (on @cmp.c)
probe:icmp_error_2 (on @cmp.c)
probe:icmp_error_3 (on @cmp.c)
This patch fixes this bug in synthesize_perf_probe_point().
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This was introduced by commit fde52dbd7f71934aba4e150f3d1d51e826a08850.
Cc: H. Peter Anvin <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For kallsyms we don't have the symbol address end, so we do an extra pass and
set the symbol end addr as being the start of the next minus one.
But this was being done just after we filtered the symbols of a
particular type (functions, variables), so the symbol end was sometimes
after what it really is.
Fixing up symbol end also was falling apart when we have symbol aliases,
then the end address of all but the last alias was being set to be
before its start.
Fix it up by checking for symbol aliases and making the kallsyms__parse
routine use the next symbol, whatever its type, as the limit for the
previous symbol, passing that end address to the callback.
This was detected by the 'perf test' synthetic paranoid regression
tests, fix it up so that even that case doesn't mislead us.
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Since the libdwfl library before 0.148 fails to analyze live kernel debuginfo,
'perf probe --list' compiled with those old libdwfl sometimes crashes.
To avoid that bug, perf probe does not use libdwfl's live kernel analysis
routine when it is compiled with older libdwfl.
Side effect: perf with older libdwfl doesn't support listing probe in modules
with source code line. Those could be shown by symbol+offset.
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Steven Rostedt <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix lazy wildcard matching to ignore space after wild card.
Cc: [email protected]
Cc: Frederic Weisbecker <[email protected]>
Cc: Hitoshi Mitake <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Steven Rostedt <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently perf probe doesn't handle those incorrect syntaxes:
$ perf probe -L sched.c:++13
$ perf probe -L sched.c:-+13
$ perf probe -L sched.c:10000000000000000000000000000+13
This patches rewrites parse_line_range_desc() to handle them.
As a bonus, it reports more useful error messages instead of: "Tailing
with invalid character...".
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When listing a whole file or a function which is located at the end,
perf-probe -L output wrongly: "Source file is shorter than expected.".
This is because show_one_line() always consider EOF as an error.
This patch fixes this by not considering EOF as an error when dumping
the trailing lines. Otherwise it's still an error and perf-probe still
outputs its warning.
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
$ perf-probe -L sched.c
is currently allowed but not documented.
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It also removes some superflous parentheses.
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The actual file used by 'perf probe -L sched.c' is reported in the ouput
of the command.
But it's simply displayed as it has been given to the command (simply
sched.c) which is too ambiguous to be really usefull since several
sched.c files can be found into the same project and we also don't know
which search path has been used.
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add new lines for error or debug messages, change dwarf related words to more
generic words (or just removed).
Cc: [email protected]
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Steven Rostedt <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The symfs argument allows analysis of perf.data file using a locally accessible
filesystem tree with debug symbols - e.g., tree created during image builds,
sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
with an OS tree can be analyzed from anywhere without the need to populate a
local data store with build-ids.
Commiter notes:
o Fixed up symfs="/" variants handling.
o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
outside the symfs directory.
LKML-Reference: <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't than we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.
While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.
This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fallback, this will print
out a warning if report -D was invoked.
This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in it's place.
Acked-by: Thomas Gleixner <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ian Munsie <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This was broken since link(2) doesn't dereference symbolic
links. Instead 'filename' becomes a symbolic link to the same file
that 'name' refers to.
This had the bad effect to create dangling symlinks in the case that
even can't be removed with perf-buildid-cache(1).
LKML-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fail if the kernel image contains no symbol, allowing using other images
in the vmlinux search path that may have a usable symtab.
Acked-by: Masami Hiramatsu <[email protected]>
Cc: [email protected]
Cc: Francis Moreau <[email protected]>
Cc: Franck Bui-Huu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LPU-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Users were not being able to have the explicitely specified vmlinux
pathname used, instead a search on the vmlinux path was always being
made.
Reported-by: Francis Moreau <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: [email protected]
Cc: Francis Moreau <[email protected]>
Cc: Franck Bui-Huu <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
LPU-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Merge reason: We want to apply a dependent patch.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Since we check at the beginning of the callers, no need to ask if
dump_trace is set multiple times.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Thomas Gleixner <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simplify further.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simplify the code a bit.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Preparatory patch for ordered perf report -D
Acked-by: Ian Munsie <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Preparatory patch for ordered output of perf report -D
Acked-by: Ian Munsie <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Preparatory patch for ordered output of perf report -D.
Acked-by: Ian Munsie <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The dump code used by perf report -D is scattered all over the place.
Move it to separate functions.
Acked-by: Ian Munsie <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
If the event has no timestamp assigned then the parse code sets it to
~0ULL which causes the ordering code to enqueue it at the end.
Process it right away.
Reported-by: Ian Munsie <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
event__name[] is missing an entry for PERF_RECORD_FINISHED_ROUND, but we
happily access the array from the dump code.
Make event__name[] static and provide an accessor function, fix up all
callers and add the missing string.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is useful for analyzing a perf data file on a different system than
the one data was collected on and still include symbols from loaded
kernel modules in the output.
Commiter note: Updated the man page accordingly.
LKML-Reference: <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There were a few stray calloc()'s and malloc()'s which were not having
their return values checked for success.
As the calling code either already coped with failure or didn't actually
care we just return -ENOMEM at that point.
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Chris Samuel <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now that we have timestamps on FORK, EXIT, COMM, MMAP events we can
sort everything in time order. This fixes the following observed
problem:
mmap(file1) -> pagefault() -> munmap(file1)
mmap(file2) -> pagefault() -> munmap(file2)
Resulted in decoding both pagefaults in file2 because the file1 map
was already replaced by the file2 map when the map address was
identical.
With all events sorted we decode both pagefaults correctly.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add new macro OPT_CALLBACK_DEFAULT_NOOPT for parse_options.
It enables to pass the default value (opt->defval) to the callback function
processing options require no argument.
Reviewed-by: Masami Hiramatsu <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Akihiro Nagai <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the event that a DSO has not been identified, just print out [unknown]
instead of the instruction pointer as we previously were doing, which is pretty
meaningless for a shared object (at least to the users perspective).
The IP we print out is fairly meaningless in general anyway - it's just one
(the first) of the many addresses that were lumped together as unidentified,
and could span many shared objects and symbols. In reality if we see this
[unknown] output then the report -D output is going to be more useful anyway as
we can see all the different address that it represents.
If we are printing the symbols we are still going to see this IP in that column
anyway since they shouldn't resolve either.
This patch also changes the symbol address printouts so that they print out 0x
before the address, are left aligned, and changes the %L format string (which
relies on a glibc bug) to %ll.
Before:
74.11% :3259 4a6c [k] 4a6c
After:
74.11% :3259 [unknown] [k] 0x4a6c
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Ian Munsie <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:
$ perf record -aT
$ perf report -D | grep PERF_RECORD_
<SNIP>
3 5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
3 5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
3 5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
3 5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
3 5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
3 5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
3 5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
3 5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
3 5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
3 5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
3 5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
3 5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
3 5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
3 5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
<SNIP>
First column is the cpu and the second the timestamp.
That way we can investigate problems in the event stream.
If the new perf binary is run on an older kernel, it will disable this feature
automatically.
Tested-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Ian Munsie <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Frédéric Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.
This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.
Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.
There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.
Tested-by: Thomas Gleixner <[email protected]>
Reviewed-by: Thomas Gleixner <[email protected]>
Acked-by: Ian Munsie <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
Cc: Frédéric Weisbecker <[email protected]>
Cc: Ian Munsie <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Merge reason: This is an older commit under testing that was not pushed yet - merge it.
Also fix up the merge in command-list.txt.
Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Tom Zanussi <[email protected]>
|
|
There are number of issues that prevent the use of multiple tracepoint events
being specified in a -e/--event switch, separated by commas.
For example, perf stat -e irq:irq_handler_entry,irq:irq_handler_exit ... fails
because the tracepoint event parsing code doesn't recognize the comma separator
properly.
This patch corrects those issues.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Julia Lawall <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Reported-by: Michael Ellerman <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Corey Ashford <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
No need to check that many times if debug_trace is on.
Cc: Frédéric Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Stephane Eranian <[email protected]>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The ordered sample code allocates singular reference objects struct
sample_queue which have 48byte size on 64bit and 20 bytes on 32bit. That's
silly. Allocate ~64k sized chunks and hand them out.
Performance gain: ~ 15%
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When the sample queue is flushed we free the sample reference objects. Though
we need to malloc new objects when we process further. Stop the malloc/free
orgy and cache the already allocated object for resuage. Only allocate when
the cache is empty.
Performance gain: ~ 10%
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Profiling perf with perf revealed that a large part of the processing time is
spent in malloc/memcpy/free in the sample ordering code. That code copies the
data from the mmap into malloc'ed memory. That's silly. We can keep the mmap
and just store the pointer in the queuing data structure. For 64 bit this is
not a problem as we map the whole file anyway. On 32bit we keep 8 maps around
and unmap the oldest before mmaping the next chunk of the file.
Performance gain: 2.95s -> 1.23s (Faktor 2.4)
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On 64bit we can map the whole file in one go, on 32bit we can at least map
32MB and not map/unmap tiny chunks of the file.
Base the progress bar on 1/16 of the data size.
Preparatory patch to get rid of the malloc/memcpy/free of trace data.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
No need to check twice.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The progress bar is changed when the file offset changes. This happens only
when the next mmap is done. No need to call ui_progress_update() for every
event.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Replace the pseudo C++ self argument with session and give the mmap related
variables a sensible name. shift is a complete misnomer - it took me several
rounds of cursing to figure out that it's not a shift value.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There is no reason to use a struct sample_event pointer in struct sample_queue
and type cast it when flushing the queue.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The homebrewn sort algorithm fails to sort in time order. One of the problem
spots is that it fails to deal with equal timestamps correctly.
My first gut reaction was to replace the fancy list with an rbtree, but the
performance is 3 times worse.
Rewrite it so it works.
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This primarily fixes perf-report, which didn't report the correct type
of event if perf-record was called to record one event different from
'cycles':
$ perf record -e instructions true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.007 MB perf.data (~295 samples) ]
$ perf report | head -n1
# Events: 7 cycles
LPU-Reference: <[email protected]>
Signed-off-by: Franck Bui-Huu <[email protected]>
|
|
On ARM, module symbol start address is ahead of kernel symbol start address, so
we can't suppose that the start address of kernel map always is zero, otherwise
may cause incorrect .start and .end of kernel map (caused by fixup) when there
are modules loaded, then map_groups__find may return incorrect map for symbol
query.
This patch always figures out the start address of kernel map from
/proc/kallsyms if the file is available, so fix the issues on ARM for module
loaded case.
This patch fixes the following issues on ARM when modules are loaded:
- vmlinux symbol can't be found by kallsyms maps doing 'perf test'
- module symbols are parsed mistakenlly when doing 'perf top'/'perf report'
Cc: Ian Munsie <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <20101125192725.62d31b42@tom-lei>
Signed-off-by: Ming Lei <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
On ARM, module addresss space is ahead of kernel space, so the module
symbols are handled before kernel symbol in dso__split_kallsyms, then
was causing one map to be created for each kernel symbol.
Reported-by: Ming Lei <[email protected]>
Tested-by: Ming Lei <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Ming Lei <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
LKML-Reference: <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|