Age | Commit message (Collapse) | Author | Files | Lines |
|
Move the option parameter out of the parse_events function
and add a new parse_events_option function instead.
The option parameter is used only to carry the "struct perf_evlist"
pointer for chaining new events. Removing it lets us
call parse_events from other places without using the
option parameter.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Non-callchain path is using al.addr which prints as:
openssl 14564 17672.003587: 7862d _x86_64_AES_encrypt_compact
This should be sample->ip to print as:
openssl 14564 17672.003587: 3f7867862d _x86_64_AES_encrypt_compact
Signed-off-by: David Ahern <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.
With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.
-v2:
- changed the existing perf_event__attr_swap() to swap only elements
of perf_event_attr and exported it for use in swapping the
attributes in the file header
- updated swap_ops used for processing events
Signed-off-by: David Ahern <[email protected]>
Acked-by: Frederic Weisbecker <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
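To illustrate why the two leading __u32 fields have to be swapped individually, here is a minimal Python sketch. It assumes a big-endian writer and a little-endian reader; the helper names and the field values 1 and 72 are made up for illustration and are not taken from the perf sources.

```python
import struct


def bswap32(value):
    """Reverse the byte order of a 32-bit unsigned value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def bswap64(value):
    """Reverse the byte order of a 64-bit unsigned value."""
    return int.from_bytes(value.to_bytes(8, "little"), "big")


def swap_fields_individually(raw):
    """Read two u32 fields in host (little-endian) order, then fix each one."""
    first, second = struct.unpack("<II", raw)
    return bswap32(first), bswap32(second)


def swap_as_one_u64(raw):
    """Treat the same 8 bytes as one u64: the two fields end up exchanged."""
    (word,) = struct.unpack("<Q", raw)
    swapped = struct.pack("<Q", bswap64(word))
    return struct.unpack("<II", swapped)


# Hypothetical header as a big-endian machine would have written it.
raw_header = struct.pack(">II", 1, 72)
```

Swapping the 8 bytes as a single 64-bit word restores the byte order inside each field but also exchanges the two fields, which is why each __u32 needs its own swap.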
|
|
Add "node" as a simple alias for NODE cache events.
The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused a segfault, like:
# ./perf stat -e krava ls
The hw_cache/hw_cache_op/hw_cache_result arrays need to follow
the PERF_COUNT_HW_CACHE_*MAX enums. Use those MAX values as the sizes
of the arrays, so that a possible omission in the future will not lead to
a segfault.
Also allow read/write/prefetch as operations for node cache
events.
Signed-off-by: Jiri Olsa <[email protected]>
Acked-by: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions.
If the user gives the path of a module with the -m option, perf-probe
assumes the module is offline.
This feature works with --add, --funcs, and --vars.
E.g)
# perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
-a "extent_io_init:5 extent_state_cache"
Add new events:
probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)
You can now use it on all perf tools, such as:
perf record -e probe:extent_io_init_1 -aR sleep 1
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Add the probed module name and ":" in front of the function name
if the -m module option is given. As a result, the symbol
name passed to kprobe-tracer becomes MODULE:FUNCTION,
so that kallsyms can resolve it correctly as a symbol in the
module.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072745.6528.26416.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Introduce debuginfo to encapsulate dwarf information.
This new object allows us to reuse and expand debuginfo easily.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072739.6528.12438.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Move dwarf library related routines to dwarf-aux.{c,h}.
This includes several minor changes.
- Add simple documents for each API.
- Rename die_find_real_subprogram() to die_find_realfunc()
- Rename line_walk_handler_t to line_walk_callback_t.
- Minor cleanups.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072727.6528.57647.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Since there are dwarf_bitsize, dwarf_bitoffset and dwarf_bytesize
defined in libdw, we don't need die_get_bit_size, die_get_bit_offset
and die_get_byte_size anymore.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072721.6528.2747.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Since strtailcmp() is generic enough, it should be defined in string.c.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Link: http://lkml.kernel.org/r/20110627072715.6528.10677.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
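As a rough illustration of what a tail-comparison helper of this kind does, the sketch below compares two strings from their ends and returns 0 when the whole of one string matches the tail of the other. It is a Python approximation written from the helper's name and typical use (matching a short file name against a longer path), not a copy of the C code.

```python
def strtailcmp(s1, s2):
    """Compare two strings from the tail.

    Returns 0 if the whole of either string equals the tail of the other,
    otherwise the difference of the first mismatching characters.
    """
    i1, i2 = len(s1), len(s2)
    while i1 > 0 and i2 > 0:
        i1 -= 1
        i2 -= 1
        if s1[i1] != s2[i2]:
            return ord(s1[i1]) - ord(s2[i2])
    return 0
```

Nothing in such a helper depends on probe handling, which is consistent with moving it into a general string file.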
|
|
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both the failed and the found cases, it should be named "END"
instead of "FOUND" to avoid confusion.
Signed-off-by: Masami Hiramatsu <[email protected]>
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt <[email protected]>
|
|
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded. This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file. It also simplifies the code somewhat.
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sonny Rao <[email protected]>
Signed-off-by: Michael Neuling <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Steven Rostedt <[email protected]>
|
|
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.
This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.
The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.
v2: Incorporate suggestions from David Ahern:
- Added -c to perf script
- Check that SAMPLE_CPU is set when -c is used
- Update documentation
v3: Create perf_session__cpu_bitmap()
Signed-off-by: Anton Blanchard <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <[email protected]>
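The idea of turning a CPU list into a bitmap for fast lookup can be sketched as follows. This is a minimal Python illustration, assuming a list syntax such as "0-2,5"; the function names are hypothetical and do not mirror cpu_map__new or perf_session__cpu_bitmap().

```python
def parse_cpu_list(spec):
    """Expand a string like "0-2,5" into a list of CPU numbers."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            cpus.extend(range(int(low), int(high) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_bitmap(cpus):
    """Set one bit per selected CPU."""
    bitmap = 0
    for cpu in cpus:
        bitmap |= 1 << cpu
    return bitmap


def sample_selected(bitmap, cpu):
    """Constant-time check whether a sample's CPU was requested."""
    return bool((bitmap >> cpu) & 1)
```

Each sample then needs only one bit test instead of a search through the list of requested CPUs.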
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core
|
|
Merge reason: Pick up the latest fixes.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and a second time
if we request a parent filter (-p foo).
We'll have only one parent sort dimension in the end, but this
allows overriding the default parent filter with the one given in the "-p"
option. The goal of this is to prepare for allowing the use of
"-s parent" and "-p foo" at the same time, i.e. sorting by filtered
parent.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Sam Liao <[email protected]>
|
|
As in the newt UI, don't display entries that have been marked
as ignored.
The practical effect of this is to make parent
filtering actually work. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:
# Overhead Command Shared Object Symbol
# ........ ........... ................. ............
#
^A
|
--- __lock_acquire
|
|--95.97%-- lock_acquire
| |
| |--30.75%-- _raw_spin_lock
Discard these from the stdio display.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Sam Liao <[email protected]>
|
|
These are probably some old leftovers.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Sam Liao <[email protected]>
|
|
These don't need to be globally visible.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Sam Liao <[email protected]>
|
|
Add a "caller/callee" option to support an inverted butterfly report.
In the inverted report (with the caller option), the call graph starts
from the callee's ancestor. Users can use this view to find the system's
performance bottlenecks in a sysprof-like view. Using this option
with a specified sort order such as pid gives a high-level view of call
graph statistics.
Also add a "-G" alias for the inverted call graph.
Signed-off-by: Sam Liao <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: David Ahern <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
rcu: Move RCU_BOOST #ifdefs to header file
rcu: use softirq instead of kthreads except when RCU_BOOST=y
rcu: Use softirq to address performance regression
rcu: Simplify curing of load woes
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
kbuild: Call depmod.sh via shell
perf: clear out make flags when calling kernel make kernelver
|
|
Merge reason: add the latest fixes.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS, otherwise they may
trigger additional output which pollutes the contents.
Signed-off-by: Andy Whitcroft <[email protected]>
Signed-off-by: Michal Marek <[email protected]>
|
|
Commit a26ac2455ffcf3 (rcu: move TREE_RCU from softirq to kthread)
introduced a performance regression. In an AIM7 test, this commit degraded
performance by about 40%.
The commit runs RCU callbacks in a kthread instead of softirq. We observed
a high rate of context switches caused by this. Our test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switches per second,
caused by RCU's per-CPU kthread. A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.
Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler. Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling. (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime. And "the meantime" might well be forever.)
This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work. RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case. This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.
Signed-off-by: Shaohua Li <[email protected]>
Tested-by: "Alex,Shi" <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
perf: Use make kernelversion instead of parsing the Makefile
kbuild: Hack for depmod not handling X.Y versions
kbuild: Move depmod call to a separate script
kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL
kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
kbuild: silence Nothing to be done for 'all' message
|
|
Cc: Peter Zijlstra <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Michal Marek <[email protected]>
|
|
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:
# ./python/twatch.py
Traceback (most recent call last):
File "./python/twatch.py", line 41, in <module>
main()
File "./python/twatch.py", line 32, in main
event = evlist.read_on_cpu(cpu)
RuntimeError: more argument specifiers than keyword list entries
Hence, add cpu to the name list.
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fixes two more cases where the python binding would not load:
. Not finding die(), which it shouldn't anyway: it is not good to just stop the
world because some particular perf.data file is invalid, so just propagate
the error to the caller.
. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
where it belongs, as most cases are moving to operate on an evsel object.
One of the fixed problems:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.
Fixes this problem:
[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]#
This matters because we want to keep the set of objects linked into the
python binding (and, in the future, into a shared library) minimal.
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So far we avoided having to link debug.o in the python binding; keep it
that way by not using ui__warning() in evlist.c.
Cc: Frederic Weisbecker <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Resolve to a function or variable if possible and if the sym option is
enabled.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option. This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains <not counted>
for those events, which makes it unclear which events are supported
but not counted and which are not supported at all.
Before:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
0.571283 task-clock # 0.001 CPUs utilized
1 context-switches # 0.002 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.275 M/sec
1,037,707 cycles # 1.816 GHz
<not counted> stalled-cycles-frontend
<not counted> stalled-cycles-backend
654,499 instructions # 0.63 insns per cycle
136,129 branches # 238.286 M/sec
<not counted> branch-misses
<not counted> L1-dcache-loads
<not counted> L1-dcache-load-misses
<not counted> LLC-loads
<not counted> LLC-load-misses
<not counted> L1-icache-loads
<not counted> L1-icache-load-misses
<not counted> dTLB-loads
<not counted> dTLB-load-misses
<not counted> iTLB-loads
<not counted> iTLB-load-misses
<not counted> L1-dcache-prefetches
<not counted> L1-dcache-prefetch-misses
1.001004836 seconds time elapsed
After:
perf stat -ddd -- sleep 1
Performance counter stats for 'sleep 1':
1.350326 task-clock # 0.001 CPUs utilized
2 context-switches # 0.001 M/sec
0 CPU-migrations # 0.000 M/sec
157 page-faults # 0.116 M/sec
11,986 cycles # 0.009 GHz
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
496,986 instructions # 41.46 insns per cycle
138,065 branches # 102.246 M/sec
7,245 branch-misses # 5.25% of all branches
<not counted> L1-dcache-loads
<not counted> L1-dcache-load-misses
<not counted> LLC-loads
<not counted> LLC-load-misses
<not counted> L1-icache-loads
<not counted> L1-icache-load-misses
<not counted> dTLB-loads
<not counted> dTLB-load-misses
<not counted> iTLB-loads
<not counted> iTLB-load-misses
<not counted> L1-dcache-prefetches
<not supported> L1-dcache-prefetch-misses
1.002397333 seconds time elapsed
v1->v2:
changed supported type from int to bool
v2->v3
fixed vertical alignment of new struct element
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
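The distinction between the two labels can be sketched as below. This is a minimal Python illustration, assuming a per-counter "supported" flag that is cleared when opening the event fails with ENOENT or ENOSYS; the class and function names are hypothetical and the column width is arbitrary.

```python
import errno

# Errors treated as "event not supported" in this sketch (assumption).
UNSUPPORTED_ERRNOS = {errno.ENOENT, errno.ENOSYS}


class Counter:
    def __init__(self, name):
        self.name = name
        self.supported = True
        self.value = None  # None means no count was obtained


def note_open_failure(counter, err):
    """Record an open failure; return True if the run may continue."""
    if err in UNSUPPORTED_ERRNOS:
        counter.supported = False
        return True
    return False


def format_counter(counter):
    """Render one output line, telling the two 'no value' cases apart."""
    if counter.value is None:
        label = "<not counted>" if counter.supported else "<not supported>"
        return f"{label:>18} {counter.name}"
    return f"{counter.value:>18,} {counter.name}"
```

A counter that could not be opened is then reported differently from one that was opened but never scheduled.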
|
|
The list of argument names for a method only needs to be NULL terminated
once. Remove the second terminators.
Cc: David Ahern <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_evsel__alloc_fd allocates an array of file descriptors with the
memory initialized to 0. The array has dimensions for cpus and threads.
Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread
combination. If the open fails for any of the cpus or threads then the fd's
for this event are closed and the fd entry in the array is set to -1. Now,
if the first attempt fails for the event (e.g., the event is not supported)
the remaining dimensions (cpu > 0 and thread > 0) are not touched and left
at the initialized value of 0.
builtin-stat catches ENOENT and ENOSYS failures and allows the command to
continue. The end result is that stat attempts to read from an fd of 0 which
of course is stdin and so the command hangs until you type ctrl-D.
Resolve by initializing the array to -1 since an fd < 0 is already
handled.
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
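The failure mode and the fix can be reproduced with a small model of the fd array. This is a minimal Python sketch under stated assumptions: the helper names are hypothetical, the "opener" callback stands in for sys_perf_event_open, and the unwinding is simplified to marking earlier slots as -1.

```python
def alloc_fds(ncpus, nthreads, fill):
    """Allocate a cpus x threads table of file descriptors."""
    return [[fill] * nthreads for _ in range(ncpus)]


def open_event(fds, opener):
    """Open one fd per (cpu, thread); stop and unwind on the first failure."""
    for cpu in range(len(fds)):
        for thread in range(len(fds[cpu])):
            fd = opener(cpu, thread)
            fds[cpu][thread] = fd
            if fd < 0:
                # Mark every slot opened before the failure as closed.
                for c in range(cpu + 1):
                    limit = thread if c == cpu else len(fds[c])
                    for t in range(limit):
                        fds[c][t] = -1
                return fd
    return 0


def readable_fds(fds):
    """Descriptors a reader would try to use: anything that is not negative."""
    return [fd for row in fds for fd in row if fd >= 0]
```

With a zero-filled table and an opener that fails on the first call, the untouched slots still hold 0, so a reader would end up reading from stdin. Filling the table with -1 leaves nothing to read.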
|
|
Where /usr/include/linux/const.h is not present, e.g. RHEL5.
Reported-by: Srikar Dronamraju <[email protected]>
Cc: Srikar Dronamraju <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Perf uses /proc/modules to figure out where kernel modules are loaded.
With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.
So check if kptr_restrict is non-zero and don't generate the synthetic
PERF_RECORD_MMAP events for them.
Warn the user about it in perf record and in perf report.
In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.
Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.
A restricted /proc/kallsyms doesn't go to the buildid cache anymore.
Example:
[acme@emilia ~]$ perf record -F 100000 sleep 1
WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
/proc/sys/kernel/kptr_restrict.
Samples in kernel functions may not be resolved if a suitable vmlinux file is
not found in the buildid cache or in the vmlinux path.
Samples in kernel modules won't be resolved at all.
If some relocation was applied (e.g. kexec) symbols may be misresolved even
with a suitable vmlinux or kallsyms file.
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
[acme@emilia ~]$
[acme@emilia ~]$ perf report --stdio
Kernel address maps (/proc/{kallsyms,modules}) were restricted,
check /proc/sys/kernel/kptr_restrict before running 'perf record'.
If some relocation was applied (e.g. kexec) symbols may be misresolved.
Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. .....................
#
20.24% sleep [kernel.kallsyms] [k] page_fault
20.04% sleep [kernel.kallsyms] [k] filemap_fault
19.78% sleep [kernel.kallsyms] [k] __lru_cache_add
19.69% sleep ld-2.12.so [.] memcpy
14.71% sleep [kernel.kallsyms] [k] dput
4.70% sleep [kernel.kallsyms] [k] flush_signal_handlers
0.73% sleep [kernel.kallsyms] [k] perf_event_comm
0.11% sleep [kernel.kallsyms] [k] native_write_msr_safe
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$
This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).
If we remove that file from the vmlinux path:
[root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
/lib/modules/2.6.39-rc7+/build/vmlinux.OFF
[acme@emilia ~]$ perf report --stdio
[kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
not found, continuing without symbols
Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
/proc/sys/kernel/kptr_restrict before running 'perf record'.
As no suitable kallsyms nor vmlinux was found, kernel samples can't be
resolved.
Samples in kernel modules can't be resolved as well.
# Events: 13 cycles
#
# Overhead Command Shared Object Symbol
# ........ ....... ................. ......
#
80.31% sleep [kernel.kallsyms] [k] 0xffffffff8103425a
19.69% sleep ld-2.12.so [.] memcpy
#
# (For a higher level overview, try: perf report --sort comm,dso)
#
[acme@emilia ~]$
Reported-by: Stephane Eranian <[email protected]>
Suggested-by: David Miller <[email protected]>
Cc: Dave Jones <[email protected]>
Cc: David Miller <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Mike Galbraith <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Tom Zanussi <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
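The check described in this commit can be sketched as follows. This is a simplified Python illustration, assuming that any non-zero value in /proc/sys/kernel/kptr_restrict means the address maps are restricted for the current user; the function names are hypothetical and the real tool may apply further conditions.

```python
def kptr_restricted(path="/proc/sys/kernel/kptr_restrict"):
    """Return True if the sysctl file holds a non-zero value."""
    try:
        with open(path) as sysctl:
            return sysctl.read().strip() not in ("", "0")
    except OSError:
        # File missing or unreadable: treat as unrestricted in this sketch.
        return False


def should_synthesize_module_mmaps(restricted):
    """Skip synthetic module mmap events when addresses would be zeroed."""
    return not restricted
```

A caller would warn the user and skip the module events whenever kptr_restricted() returns True.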
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix sample type size calculation in 32 bits archs
profile: Use vzalloc() rather than vmalloc() & memset()
|
|
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.
Fix the constant used for the shift.
This fixes a 32-bit perf top failure reported by Eric Dumazet:
Can't parse sample, err = -14
Can't parse sample, err = -14
...
Reported-and-tested-by: Eric Dumazet <[email protected]>
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
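For illustration only, a minimal sketch of the kind of bit counting this message describes. It is not the code from the patch: the helper names count_mask_bits() and sample_size_bytes() are hypothetical, and the assumption that each selected bit accounts for one 64-bit field is made purely for the example. The point it shows is that the constant being shifted has to be 64 bits wide, otherwise bits above the low 32 cannot be tested on a 32-bit arch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bits set in a 64-bit mask. */
static int count_mask_bits(uint64_t mask)
{
	int bits = 0;
	int i;

	for (i = 0; i < 64; i++) {
		/*
		 * The shifted constant must be 64 bits wide. With "1UL << i",
		 * unsigned long is only 32 bits on typical 32-bit archs, so a
		 * shift by i >= 32 is undefined and the upper half of the
		 * mask is never tested reliably.
		 */
		if (mask & (1ULL << i))
			bits++;
	}

	return bits;
}

/*
 * Hypothetical helper: size in bytes of a record that carries one
 * 64-bit field per bit set in the mask (an assumption of this sketch).
 */
static size_t sample_size_bytes(uint64_t sample_type_mask)
{
	return (size_t)count_mask_bits(sample_type_mask) * sizeof(uint64_t);
}
```

Called as count_mask_bits(1ULL << 40), the loop still finds the bit because the test constant is an unsigned long long on every arch.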
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf tools: Fix sample size bit operations
perf tools: Fix ommitted mmap data update on remap
watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
watchdog: Disable watchdog when thresh is zero
watchdog: Only disable/enable watchdog if neccessary
watchdog: Fix rounding bug in get_sample_period()
perf tools: Propagate event parse error handling
perf tools: Robustify dynamic sample content fetch
perf tools: Pre-check sample size before parsing
perf tools: Move evlist sample helpers to evlist area
perf tools: Remove junk code in mmap size handling
perf tools: Check we are able to read the event size on mmap
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
b43: fix comment typo reqest -> request
Haavard Skinnemoen has left Atmel
cris: typo in mach-fs Makefile
Kconfig: fix copy/paste-ism for dell-wmi-aio driver
doc: timers-howto: fix a typo ("unsgined")
perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
treewide: fix a few typos in comments
regulator: change debug statement be consistent with the style of the rest
Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
audit: acquire creds selectively to reduce atomic op overhead
rtlwifi: don't touch with treewide double semicolon removal
treewide: cleanup continuations and remove logging message whitespace
ath9k_hw: don't touch with treewide double semicolon removal
include/linux/leds-regulator.h: fix syntax in example code
tty: fix typo in descripton of tty_termios_encode_baud_rate
xtensa: remove obsolete BKL kernel option from defconfig
m68k: fix comment typo 'occcured'
arch:Kconfig.locks Remove unused config option.
treewide: remove extra semicolons
...
|
|
What we want is to count the number of bits in the mask,
not some other random operation written in the middle
of the night.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ]
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Commit eac9eacee16 "perf tools: Check we are able to read the event
size on mmap" brought a check to ensure we can read the size of the
event before dereferencing it, and do a remap otherwise to move the
buffer forward.
However, that remap was omitting all the necessary work to
update the new page offset and head, to unmap the previous
pages, and so on.
To fix this, gather all the code that fetches the event in a
separate helper which does all the necessary checks on the
header/event size and tells us whenever a remap is needed.
Signed-off-by: Frederic Weisbecker <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
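The helper described in this message could look roughly like the sketch below. This is an assumption-laden illustration, not the perf implementation: struct event_header, enum fetch_status and fetch_mapped_event() are hypothetical names, and the header layout (with a 16-bit size field that counts the header itself) is assumed for the example. It checks that the header is readable, then that the whole event fits in the mapped window, and reports when the caller has to remap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the header found at the start of every event. */
struct event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total event size in bytes, header included */
};

enum fetch_status {
	FETCH_OK,		/* header and event are fully inside the window */
	FETCH_NEED_REMAP,	/* window must be moved forward before reading */
	FETCH_BAD_SIZE,		/* size field is smaller than the header itself */
};

/*
 * Read the header of the event at offset "head" of a mapped window of
 * "mmap_size" bytes starting at "buf", copying it into "*hdr".
 */
static enum fetch_status
fetch_mapped_event(const unsigned char *buf, size_t mmap_size, size_t head,
		   struct event_header *hdr)
{
	/* Can we read the size of the event at all? */
	if (head > mmap_size || mmap_size - head < sizeof(*hdr))
		return FETCH_NEED_REMAP;

	/* Copy instead of casting, so "head" need not be aligned. */
	memcpy(hdr, buf + head, sizeof(*hdr));

	/* A size below the header size can never be valid. */
	if (hdr->size < sizeof(*hdr))
		return FETCH_BAD_SIZE;

	/* The header is readable but the event body crosses the window end. */
	if (hdr->size > mmap_size - head)
		return FETCH_NEED_REMAP;

	return FETCH_OK;
}
```

On FETCH_NEED_REMAP the caller would unmap the old pages, advance the page offset, adjust head relative to the new window, and call the helper again; keeping those steps in one place is what avoids a remap path that forgets part of the bookkeeping.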
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
Conflicts:
tools/perf/builtin-top.c
Semantic conflict:
util/include/linux/list.h # fix prefetch.h removal fallout
Signed-off-by: Ingo Molnar <[email protected]>
|