Commit graph

1341 commits

Author SHA1 Message Date
Stephane Eranian
777d1d71db perf tools: Fix error handling of unknown events
There was a problem with the parse_events() code not printing the
correct event name when an event was unknown and starting with an 'r'.
The source of the problem was the way raw notation was parsed.

Without the patch:
	$ perf stat -e retired_foo
	invalid event modifier: 'tired_foo'

With the patch:
	$ perf stat -e retired_foo
	invalid or unsupported event: 'retired_foo'

This also covers the case where the name of the event was not printed at
all when perf was linked with libpfm4.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110723021043.GA20178@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-18 07:21:13 -03:00
Stephane Eranian
cc2d86b04d perf evlist: Fix missing event name init for default event
When no event is given to perf record, perf top, a default event is
initialized (cycles). However, perf_evlist__add_default() was not
setting the symbolic name for the event. Perf top worked simply because
it was reconstructing the name from the event code. But it should not
have to do this. This patch initializes the evsel->name field properly.

This second version improves the code flow on the non error path.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110607161936.GA8163@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
[committer note: Use perf_evsel__delete() instead of plain free()]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-18 07:20:31 -03:00
Stephane Eranian
77e57297b4 perf list: Fix exit value
This patch fixes an issue with the exit value of perf list:

$ perf list; echo $?
129

perf list returns an error exit code even though there is no error.

There was a stray exit(129) in print_events(). This patch removes this
exit().

$ perf list; echo $?
0

$ perf list hw sw
  cpu-cycles OR cycles                               [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]
  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]

  cpu-clock                                          [Software event]
  task-clock                                         [Software event]
  page-faults OR faults                              [Software event]
  minor-faults                                       [Software event]
  major-faults                                       [Software event]
  context-switches OR cs                             [Software event]
  cpu-migrations OR migrations                       [Software event]
  alignment-faults                                   [Software event]
  emulation-faults                                   [Software event]
$ echo $?
0

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110523123917.GA31060@quad
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-18 07:19:15 -03:00
Masami Hiramatsu
3f4460a28f perf probe: Filter out redundant inline-instances
With gcc4.6, some instances of concrete inlined function looks redundant
and broken, because it appears inside of a concrete instance and its
call_file and call_line are same as the original abstruct's decl_file
and decl_line respectively.

e.g.
 [  d1aa]    subprogram
             external             (flag) Yes
             name                 (strp) "add_timer"
             decl_file            (data1) 2		;here is original
             decl_line            (data2) 847		;line and file
             prototyped           (flag) Yes
             inline               (data1) inlined (1)
             sibling              (ref4) [  d1c6]
...
 [ 11d84]    subprogram
             abstract_origin      (ref4) [  d1aa]	; concrete instance
             low_pc               (addr) .text+0x000000000000246f <add_timer>
             high_pc              (addr) .text+0x000000000000248b <mod_timer_pending>
             frame_base           (block1)               [   0] call_frame_cfa
             sibling              (ref4) [ 11dd9]
 [ 11d9f]      formal_parameter
               abstract_origin      (ref4) [  d1b9]
               location             (data4) location list [  701b]
 [ 11da8]      inlined_subroutine
               abstract_origin      (ref4) [  d1aa]	; redundant instance
               low_pc               (addr) .text+0x000000000000247e <add_timer+0xf>
               high_pc              (addr) .text+0x0000000000002480 <add_timer+0x11>
               call_file            (data1) 2		; call line and file
               call_line            (data2) 847		; are same as above

Those redundant instances leads unwilling results;

e.g. find probe points inside of functions even if we specify
a function entry as below;

$ perf probe -V add_timer
Available variables at add_timer
        @<add_timer+0>
                struct timer_list*      timer
        @<add_timer+15>
                (No matched variables)

So, this filters out those redundant instances based on call-site and
decl-site information.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110317.19900.59525.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:34:35 -03:00
Masami Hiramatsu
db0d2c6420 perf probe: Search concrete out-of-line instances
gcc 4.6 generates a concrete out-of-line instance when there is a
function which is implicitly inlined somewhere but also has its own
instance. The concrete out-of-line instance means that it has an
abstract origin of the function which is referred by not only
inlined-subroutines but also a concrete subprogram.

Since current dwarf_func_inline_instances() can find only instances of
inlined-subroutines, this introduces new die_walk_instances() to find
both of subprogram and inlined-subroutines.

e.g. without this,
Available variables at sched_group_rt_period
        @<cpu_rt_period_read_uint+9>
                struct task_group*      tg

perf probe failed to find actual subprogram instance of
sched_group_rt_period().

With this,

Available variables at sched_group_rt_period
        @<cpu_rt_period_read_uint+9>
                struct task_group*      tg
        @<sched_group_rt_period+0>
                struct task_group*      tg

Now it found the sched_group_rt_period() itself.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110311.19900.63997.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:32:10 -03:00
Masami Hiramatsu
f182e3e13c perf probe: Avoid searching variables in intermediate scopes
Fix variable searching logic to search one in inner than local scope or
global(CU) scope. In the other words, skip searching in intermediate
scopes.

e.g., in the following code,

int var1;

void inline infunc(int i)
{
    i++;   <--- [A]
}

void func(void)
{
   int var1, var2;
   infunc(var2);
}

At [A], "var1" should point the global variable "var1", however, if user
mis-typed as "var2", variable search should be failed. However, current
logic searches variable infunc() scope, global scope, and then func()
scope. Thus, it can find "var2" variable in func() scope. This may not
be what user expects.

So, it would better not search outer scopes except outermost (compile
unit) scope which contains only global variables, when it failed to find
given variable in local scope.

E.g.

Without this:
$ perf probe -V pre_schedule --externs > without.vars

With this:
$ perf probe -V pre_schedule --externs > with.vars

Check the diff:
$ diff without.vars with.vars
88d87
<               int     cpu
133d131
<               long unsigned int*      switch_count

These vars are actually in the scope of schedule(), the caller of
pre_schedule().

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110305.19900.94374.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:29:34 -03:00
Masami Hiramatsu
221d061182 perf probe: Fix to search local variables in appropriate scope
Fix perf probe to search local variables in appropriate local inlined
function scope. For example, pre_schedule() has only 2 local variables,
as below;

$ perf probe -L pre_schedule
<pre_schedule@/home/mhiramat/ksrc/linux-2.6/kernel/sched.c:0>
      0  static inline void pre_schedule(struct rq *rq, struct task_struct *prev)
         {
      2         if (prev->sched_class->pre_schedule)
      3                 prev->sched_class->pre_schedule(rq, prev);
         }

However, current perf probe shows 4 local variables on pre_schedule(),
because it searches variables in the caller(schedule()) scope.

$ perf probe -V pre_schedule
Available variables at pre_schedule
        @<schedule+445>
                int     cpu
                long unsigned int*      switch_count
                struct rq*      rq
                struct task_struct*     prev

This patch fixes this issue by searching variables in the local scope of
the instance of inlined function. Here is the result.

$ perf probe -V pre_schedule
Available variables at pre_schedule
        @<schedule+445>
                struct rq*      rq
                struct task_struct*     prev

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110259.19900.85664.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:28:45 -03:00
Masami Hiramatsu
36c0c588b9 perf probe: Fix to walk all inline instances
Fix line-range collector to walk all instances of inlined function,
because some execution paths can be optimized out depending on the
function argument of instances.

E.g.)
inline_func (arg) {
	if (arg)
		do_something;
	else
		do_another;
}

func_A() {
	inline_func(1)
}

func_B() {
	inline_func(0)
}

In this case, func_A may have only do_something code and func_B may have
only do_another.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110247.19900.93702.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:25:38 -03:00
Masami Hiramatsu
b0e9cb2802 perf probe: Fix to search nested inlined functions in CU
Fix perf probe to walk through the lines of all nested inlined function
call sites and declared lines when a whole CU is passed to the line
walker.

The die_walk_lines() can have two different type of DIEs, subprogram (or
inlined-subroutine) DIE and CU DIE.

If a caller passes a subprogram DIE, this means that the walker walk on
lines of given subprogram. In this case, it just needs to search on
direct children of DIE tree for finding call-site information of inlined
function which directly called from given subprogram.

On the other hand, if a caller passes a CU DIE to the walker, this means
that the walker have to walk on all lines in the source files included
in given CU DIE. In this case, it has to search whole DIE trees of all
subprograms to find the call-site information of all nested inlined
functions.

Without this patch:

$ perf probe --line kernel/cpu.c:151-157
</home/mhiramat/ksrc/linux-2.6/kernel/cpu.c:151>

         static int cpu_notify(unsigned long val, void *v)
         {
    154         return __cpu_notify(val, v, -1, NULL);
         }

With this:
$ perf probe --line kernel/cpu.c:151-157
</home/mhiramat/ksrc/linux-2.6/kernel/cpu.c:151>

    152  static int cpu_notify(unsigned long val, void *v)
         {
    154         return __cpu_notify(val, v, -1, NULL);
         }

As you can see, --line option with source line range shows the declared
lines as probe-able.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110241.19900.34994.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:23:39 -03:00
Masami Hiramatsu
a128405c6b perf probe: Fix line walker to check CU correctly
Fix line walker to check whether a given DIE is CU or not.

Actually this function accepts CU, subprogram and inlined_subroutine
DIEs.

Without this fix, perf probe always fails to analyze lines on inlined
functions;

$ perf probe -L pre_schedule
Debuginfo analysis failed. (-2)
  Error: Failed to show lines. (-2)

This fixes that bug, as below.

$ perf probe -L pre_schedule
<pre_schedule@/home/mhiramat/ksrc/linux-2.6/kernel/sched.c:0>
      0  static inline void pre_schedule(struct rq *rq, struct task_struct *prev
         {
      2         if (prev->sched_class->pre_schedule)
      3                 prev->sched_class->pre_schedule(rq, prev);
         }

         /* rq->lock is NOT held, but preemption is disabled */

Changes from v1:
 - Update against current tip tree.(Fix dwarf-aux.c)

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110235.19900.20614.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:22:46 -03:00
Masami Hiramatsu
8afa2a707d perf probe: Fix a memory leak for scopes array
Fix a memory leak for scopes array when it finds a variable in the
global scope.

Reviewed-by: Pekka Enberg <penberg@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: yrl.pp-manager.tt@hitachi.com
Link: http://lkml.kernel.org/r/20110811110229.19900.63019.stgit@fedora15
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 09:21:15 -03:00
Vasiliy Kulikov
e9b52ef222 perf: fix temporary file ownership check
A file in /tmp/ might be a symlink, so lstat() should be used instead of
stat().

Acked-by: Pekka Enberg <penberg@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20110811205537.GA22864@albatros
Signed-off-by: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-12 08:28:17 -03:00
Jiri Olsa
f57b05ed53 perf report: Use properly build_id kernel binaries
If we bring the recorded perf data together with kernel binary from another
machine using:

	on server A:
	perf archive

	on server B:
	tar xjvf perf.data.tar.bz2 -C ~/.debug

the build_id kernel dso is not properly recognized during the "perf report"
command on server B.

The reason is, that build_id dsos are added during the session initialization,
while the kernel maps are created during the sample event processing.

The machine__create_kernel_maps functions ends up creating new dso object for
kernel, but it does not check if we already have one added by build_id
processing.

Also the build_id reading ABI quirk added in commit:

 - commit b25114817a
   perf build-id: Add quirk to deal with perf.data file format breakage

populates the "struct build_id_event::pid" with 0, which
is later interpreted as DEFAULT_GUEST_KERNEL_ID.

This is not always correct, so it's better to guess the pid
value based on the "struct build_id_event::header::misc" value.

- Tested with data generated on x86 kernel version v2.6.34
  and reported back on x86_64 current kernel.
- Not tested for guest kernel case.

Note the problem stays for PERF_RECORD_MMAP events recorded by perf that
does not use proper pid (HOST_KERNEL_ID/DEFAULT_GUEST_KERNEL_ID). They are
misinterpreted within the current perf code. Probably there's not much we
can do about that.

Cc: Avi Kivity <avi@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Link: http://lkml.kernel.org/r/20110601194346.GB1934@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-11 08:58:03 -03:00
Arnaldo Carvalho de Melo
fc8ed7be73 perf top browser: Remove spurious helpline update
It will be immediately replaced in perf_top_browser__run.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-q7e2jzb44elqpkvdllk94x0i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-10 12:42:26 -03:00
Pekka Enberg
981c125269 perf symbols: Check '/tmp/perf-' symbol file ownership
The external symbol files are generated by JIT compilers, for example, but we
need to make sure they're ours before injecting them to 'perf report'.

Requested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1312919658-17158-1-git-send-email-penberg@kernel.org
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 15:23:08 -03:00
Arnaldo Carvalho de Melo
069e3725dd perf tools: Check $HOME/.perfconfig ownership
Just like we do already for perf.data files.

Requested-by: Ingo Molnar <mingo@elte.hu>
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Christian Ohm <chr.ohm@gmx.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qgokmxsmvppwpc5404qhyk7e@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-09 12:42:13 -03:00
Jiri Olsa
9941c96ad8 perf tools: Add support to install perf python extension
Adding install-python_ext target to install python extension related
files.  Installation directory is governed by python distutils package
and follows the DESTDIR variable settings.

Also moving python extension build output into '$(O)python_ext_build'
directory and making it configurable via PYTHON_EXTBUILD variable.

Keeping the '$(O)python/perf.so' file, so it could be used for testing
as of until now.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110722113307.GA1931@jolsa.brq.redhat.com
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 12:54:26 -03:00
Jonathan Nieder
aba8d05607 perf tools: do not look at ./config for configuration
In addition to /etc/perfconfig and $HOME/.perfconfig, perf looks for
configuration in the file ./config, imitating git which looks at
$GIT_DIR/config.  If ./config is not a perf configuration file, it
fails, or worse, treats it as a configuration file and changes behavior
in some unexpected way.

"config" is not an unusual name for a file to be lying around and perf
does not have a private directory dedicated for its own use, so let's
just stop looking for configuration in the cwd.  Callers needing
context-sensitive configuration can use the PERF_CONFIG environment
variable.

Requested-by: Christian Ohm <chr.ohm@gmx.net>
Cc: 632923@bugs.debian.org
Cc: Ben Hutchings <ben@decadent.org.uk>
Cc: Christian Ohm <chr.ohm@gmx.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110805165838.GA7237@elie.gateway.2wire.net
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:46:32 -03:00
Jovi Zhang
ce27a443d1 perf probe: Fix coredump introduced by probe module option
perf will coredump if the user doesn't give the "-m" option in probe
command, this patch fixes it.

[root@localhost perf]# ./perf probe --add='PROBE'
Segmentation fault (core dumped)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1311602888-2389-1-git-send-email-bookjovi@gmail.com
Signed-off-by: Jovi Zhang <bookjovi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-08-08 09:35:41 -03:00
Arnaldo Carvalho de Melo
3e9f45a7a4 perf python: Add PERF_RECORD_{LOST,READ,SAMPLE} routine tables
So those friggin "spurious" PERF_RECORD_MMAP events were actually a
brain fart copy'n'paste error in the python binding, doh. I.e. they
weren't MMAPs, just SAMPLEs.

Fix it by providing routines for these events instead of using the MMAP
ones.

Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-b0rc8y5jd03f9f11kftodvkm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-25 17:13:27 -03:00
Arnaldo Carvalho de Melo
4152ab377b perf evlist: Introduce 'disable' method
To remove the last case of access to the FD() macro outside the library.

Inspired by a patch by Borislav that moved the FD() macro to util.h, for
namespace concerns I rather preferred to constrain it to ev{sel,list}.c.

Cc: Borislav Petkov <bp@amd64.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qn893qsstcg366tkucu649qj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-25 11:06:19 -03:00
Han Pingtian
4f9bae351d perf buildid-cache: Zero out buffer of filenames when adding/removing buildid
The readlink() function doesn't append a null byte to buf. So we should
zero out buf with zalloc(). Or we'll see sometimes error like this:

[root@intel-s3e36-01]~# /usr/bin/perf buildid-cache -a /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko -v
Adding f64ba8efd5f53c7ad332fc17db1d21de309038e1 /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko: Ok
[root@intel-s3e36-01]~# /usr/bin/perf buildid-cache -r /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko -v
Removing f64ba8efd5f53c7ad332fc17db1d21de309038e1 /lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko: FAIL
/lib/modules/2.6.32-130.el6.x86_64/kernel/crypto/twofish_common.ko wasn't in the cache

The change in build_id_cache__add_s() is a defense.

Tested-by: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20110718031314.GA5802@hpt.nay.redhat.com
Signed-off-by: Han Pingtian <phan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-07-22 08:59:26 -03:00
Jiri Olsa
f120f9d51b perf tools: De-opt the parse_events function
Moving out the option parameter from parse_events function,
and adding new parse_events_option function instead.

The option parameter is used only to carry "struct perf_evlist"
pointer for chaining new events. Putting it away, enable us
to call parse_events from other places without using the
option parameter.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1310635534-4013-2-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:41:11 +02:00
David Ahern
adc4bf9955 perf script: Fix display of IP address for non-callchain path
Non-callchain path is using al.addr which prints as:
  openssl 14564 17672.003587:       7862d _x86_64_AES_encrypt_compact

This should be sample->ip to print as:
  openssl 14564 17672.003587:  3f7867862d _x86_64_AES_encrypt_compact

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Link: http://lkml.kernel.org/r/1306768587-15376-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 10:09:28 +02:00
David Ahern
eda3913bb7 perf tools: Fix endian conversion reading event attr from file header
The perf_event_attr struct has two __u32's at the top and
they need to be swapped individually.

With this change I was able to analyze a perf.data collected in a
32-bit PPC VM on an x86 system. I tested both 32-bit and 64-bit
binaries for the Intel analysis side; both read the PPC perf.data
file correctly.

-v2:
 - changed the existing perf_event__attr_swap() to swap only elements
   of perf_event_attr and exported it for use in swapping the
   attributes in the file header
 - updated swap_ops used for processing events

Signed-off-by: David Ahern <dsahern@gmail.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: acme@ghostprotocols.net
Cc: peterz@infradead.org
Cc: paulus@samba.org
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/1310754849-12474-1-git-send-email-dsahern@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:57:36 +02:00
Jiri Olsa
0111919da2 perf tools: Add missing 'node' alias to the hw_cache[] array
Add "node" as a simple alias for NODE cache events.

The addition of NODE cache events broke the parse_alias
function, so any mismatched event caused the segfault, like:

  # ./perf stat -e krava ls

The hw_cache/hw_cache_op/hw_cache_result arrays needs to follow
PERF_COUNT_HW_CACHE_*MAX enums. Adding those MAXs to be size
of those arrays, so possible ommision in future wil not lead to
segfault.

Adding read/write/prefetch as allowed operations for node cache
event.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: acme@redhat.com
Link: http://lkml.kernel.org/r/20110713205818.GB7827@jolsa.brq.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-21 09:54:51 +02:00
Masami Hiramatsu
14a8fd7cee perf probe: Support adding probes on offline kernel modules
Support adding probes on offline kernel modules. This enables
perf-probe to trace kernel-module init functions via perf-probe.
If user gives the path of module with -m option, perf-probe
expects the module is offline.
This feature works with --add, --funcs, and --vars.

E.g)
 # perf probe -m /lib/modules/`uname -r`/kernel/fs/btrfs/btrfs.ko \
   -a "extent_io_init:5 extent_state_cache"
 Add new events:
   probe:extent_io_init (on extent_io_init:5 with extent_state_cache)
   probe:extent_io_init_1 (on extent_io_init:5 with extent_state_cache)

 You can now use it on all perf tools, such as:

         perf record -e probe:extent_io_init_1 -aR sleep 1

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072751.6528.10230.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:25:12 -04:00
Masami Hiramatsu
190b57fcb9 perf probe: Add probed module in front of function
Add probed module name and ":" in front of function name
if -m module option is given. In the result, the symbol
name passed to kprobe-tracer becomes MODULE:FUNCTION,
so that kallsyms can solve it as a symbol in the module
correctly.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072745.6528.26416.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:19:08 -04:00
Masami Hiramatsu
ff74178350 perf probe: Introduce debuginfo to encapsulate dwarf information
Introduce debuginfo to encapsulate dwarf information.
This new object allows us to reuse and expand debuginfo easily.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072739.6528.12438.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:14:19 -04:00
Masami Hiramatsu
e0d153c690 perf-probe: Move dwarf library routines to dwarf-aux.{c, h}
Move dwarf library related routines to dwarf-aux.{c,h}.
This includes several minor changes.
- Add simple documents for each API.
- Rename die_find_real_subprogram() to die_find_realfunc()
- Rename line_walk_handler_t to line_walk_callback_t.
- Minor cleanups.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072727.6528.57647.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:10:17 -04:00
Masami Hiramatsu
bcfc082150 perf probe: Remove redundant dwarf functions
Since there are dwarf_bitsize, dwarf_bitoffset and dwarf_bytesize
defined in libdw, we don't need die_get_bit_size, die_get_bit_offset
and die_get_byte_size anymore.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072721.6528.2747.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:04:47 -04:00
Masami Hiramatsu
bad03ae476 perf probe: Move strtailcmp to string.c
Since strtailcmp() is enough generic, it should be defined in string.c.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/20110627072715.6528.10677.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 16:00:47 -04:00
Masami Hiramatsu
baad2d3e69 perf probe: Rename DIE_FIND_CB_FOUND to DIE_FIND_CB_END
Since die_find/walk* callbacks use DIE_FIND_CB_FOUND for
both of failed and found cases, it should be "END"
instead "FOUND" for avoiding confusion.

Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Reported-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Link: http://lkml.kernel.org/r/20110627072709.6528.45706.stgit@fedora15
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-15 15:55:57 -04:00
Sonny Rao
259032bfe3 perf: Robustify proc and debugfs file recording
While attempting to create a timechart of boot up I found perf didn't
tolerate modules being loaded/unloaded.  This patch fixes this by
reading the file once and then writing the size read at the correct
point in the file.  It also simplifies the code somewhat.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Sonny Rao <sonnyrao@chromium.org>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Link: http://lkml.kernel.org/r/10011.1310614483@neuling.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-14 15:53:01 -04:00
Anton Blanchard
5d67be97f8 perf report/annotate/script: Add option to specify a CPU range
Add an option to perf report/annotate/script to specify which
CPUs to operate on. This enables us to take a single system wide
profile and analyse each CPU (or group of CPUs) in isolation.

This was useful when profiling a multiprocess workload where the
bottleneck was on one CPU but this was hidden in the overall
profile. Per process and per thread breakdowns didn't help
because multiple processes were running on each CPU and no
single process consumed an entire CPU.

The patch converts the list of CPUs returned by cpu_map__new
into a bitmap for fast lookup. I wanted to use -C to be
consistent with perf top/record/stat, but unfortunately perf
report already uses -C <comms>.

 v2: Incorporate suggestions from David Ahern:
	- Added -c to perf script
	- Check that SAMPLE_CPU is set when -c is used
	- Update documentation

 v3: Create perf_session__cpu_bitmap()

Signed-off-by: Anton Blanchard <anton@samba.org>
Acked-by: David Ahern <dsahern@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Link: http://lkml.kernel.org/r/20110704215750.11647eb9@kryten
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 10:44:44 +02:00
Ingo Molnar
343a031f3c Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/core 2011-07-01 11:51:58 +02:00
Ingo Molnar
10e6962765 Merge commit 'v3.0-rc5' into perf/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 10:28:46 +02:00
Frederic Weisbecker
fd8ea21276 perf tools: Allow sort dimensions to be registered more than once
So that the parent sort dimension can be registered twice: once
if we add it as an explicit sort dimension (-s parent) and twice
if we request a parent filter (-p foo).

We'll have only one parent sort dimension in the end but this
allows to override the default parent filter with we gave in "-p"
option. The goal of this is to prepare to allow the use of
"-s parent" and "-p foo" at the same time, ie: sort by filtered
parent.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:41 +02:00
Frederic Weisbecker
e84d21227c perf tools: Don't display ignored entries on stdio ui
As for newt ui, don't display entries that have been marked
as ignored.

The practical current effect of this is to make parent
filtering really working. Before, entries that were ignored
were given a null parent but were still displayed. This
resulted in some weird effects:

 # Overhead      Command      Shared Object        Symbol
 # ........  ...........  .................  ............
 #
^A
                   |
                   --- __lock_acquire
                      |
                      |--95.97%-- lock_acquire
                      |          |
                      |          |--30.75%-- _raw_spin_lock

Discard these from the stdio display.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:33 +02:00
Frederic Weisbecker
2fd701bc78 perf tools: Remove sort print helpers declarations
These are probably some old leftovers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:26:19 +02:00
Frederic Weisbecker
872a878fb1 perf tools: Make sort operations static
These don't need to be globally visible.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Sam Liao <phyomh@gmail.com>
2011-06-30 00:25:12 +02:00
Sam Liao
d797fdc5c5 perf tools: Add inverted call graph report support.
Add "caller/callee" option to support inverted butterfly report,
in the inverted report (with caller option), the call graph start
from the callee's ancestor. Users can use such view to catch system's
performance bottleneck from a sysprof like view. Using this option
with specified sort order like pid gives us high level view of call
graph statistics.

Also add "-G" alias for inverted call graph.

Signed-off-by: Sam Liao <phyomh@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2011-06-30 00:24:30 +02:00
Linus Torvalds
357ed6b1a1 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Move RCU_BOOST #ifdefs to header file
  rcu: use softirq instead of kthreads except when RCU_BOOST=y
  rcu: Use softirq to address performance regression
  rcu: Simplify curing of load woes
2011-06-19 08:56:56 -07:00
Linus Torvalds
7cc2ed0589 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  kbuild: Call depmod.sh via shell
  perf: clear out make flags when calling kernel make kernelver
2011-06-16 10:26:58 -07:00
Ingo Molnar
b4f9f2b64a Merge commit 'v3.0-rc3' into perf/core
Merge reason: add the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-16 13:23:22 +02:00
Andy Whitcroft
37aa9a2eb4 perf: clear out make flags when calling kernel make kernelver
When generating the perf version from the kernel version using 'make
kernelver' it is necessary to clear out any MAKEFLAGS otherwise they may
trigger additional output which pollute the contents.

Signed-off-by: Andy Whitcroft <apw@canonical.com>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-15 22:12:55 +02:00
Shaohua Li
09223371de rcu: Use softirq to address performance regression
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.

The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread.  A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.

Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler.  Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling.  (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime.  And "the meantime" might well be forever.)

This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work.  RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case.  This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-06-14 15:25:39 -07:00
Linus Torvalds
6aecceccf5 Merge branch 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6:
  perf: Use make kernelversion instead of parsing the Makefile
  kbuild: Hack for depmod not handling X.Y versions
  kbuild: Move depmod call to a separate script
  kbuild: Fix <linux/version.h> for empty SUBLEVEL or PATCHLEVEL
  kbuild: Fix KERNELVERSION for empty SUBLEVEL or PATCHLEVEL
  kbuild: silence Nothing to be done for 'all' message
2011-06-09 16:27:42 -07:00
Michal Marek
5d61b9fd19 perf: Use make kernelversion instead of parsing the Makefile
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Michal Marek <mmarek@suse.cz>
2011-06-09 23:05:54 +02:00
Frederic Weisbecker
b273fa9716 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:09:22 -03:00
Arnaldo Carvalho de Melo
56722381b8 perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:52 -03:00
Arnaldo Carvalho de Melo
9c850d6c4b perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:07:01 -03:00
Arnaldo Carvalho de Melo
d21cc9f67d perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-03 10:05:23 -03:00
David Ahern
7cec092238 perf script: Add printing of sample address
Resolve to a function or variable if possible and if the sym option is
enabled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306782503-22002-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:31:01 -03:00
David Ahern
610723f24e perf script: Make printing of dso a separate field option
The 'sym' option displays both the function name and the DSO it comes
from. Split the display of the dso into a separate option.  This allows
display of the ip address and symbol without the dso, thus shortening
line lengths - and decluttering the output a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-3-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:29:14 -03:00
David Ahern
787bef174f perf script: "sym" field really means show IP data
Currently the "sym" output field is used to dump instruction pointers
and callchain stack. Sample addresses can also be converted to symbols,
so the meaning of "sym" needs to be fixed. This patch adds an "ip"
option and if it is selected the user can also opt to dump symbols for
them. If the user opts to dump IP without syms only the address is
shown.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306528124-25861-2-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:28:34 -03:00
David Ahern
2cee77c450 perf stat: clarify unsupported events from uncounted events
perf stat continues running even if the event list contains counters
that are not supported. The resulting output then contains <not counted>
for those events which gets confusing as to which events are supported,
but not counted and which are not supported.

Before:

perf stat -ddd -- sleep 1

      Performance counter stats for 'sleep 1':

          0.571283 task-clock                #    0.001 CPUs utilized
                 1 context-switches          #    0.002 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.275 M/sec
         1,037,707 cycles                    #    1.816 GHz
     <not counted> stalled-cycles-frontend
     <not counted> stalled-cycles-backend
           654,499 instructions              #    0.63  insns per cycle
           136,129 branches                  #  238.286 M/sec
     <not counted> branch-misses
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
     <not counted> L1-dcache-prefetch-misses

       1.001004836 seconds time elapsed

After:

perf stat -ddd -- sleep 1

 Performance counter stats for 'sleep 1':

          1.350326 task-clock                #    0.001 CPUs utilized
                 2 context-switches          #    0.001 M/sec
                 0 CPU-migrations            #    0.000 M/sec
               157 page-faults               #    0.116 M/sec
            11,986 cycles                    #    0.009 GHz
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
           496,986 instructions              #   41.46  insns per cycle
           138,065 branches                  #  102.246 M/sec
             7,245 branch-misses             #    5.25% of all branches
     <not counted> L1-dcache-loads
     <not counted> L1-dcache-load-misses
     <not counted> LLC-loads
     <not counted> LLC-load-misses
     <not counted> L1-icache-loads
     <not counted> L1-icache-load-misses
     <not counted> dTLB-loads
     <not counted> dTLB-load-misses
     <not counted> iTLB-loads
     <not counted> iTLB-load-misses
     <not counted> L1-dcache-prefetches
   <not supported> L1-dcache-prefetch-misses

       1.002397333 seconds time elapsed

v1->v2:
changed supported type from int to bool

v2->v3
fixed vertical alignment of new struct element

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306767359-13221-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:26:15 -03:00
Frederic Weisbecker
64348153c6 perf python: Cleanup useless double NULL termination in method arg names
The list of methods argument names only needs to be NULL terminated
once. Remove the second ones.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-2-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:26 -03:00
Frederic Weisbecker
e95cc02880 perf python: Fix argument name list of read_on_cpu()
Mandatory arguments need to be present in the argument name list, as
well as optional arguments, otherwise python barfs:

	# ./python/twatch.py
	Traceback (most recent call last):
	  File "./python/twatch.py", line 41, in <module>
	    main()
	  File "./python/twatch.py", line 32, in main
	    event = evlist.read_on_cpu(cpu)
	RuntimeError: more argument specifiers than keyword list entries

Hence, add cpu to the name list.

Cc: David Ahern <daahern@cisco.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/1301588863-20210-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 13:21:07 -03:00
Arnaldo Carvalho de Melo
c2a70653af perf evlist: Don't die if sample_{id_all|type} is invalid
Fixes two more cases where the python binding would not load:

. Not finding die(), which it shouldn't anyway, not good to just stop the
  world because some particular perf.data file is invalid, just propagate
  the error to the caller.

. Not finding perf_sample_size: fix it by moving it from event.c to evsel,
  where it belongs, as most cases are moving to operate on an evsel object.o

One of the fixed problems:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: perf_sample_size
>>>
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-1hkj7b2cvgbfnoizsekjb6c9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 11:04:54 -03:00
Arnaldo Carvalho de Melo
5c6970af2f perf python: Use exception to propagate errors
We were using pr_debug to tell the user about not being able to parse a sample
where we should really use the python way of reporting errors: exceptions.

Fixes this problem:

[root@emilia ~]# python
>>> import perf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf
>>>
[root@emilia ~]

As we want to keep the objects linked in the python binding (and in the future
in a shared library) minimal.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-m9dba9kaluas0kq8r58z191c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:55:10 -03:00
Arnaldo Carvalho de Melo
bccdaba044 perf evlist: Remove dependency on debug routines
So far we avoided having to link debug.o in the python binding, keep it
that way by not using ui__warning() in evlist.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-4wtew8hd3g7ejnlehtspys2t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-06-02 10:41:41 -03:00
David Ahern
4af4c9550c perf events: initialize fd array to -1 instead of 0
perf_evsel__alloc_fd allocates an array of file descriptors with the
memory initialized to 0. The array has dimensions for cpus and threads.

Later, __perf_evsel__open calls sys_perf_event_open for each cpu and thread
dimensions. If the open fails for any of the cpus or threads then the fd's
for this event are closed and the fd entry in the array is set to -1. Now,
if the first attempt fails for the event (e.g., the event is not supported)
the remaining dimensions (cpu > 0 and thread > 0) are not touched and left
at the initialized value of 0.

builtin-stat catches ENOENT and ENOSYS failures and allows the command to
continue. The end result is that stat attempts to read from an fd of 0 which
of course is stdin and so the command hangs until you type ctrl-D.

Resolve by initializing the array to -1 since an fd < 0 is already
handled.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1306511914-8016-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-27 16:02:12 -03:00
Arnaldo Carvalho de Melo
75911c9bd1 perf tools: Fix build on older systems
Where /usr/include/linux/const.h is not present, e.g. RHEL5.

Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-ypcw2mu0w7dl1rrc6ncz3pee@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:16:29 -03:00
Arnaldo Carvalho de Melo
ec80fde746 perf symbols: Handle /proc/sys/kernel/kptr_restrict
Perf uses /proc/modules to figure out where kernel modules are loaded.

With the advent of kptr_restrict, non root users get zeroes for all module
start addresses.

So check if kptr_restrict is non zero and don't generate the syntethic
PERF_RECORD_MMAP events for them.

Warn the user about it in perf record and in perf report.

In perf report the reference relocation symbol being zero means that
kptr_restrict was set, thus /proc/kallsyms has only zeroed addresses, so don't
use it to fixup symbol addresses when using a valid kallsyms (in the buildid
cache) or vmlinux (in the vmlinux path) build-id located automatically or
specified by the user.

Provide an explanation about it in 'perf report' if kernel samples were taken,
checking if a suitable vmlinux or kallsyms was found/specified.

Restricted /proc/kallsyms don't go to the buildid cache anymore.

Example:

 [acme@emilia ~]$ perf record -F 100000 sleep 1

 WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check
 /proc/sys/kernel/kptr_restrict.

 Samples in kernel functions may not be resolved if a suitable vmlinux file is
 not found in the buildid cache or in the vmlinux path.

 Samples in kernel modules won't be resolved at all.

 If some relocation was applied (e.g. kexec) symbols may be misresolved even
 with a suitable vmlinux or kallsyms file.

 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.005 MB perf.data (~231 samples) ]
 [acme@emilia ~]$

 [acme@emilia ~]$ perf report --stdio
 Kernel address maps (/proc/{kallsyms,modules}) were restricted,
 check /proc/sys/kernel/kptr_restrict before running 'perf record'.

 If some relocation was applied (e.g. kexec) symbols may be misresolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object                 Symbol
 # ........  .......  .................  .....................
 #
    20.24%    sleep  [kernel.kallsyms]  [k] page_fault
    20.04%    sleep  [kernel.kallsyms]  [k] filemap_fault
    19.78%    sleep  [kernel.kallsyms]  [k] __lru_cache_add
    19.69%    sleep  ld-2.12.so         [.] memcpy
    14.71%    sleep  [kernel.kallsyms]  [k] dput
     4.70%    sleep  [kernel.kallsyms]  [k] flush_signal_handlers
     0.73%    sleep  [kernel.kallsyms]  [k] perf_event_comm
     0.11%    sleep  [kernel.kallsyms]  [k] native_write_msr_safe

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

This is because it found a suitable vmlinux (build-id checked) in
/lib/modules/2.6.39-rc7+/build/vmlinux (use -v in perf report to see the long
file name).

If we remove that file from the vmlinux path:

 [root@emilia ~]# mv /lib/modules/2.6.39-rc7+/build/vmlinux \
		     /lib/modules/2.6.39-rc7+/build/vmlinux.OFF
 [acme@emilia ~]$ perf report --stdio
 [kernel.kallsyms] with build id 57298cdbe0131f6871667ec0eaab4804dcf6f562
 not found, continuing without symbols

 Kernel address maps (/proc/{kallsyms,modules}) were restricted, check
 /proc/sys/kernel/kptr_restrict before running 'perf record'.

 As no suitable kallsyms nor vmlinux was found, kernel samples can't be
 resolved.

 Samples in kernel modules can't be resolved as well.

 # Events: 13  cycles
 #
 # Overhead  Command      Shared Object  Symbol
 # ........  .......  .................  ......
 #
    80.31%    sleep  [kernel.kallsyms]  [k] 0xffffffff8103425a
    19.69%    sleep  ld-2.12.so         [.] memcpy

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [acme@emilia ~]$

Reported-by: Stephane Eranian <eranian@google.com>
Suggested-by: David Miller <davem@davemloft.net>
Cc: Dave Jones <davej@redhat.com>
Cc: David Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Kees Cook <kees.cook@canonical.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/n/tip-mt512joaxxbhhp1odop04yit@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-26 11:15:25 -03:00
Linus Torvalds
5214638384 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample type size calculation in 32 bits archs
  profile: Use vzalloc() rather than vmalloc() & memset()
2011-05-23 21:20:48 -07:00
Frederic Weisbecker
0f61f3e4db perf tools: Fix sample type size calculation in 32 bits archs
The shift used here to count the number of bits set in
the mask doesn't work above the low part for archs that
are not 64 bits.

Fix the constant used for the shift.

This fixes a 32-bit perf top failure reported by Eric Dumazet:

	Can't parse sample, err = -14
	Can't parse sample, err = -14
	...

Reported-and-tested-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
Link: http://lkml.kernel.org/r/1306200686-17317-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-24 04:33:24 +02:00
Linus Torvalds
19504828b4 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tools: Fix sample size bit operations
  perf tools: Fix ommitted mmap data update on remap
  watchdog: Change the default timeout and configure nmi watchdog period based on watchdog_thresh
  watchdog: Disable watchdog when thresh is zero
  watchdog: Only disable/enable watchdog if neccessary
  watchdog: Fix rounding bug in get_sample_period()
  perf tools: Propagate event parse error handling
  perf tools: Robustify dynamic sample content fetch
  perf tools: Pre-check sample size before parsing
  perf tools: Move evlist sample helpers to evlist area
  perf tools: Remove junk code in mmap size handling
  perf tools: Check we are able to read the event size on mmap
2011-05-23 09:25:52 -07:00
Linus Torvalds
57d19e80f4 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
  b43: fix comment typo reqest -> request
  Haavard Skinnemoen has left Atmel
  cris: typo in mach-fs Makefile
  Kconfig: fix copy/paste-ism for dell-wmi-aio driver
  doc: timers-howto: fix a typo ("unsgined")
  perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
  md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
  treewide: fix a few typos in comments
  regulator: change debug statement be consistent with the style of the rest
  Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
  audit: acquire creds selectively to reduce atomic op overhead
  rtlwifi: don't touch with treewide double semicolon removal
  treewide: cleanup continuations and remove logging message whitespace
  ath9k_hw: don't touch with treewide double semicolon removal
  include/linux/leds-regulator.h: fix syntax in example code
  tty: fix typo in descripton of tty_termios_encode_baud_rate
  xtensa: remove obsolete BKL kernel option from defconfig
  m68k: fix comment typo 'occcured'
  arch:Kconfig.locks Remove unused config option.
  treewide: remove extra semicolons
  ...
2011-05-23 09:12:26 -07:00
Frederic Weisbecker
3cb6d15408 perf tools: Fix sample size bit operations
What we want is to count the number of bits in the mask,
not some other random operation written in the middle
of the night.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-2-git-send-email-fweisbec@gmail.com
[ Fixed perf_event__names[] alignment which was nearby and hurting my eyes ... ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:26:36 +02:00
Frederic Weisbecker
998bedc8c5 perf tools: Fix ommitted mmap data update on remap
Commit eac9eacee1 "perf tools: Check we are able to read the event
size on mmap" brought a check to ensure we can read the size of the
event before dereferencing it, and do a remap otherwise to move the
buffer forward.

However that remap was ommitting all the necessary work to
update the new page offset, head, and to unmap previous pages,
etc...

To fix this, gather all the code that fetches the event in a
seperate helper which does all the necessary checks about the
header/event size and tells us anytime a remap is needed.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1306148788-6179-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-23 13:22:57 +02:00
Ingo Molnar
3ac1bbcf13 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into perf/urgent
Conflicts:
	tools/perf/builtin-top.c

Semantic conflict:
	util/include/linux/list.h        # fix prefetch.h removal fallout

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-22 10:10:01 +02:00
Frederic Weisbecker
5538becaec perf tools: Propagate event parse error handling
Better handle event parsing error by propagating the details
in upper layers or by dumping some failure message. So that
the user knows he has some crazy events in the batch.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:49 +02:00
Frederic Weisbecker
98e1da905c perf tools: Robustify dynamic sample content fetch
Ensure the size of the dynamic fields such as callchains
or raw events don't overlap the whole event boundaries.

This prevents from dereferencing junk if the given size of
the callchain goes too eager.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:48 +02:00
Frederic Weisbecker
a285412479 perf tools: Pre-check sample size before parsing
Check that the total size of the sample fields having a fixed
size do not exceed the one of the whole event. This robustifies
the sample parsing.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:38:36 +02:00
Frederic Weisbecker
74429964d8 perf tools: Move evlist sample helpers to evlist area
These APIs should belong to evlist.c as they may not be
exclusively tied to the headers.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com
2011-05-22 03:12:29 +02:00
Frederic Weisbecker
dd5f5fd108 perf tools: Remove junk code in mmap size handling
size is overriden later and used only then. Those
lines are only junk, probably a leftover.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:28 +02:00
Frederic Weisbecker
eac9eacee1 perf tools: Check we are able to read the event size on mmap
Check we have enough mmaped space to read the current event
size from its headers, otherwise we may dereference some
hell there.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
2011-05-22 03:12:13 +02:00
Linus Torvalds
268bb0ce3e sanitize <linux/prefetch.h> usage
Commit e66eed651f ("list: remove prefetching from regular list
iterators") removed the include of prefetch.h from list.h, which
uncovered several cases that had apparently relied on that rather
obscure header file dependency.

So this fixes things up a bit, using

   grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
   grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

to guide us in finding files that either need <linux/prefetch.h>
inclusion, or have it despite not needing it.

There are more of them around (mostly network drivers), but this gets
many core ones.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-20 12:50:29 -07:00
Linus Torvalds
eb04f2f04e Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits)
  Revert "rcu: Decrease memory-barrier usage based on semi-formal proof"
  net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree
  batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu
  batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree()
  batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu
  net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu()
  net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu()
  net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu()
  perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu()
  perf,rcu: convert call_rcu(free_ctx) to kfree_rcu()
  net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu()
  net,rcu: convert call_rcu(net_generic_release) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu()
  net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu()
  security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu()
  net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu()
  net,rcu: convert call_rcu(xps_map_release) to kfree_rcu()
  net,rcu: convert call_rcu(rps_map_release) to kfree_rcu()
  ...
2011-05-19 18:14:34 -07:00
Linus Torvalds
df48d8716e Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (107 commits)
  perf stat: Add more cache-miss percentage printouts
  perf stat: Add -d -d and -d -d -d options to show more CPU events
  ftrace/kbuild: Add recordmcount files to force full build
  ftrace: Add self-tests for multiple function trace users
  ftrace: Modify ftrace_set_filter/notrace to take ops
  ftrace: Allow dynamically allocated function tracers
  ftrace: Implement separate user function filtering
  ftrace: Free hash with call_rcu_sched()
  ftrace: Have global_ops store the functions that are to be traced
  ftrace: Add ops parameter to ftrace_startup/shutdown functions
  ftrace: Add enabled_functions file
  ftrace: Use counters to enable functions to trace
  ftrace: Separate hash allocation and assignment
  ftrace: Create a global_ops to hold the filter and notrace hashes
  ftrace: Use hash instead for FTRACE_FL_FILTER
  ftrace: Replace FTRACE_FL_NOTRACE flag with a hash of ignored functions
  perf bench, x86: Add alternatives-asm.h wrapper
  x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address limit
  x86, mem: memset_64.S: Optimize memset by enhanced REP MOVSB/STOSB
  x86, mem: memmove_64.S: Optimize memmove by enhanced REP MOVSB/STOSB
  ...
2011-05-19 17:36:08 -07:00
Ingo Molnar
b313207286 perf bench, x86: Add alternatives-asm.h wrapper
perf bench needs this to build the kernel's memcpy routine:

In file included from bench/mem-memcpy-x86-64-asm.S:2:0:
bench/../../../arch/x86/lib/memcpy_64.S:7:33: fatal error: asm/alternative-asm.h: No such file or directory

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/n/tip-c5d41xibgullk8h2280q4gv0@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-18 21:00:44 +02:00
Stephane Eranian
94692349c4 perf: Fix multi-event parsing bug
This patch fixes an issue with event parsing.
The following commit appears to have broken the
ability to specify a comma separated list of events:

   commit ceb53fbf6d
   Author: Ingo Molnar <mingo@elte.hu>
   Date:   Wed Apr 27 04:06:33 2011 +0200

       perf stat: Fail more clearly when an invalid modifier is specified

This patch fixes this while preserving the desired effect:

$ perf stat -e instructions:u,instructions:k ls /dev/null /dev/null

 Performance counter stats for 'ls /dev/null':

            365956 instructions:u           #    0.00  insns per cycle
            731806 instructions:k           #    0.00  insns per cycle

        0.001108862  seconds time elapsed

$ perf stat -e task-clock-msecs true
invalid event modifier: '-msecs'
Run 'perf list' for a list of valid events and modifiers

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: acme@redhat.com
Cc: peterz@infradead.org
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20110517133619.GA6999@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-17 20:45:36 +02:00
Arnaldo Carvalho de Melo
aece948f5d perf evlist: Fix per thread mmap setup
The PERF_EVENT_IOC_SET_OUTPUT ioctl was returning -EINVAL when using
--pid when monitoring multithreaded apps, as we can only share a ring
buffer for events on the same thread if not doing per cpu.

Fix it by using per thread ring buffers.

Tested with:

[root@felicio ~]# tuna -t 26131 -CP | nl
  1                      thread       ctxt_switches
  2    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
  3 26131   OTHER     0      0,1  10814276      2397830 chromium-browse
  4  642    OTHER     0      0,1     14688            0 chromium-browse
  5  26148  OTHER     0      0,1    713602       115479 chromium-browse
  6  26149  OTHER     0      0,1    801958         2262 chromium-browse
  7  26150  OTHER     0      0,1   1271128          248 chromium-browse
  8  26151  OTHER     0      0,1         3            0 chromium-browse
  9  27049  OTHER     0      0,1     36796            9 chromium-browse
 10  618    OTHER     0      0,1     14711            0 chromium-browse
 11  661    OTHER     0      0,1     14593            0 chromium-browse
 12  29048  OTHER     0      0,1     28125            0 chromium-browse
 13  26143  OTHER     0      0,1   2202789          781 chromium-browse
[root@felicio ~]#

So 11 threads under pid 26131, then:

[root@felicio ~]# perf record -F 50000 --pid 26131

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
  1 7fa4a2538000-7fa4a25b9000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  2 7fa4a25b9000-7fa4a263a000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  3 7fa4a263a000-7fa4a26bb000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  4 7fa4a26bb000-7fa4a273c000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  5 7fa4a273c000-7fa4a27bd000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  6 7fa4a27bd000-7fa4a283e000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  7 7fa4a283e000-7fa4a28bf000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  8 7fa4a28bf000-7fa4a2940000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
  9 7fa4a2940000-7fa4a29c1000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 10 7fa4a29c1000-7fa4a2a42000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
 11 7fa4a2a42000-7fa4a2ac3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

11 mmaps, one per thread since we didn't specify any CPU list, so we need one
mmap per thread and:

[root@felicio ~]# perf record -F 50000 --pid 26131
^M
^C[ perf record: Woken up 79 times to write data ]
[ perf record: Captured and wrote 20.614 MB perf.data (~900639 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	 371310 26131
     2	  96516 26148
     3	  95694 26149
     4	  95203 26150
     5	   7291 26143
     6	     87 27049
     7	     76 661
     8	     60 29048
     9	     47 618
    10	     43 642
[root@felicio ~]#

Ok, one of the threads, 26151 was quiescent, so no samples there, but all the
others are there.

Then, if I specify one CPU:

[root@felicio ~]# perf record -F 50000 --pid 26131 --cpu 1
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.680 MB perf.data (~29730 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   8444 26131
     2	   2584 26149
     3	   2518 26148
     4	   2324 26150
     5	    123 26143
     6	      9 661
     7	      9 29048
[root@felicio ~]#

This machine has two cores, so fewer threads appeared on the radar, and:

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
 1 7f484b922000-7f484b9a3000 rwxs 00000000 00:09 4064 anon_inode:[perf_event]
[root@felicio ~]#

Just one mmap, as now we can use just one per-cpu buffer instead of the
per-thread needed in the previous case.

For global profiling:

[root@felicio ~]# perf record -F 50000 -a
^C[ perf record: Woken up 26 times to write data ]
[ perf record: Captured and wrote 7.128 MB perf.data (~311412 samples) ]

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7fb49b435000-7fb49b4b6000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
     2	7fb49b4b6000-7fb49b537000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#

It uses per-cpu buffers.

For just one thread:

[root@felicio ~]# perf record -F 50000 --tid 26148
^C[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.330 MB perf.data (~14426 samples) ]

[root@felicio ~]# perf report -D | grep PERF_RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort -n | uniq -c | sort -nr | nl
     1	   9969 26148
[root@felicio ~]#

[root@felicio ~]# grep perf_event /proc/`pidof perf`/maps | nl
     1	7f286a51b000-7f286a59c000 rwxs 00000000 00:09 4064                       anon_inode:[perf_event]
[root@felicio ~]#

Tested-by: David Ahern <dsahern@gmail.com>
Tested-by: Lin Ming <ming.m.lin@intel.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15 10:02:14 -03:00
Arnaldo Carvalho de Melo
b901941819 perf tools: Honour the cpu list parameter when also monitoring a thread list
The perf_evlist__create_maps was discarding the --cpu parameter when a
--pid or --tid was specified, fix that.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110426204401.GB1746@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-05-15 09:32:52 -03:00
Lin Ming
2b348a7798 perf probe: Fix the missed parameter initialization
pubname_callback_param::found should be initialized to 0 in
fastpath lookup, the structure is on the stack and
uninitialized otherwise.

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Link: http://lkml.kernel.org/r/1304066518-30420-2-git-send-email-ming.m.lin@intel.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-10 17:06:23 +02:00
Jesper Juhl
2d9ff66da6 perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
Including "../../annotate.h" once in
tools/perf/util/ui/browsers/annotate.c is enough. No need to do it twice.

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-05-10 10:20:07 +02:00
Paul E. McKenney
a26ac2455f rcu: move TREE_RCU from softirq to kthread
If RCU priority boosting is to be meaningful, callback invocation must
be boosted in addition to preempted RCU readers.  Otherwise, in presence
of CPU real-time threads, the grace period ends, but the callbacks don't
get invoked.  If the callbacks don't get invoked, the associated memory
doesn't get freed, so the system is still subject to OOM.

But it is not reasonable to priority-boost RCU_SOFTIRQ, so this commit
moves the callback invocations to a kthread, which can be boosted easily.

Also add comments and properly synchronized all accesses to
rcu_cpu_kthread_task, as suggested by Lai Jiangshan.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2011-05-05 23:16:54 -07:00
Ingo Molnar
947b4ad1d1 perf list: Fix max event string size
Recent stalled-cycles event names were larger than the 40 chars printout
used by perf list.

Extend that, make it robust for future extensions and also adjust alignments
in face of wider event names.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n009io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 22:52:54 +02:00
Ingo Molnar
d3d1e86da0 perf stat: Analyze front-end and back-end stall counts
Sample output:

 Performance counter stats for './loop_1b':

        873.691065 task-clock               #    1.000 CPUs utilized
                 1 context-switches         #    0.000 M/sec
                 1 CPU-migrations           #    0.000 M/sec
                96 page-faults              #    0.000 M/sec
     2,012,637,222 cycles                   #    2.304 GHz                      (66.58%)
     1,001,397,911 stalled-cycles-frontend  #   49.76% frontend cycles idle     (66.58%)
         7,523,398 stalled-cycles-backend   #    0.37%  backend cycles idle     (66.76%)
     2,004,551,046 instructions             #    1.00  insns per cycle
                                            #    0.50  stalled cycles per insn  (66.80%)
     1,001,304,992 branches                 # 1146.063 M/sec                    (66.76%)
            39,453 branch-misses            #    0.00% of all branches          (66.64%)

        0.874046121  seconds time elapsed

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n003io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 14:35:55 +02:00
Ingo Molnar
129c04cb8c perf tools: Add front-end and back-end stalled cycles support
Update perf tooling to deal with front-end and back-end stalled cycles events.

Add both the default 'perf stat' output.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-7y40wib8n002io7hjpn1dsrm@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-29 14:35:49 +02:00
Ingo Molnar
1fc570ad89 perf stat: Add stalled cycles to the default output
The new default output looks like this:

 Performance counter stats for './loop_1b_instructions':

        236.010686 task-clock               #    0.996 CPUs utilized
                 0 context-switches         #    0.000 M/sec
                 0 CPU-migrations           #    0.000 M/sec
                99 page-faults              #    0.000 M/sec
       756,487,646 cycles                   #    3.205 GHz
       354,938,996 stalled-cycles           #   46.92% of all cycles are idle
     1,001,403,797 instructions             #    1.32  insns per cycle
                                            #    0.35  stalled cycles per insn
       100,279,773 branches                 #  424.895 M/sec
            12,646 branch-misses            #    0.013 % of all branches

        0.236902540  seconds time elapsed

We dropped cache-refs and cache-misses and added stalled-cycles - this is a
more generic "how well utilized is the CPU" metric.

If the stalled-cycles ratio is too high then more specific measurements can be
taken to figure out the source of the inefficiency.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-pbpl2l4mn797s69bclfpwkwn@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:57 +02:00
Ingo Molnar
749141d926 perf stat: Make all displayed event names parseable as well
Right now we display this by default:

          0.202204 task-clock-msecs         #      0.282 CPUs
                 0 context-switches         #      0.000 M/sec
                 0 CPU-migrations           #      0.000 M/sec
                85 page-faults              #      0.420 M/sec

The task-clock-msecs event cannot actually be passed back as an
event name, the event name we recognize is 'task-clock'.

So change the output of the cpu-clock and task-clock events
to be idempotent.

( Units should be printed out in the right-side column, if needed. )

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-lexrnbzy09asscgd4f7oac4i@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
ceb53fbf6d perf stat: Fail more clearly when an invalid modifier is specified
Currently we fail without printing any error message on "perf stat -e task-clock-msecs".

The reason is that the task-clock event is matched and the "-msecs" postfix is assumed
to be an event modifier - but is not recognized.

This patch changes the code to be more informative:

 $ perf stat -e task-clock-msecs true
 invalid event modifier: '-msecs'
 Run 'perf list' for a list of valid events and modifiers

And restructures the return value of parse_event_modifier() to allow
the printing of all variants of invalid event modifiers.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-wlaw3dvz1ly6wple8l52cfca@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
b908debd4e perf tools: Accept case-insensitive symbolic event variants
We currently fail on something like '-e CPU-migrations', with:

  invalid or unsupported event: 'CPU-migrations'

While 'CPU-migrations' is how we actually print out the event
in the default perf stat output:

 Performance counter stats for 'true':

          0.202204 task-clock-msecs         #      0.282 CPUs
                 0 context-switches         #      0.000 M/sec
                 0 CPU-migrations           #      0.000 M/sec

So change the matching to be case-insensitive.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-omcm3edjjtx83a4kh2e244se@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:55 +02:00
Ingo Molnar
94403f8863 perf events: Add stalled cycles generic event - PERF_COUNT_HW_STALLED_CYCLES
The new PERF_COUNT_HW_STALLED_CYCLES event tries to approximate
cycles the CPU does nothing useful, because it is stalled on a
cache-miss or some other condition.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/n/tip-fue11vymwqsoo5to72jxxjyl@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-26 20:04:53 +02:00
David Ahern
9cbdb70209 perf script: improve validation of sample attributes for output fields
Check for required sample attributes using evsel rather than sample_type
in the session header. If the attribute for a default field is not
present for the event type (e.g., new command operating on file from
older kernel) the field is removed from the output list.

Expected event types must exist. For example, if a user specifies

  -f trace:time,trace -f sw:time,cpu,sym

the perf.data file must contain both tracepoints and software events
(ie., it is an error if either does not exist in the file).

Attribute checking is done once at the beginning of perf-script rather
than for each sample.

v1 -> v2:
- addressed comments from acme

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1302148460-570-1-git-send-email-daahern@cisco.com
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-20 07:27:52 -03:00
Arnaldo Carvalho de Melo
aeafcbaf4f perf symbols: Give more useful names to 'self' parameters
One more installment on an area that is mostly dormant.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-19 08:18:35 -03:00
Ingo Molnar
68d2cf25d3 Merge branch 'perf/urgent' into perf/core
Merge reason: we'll be queueing up dependent changes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-19 07:56:17 +02:00
Arnaldo Carvalho de Melo
5d2cd90922 perf evsel: Fix use of inherit
perf stat doesn't mmap and its perfectly fine for it to use task-bound
counters with inheritance.

So set the attr.inherit on the caller and leave the syscall itself to
validate it.

When the mmap fails perf_evlist__mmap will just emit a warning if this
is the failure reason.

Reported-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Link: http://lkml.kernel.org/r/20110414170121.GC3229@ghostprotocols.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-15 12:52:28 -03:00
Lin Ming
db9a9cbc81 perf hists browser: Fix seg fault when annotate null symbol
In hists browser, press hotkey 'a' to annotate current symbol.

Now it causes segment fault if 'a' is pressed on a null symbol.

Here are 2 small bugs:
- In perf_evsel__hists_browse, the condition check after 'a' is pressed
  is not correct, we should check ->sym instead of ->map.
- In symbol__tui_annotate we must check whether sym is NULL or not
  before getting annotation structure.

This patch fixes above 2 small bugs.

Link: http://lkml.kernel.org/r/1302244286.4106.36.camel@minggr.sh.intel.com
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-15 12:51:49 -03:00
Eric Dumazet
621d26567f perf: Fix a build error with some GCC versions
Fix this:

 util/cgroup.c: In function ‘open_cgroup’:
 util/cgroup.c:16:16: error: ‘saved_ptr’ may be used uninitialized in this function
 util/cgroup.c:16:16: note: ‘saved_ptr’ was declared here

Apparently newer GCC (4.6) can figure out that this variable is properly
initialized - but some versions of GCC (such as 4.5.2) need help.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-04-08 17:40:21 +02:00
Linus Torvalds
8b9686ff4d Merge branches 'x86-fixes-for-linus', 'sched-fixes-for-linus', 'timers-fixes-for-linus', 'irq-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86-32, fpu: Fix FPU exception handling on non-SSE systems
  x86, hibernate: Initialize mmu_cr4_features during boot
  x86-32, NUMA: Fix ACPI NUMA init broken by recent x86-64 change
  x86: visws: Fixup irq overhaul fallout

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Clean up rebalance_domains() load-balance interval calculation

* 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86/mrst/vrtc: Fix boot crash in mrst_rtc_init()
  rtc, x86/mrst/vrtc: Fix boot crash in rtc_read_alarm()

* 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Fix cpumask leak in __setup_irq()

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf probe: Fix listing incorrect line number with inline function
  perf probe: Fix to find recursively inlined function
  perf probe: Fix multiple --vars options behavior
  perf probe: Fix to remove redundant close
  perf probe: Fix to ensure function declared file
2011-04-07 12:12:58 -07:00
Linus Torvalds
42933bac11 Merge branch 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6
* 'for-linus2' of git://git.profusion.mobi/users/lucas/linux-2.6:
  Fix common misspellings
2011-04-07 11:14:49 -07:00
Masami Hiramatsu
1d46ea2a6a perf probe: Fix listing incorrect line number with inline function
Fix a bug showing incorrect line number when a probe is put on the head of an
inline function. This patch updates find_perf_probe_point() and introduces new
rules to get correct line number.

 - If debuginfo doesn't have a correct file name, we shouldn't return line
   number too, because, without file name, line number is meaningless.

 - If the address is in a function, it stores the function name and the offset
   from the function entry.

   - If the address is on a line, it tries to get the relative line number from
     the function entry line, except for the address is same as the entry
     address of the function (in this case, the relative line number should
     be 0).

     - If the address is in an inline function entry (call-site), it uses the
       inline function call line number as the line on which the address is.

   - If the address is in an inline function body, it stores the inline
     function name and offset from the inline function call site instead of the
     (non-inlined) function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092605.2132.11629.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:38:12 -03:00
Masami Hiramatsu
1d878083c2 perf probe: Fix to find recursively inlined function
Fix die_find_inlinefunc() to return correct innermost inlined function
at given address. Without this fix, it returns the outermost inlined
function.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092559.2132.78634.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:36:47 -03:00
Masami Hiramatsu
cc446446ff perf probe: Fix multiple --vars options behavior
Fix a bug that perf-probe fails to initialize libdwfl and shows incorrect error
when user gives multiple --vars options.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092553.2132.42691.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:36:04 -03:00
Masami Hiramatsu
f0c4801a17 perf probe: Fix to remove redundant close
Since dwfl_end() closes given fd with dwfl, caller doesn't need to close its fd
when finishing process.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092547.2132.93728.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:35:16 -03:00
Masami Hiramatsu
7d21635ac5 perf probe: Fix to ensure function declared file
Fix to ensure function declared file matches given file name. This fixes
a potential bug.

As I've commented on Lin Ming's fastpath enhancement, decl_file should
be checked on each probe point if user gives a probe point as func@file.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Lin Ming <ming.m.lin@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20110330092541.2132.3584.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-04-05 15:34:53 -03:00
Lucas De Marchi
25985edced Fix common misspellings
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
2011-03-31 11:26:23 -03:00
Ingo Molnar
652d78fd7a Merge commits 'ca6a42586fae' and 'c286c419c784' into perf/urgent
Pick up these two commits from Arnaldo's perf/core tree:

  ca6a42586f: perf tools: Emit clearer message for sys_perf_event_open ENOENT return
  c286c419c7: perf tools: Fixup exit path when not able to open events

As they are really fixes we want to have sooner than laer.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-30 09:08:39 +02:00
Robert Richter
1b7155f7de perf tools: Fix NO_NEWT=1 python build error
Fix the following build error:

     GEN python/perf.so
 In file included from util/evsel.h:10,
                  from util/python.c:6:
 util/hist.h:106:18: error: newt.h: No such file or directory
 error: command 'x86_64-pc-linux-gnu-gcc' failed with exit status 1
 make: *** [python/perf.so] Error 1

by passing BASIC_CFLAGS to setup.py. BASIC_CFLAGS variable contains
the -DNO_NEWT_SUPPORT switch which prevents building python c
extension with newt.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20110329180236.GA19366@erda.amd.com>
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-29 16:46:57 -03:00
David S. Miller
4d43951756 perf symbols: Properly align symbol_conf.priv_size
If symbol_conf.priv_size is not a multiple of "sizeof(u64)" we'll bus
error on sparc64 in symbol__new because the "struct symbol *" pointer
is computed by adding symbol_conf.priv_size to the memory allocated.

We cannot isolate the fix to symbol__new and symbol__delete since the
private area is computed by subtracting the priv_size value from a
"struct symbol" pointer, so then the private area can still be
potentially unaligned.

So, simply align the symbol_conf.priv_size value in symbol__init()

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110328.175849.112593455.davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-29 14:18:39 -03:00
Lin Ming
cd25f8bc26 perf probe: Add fastpath to do lookup by function name
v3 -> v2:
- Make pubname_search_cb more generic
- Add fastpath to find_probes also

v2 -> v1:
- Don't compare file names with cu_find_realpath(...), instead, compare
  them with the name returned by dwarf_decl_file(sp_die)

The vmlinux file may have thousands of CUs.
We can lookup function name from .debug_pubnames section
to avoid the slow loop on CUs.

1. Improvement data for find_line_range

./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
        -s /home/mlin/linux-2.6 \
        --line csum_partial_copy_to_user > tmp.log

before patch applied
=====================
       847,988,276 cycles

        0.355075856  seconds time elapsed

after patch applied
=====================
       206,102,622 cycles

        0.086883555  seconds time elapsed

2. Improvement data for find_probes

./perf stat -e cycles -r 10 -- ./perf probe -k /home/mlin/vmlinux \
        -s /home/mlin/linux-2.6 \
        --vars csum_partial_copy_to_user > tmp.log

before patch applied
=====================
       848,490,844 cycles

        0.355307901  seconds time elapsed

after patch applied
=====================
       205,684,469 cycles

        0.086694010  seconds time elapsed

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel <linux-kernel@vger.kernel.org>
LKML-Reference: <1301041668.14111.52.camel@minggr.sh.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-29 13:40:27 -03:00
Arnaldo Carvalho de Melo
c286c419c7 perf tools: Fixup exit path when not able to open events
We have to deal with the TUI mode in perf top, so that we don't end up
with a garbled screen when, say, a non root user on a machine with a
paranoid setting (the default) tries to use 'perf top'.

Introduce a ui__warning_paranoid() routine shared by top and record that
tells the user the valid values for /proc/sys/kernel/perf_event_paranoid.

Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-29 13:40:27 -03:00
Andrew Lutomirski
6c6804fb2c perf symbols: Fix vsyscall symbol lookup
Perf can't currently trace into the vsyscall page.  It looks like it was
meant to work.

Tested on 2.6.38 and today's -git.

The bug is easy to reproduce.  Compile this:

int main()
{
	int i;
	struct timespec t;
	for(i = 0; i < 10000000; i++)
		clock_gettime(CLOCK_MONOTONIC, &t);
	return 0;
}

and run it through perf record; perf report.  The top entry shows
"[unknown]" and you can't zoom in.

It looks like there are two issues.  The first is a that a test for user
mode executing in kernel space is backwards.  (That's the first hunk
below).  The second (I think) is that something's wrong with the code
that generates lots of little struct dso objects for different sections
-- when it runs on vmlinux it results in bogus long_name values which
cause objdump to fail.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LPU-Reference: <AANLkTikxSw5+wJZUWNz++nL7mgivCh_Zf=2Kq6=f9Ce_@mail.gmail.com>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-28 14:44:15 -03:00
Arnaldo Carvalho de Melo
60e4b10c5a perf symbols: Look at .dynsym again if .symtab not found
The original intent of the code was to repeat the search with
want_symtab = 0. But as the code stands now, we never hit the "default"
case of the switch statement. Which means we never repeat the search.

Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Reported-by: Arun Sharma <asharma@fb.com>
Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Dave Martin <dave.martin@linaro.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23 19:29:46 -03:00
Arnaldo Carvalho de Melo
b25114817a perf build-id: Add quirk to deal with perf.data file format breakage
The a1645ce1 changeset:

"perf: 'perf kvm' tool for monitoring guest performance from host"

Added a field to struct build_id_event that broke the file format.

Since the kernel build-id is the first entry, process the table using
the old format if the well known '[kernel.kallsyms]' string for the
kernel build-id has the first 4 characters chopped off (where the pid_t
sits).

Reported-by: Han Pingtian <phan@redhat.com>
Cc: Avi Kivity <avi@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23 19:29:40 -03:00
Arnaldo Carvalho de Melo
9e69c21082 perf session: Pass evsel in event_ops->sample()
Resolving the sample->id to an evsel since the most advanced tools,
report and annotate, and the others will too when they evolve to
properly support multi-event perf.data files.

Good also because it does an extra validation, checking that the ID is
valid when present. When that is not the case, the overhead is just a
branch + function call (perf_evlist__id2evsel).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-23 19:28:58 -03:00
Josh Hunt
58d406ed6a perf tools: Version incorrect with some versions of grep
Some versions of grep don't treat '\s' properly. When building perf on such
systems and using a kernel tarball the perf version is unable to be determined
from the main kernel Makefile and the user is left with a version of '..'.
Replacing the use of '\s' with '[[:space:]]', which should work in all grep
versions, gives a usable version number.

Reported-by: Tapan Dhimant <tdhimant@akamai.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Tapan Dhimant <tdhimant@akamai.com>
Cc: linux-kernel@vger.kernel.org
Cc: stable@kernel.org
LKML-Reference: <1300241800-30281-1-git-send-email-johunt@akamai.com>
Signed-off-by: Josh Hunt <johunt@akamai.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-16 08:59:50 -03:00
Ingo Molnar
8b7cdd08fe Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent 2011-03-16 13:44:06 +01:00
Linus Torvalds
a926021cb1 Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits)
  perf probe: Clean up probe_point_lazy_walker() return value
  tracing: Fix irqoff selftest expanding max buffer
  tracing: Align 4 byte ints together in struct tracer
  tracing: Export trace_set_clr_event()
  tracing: Explain about unstable clock on resume with ring buffer warning
  ftrace/graph: Trace function entry before updating index
  ftrace: Add .ref.text as one of the safe areas to trace
  tracing: Adjust conditional expression latency formatting.
  tracing: Fix event alignment: skb:kfree_skb
  tracing: Fix event alignment: mce:mce_record
  tracing: Fix event alignment: kvm:kvm_hv_hypercall
  tracing: Fix event alignment: module:module_request
  tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup
  tracing: Remove lock_depth from event entry
  perf header: Stop using 'self'
  perf session: Use evlist/evsel for managing perf.data attributes
  perf top: Don't let events to eat up whole header line
  perf top: Fix events overflow in top command
  ring-buffer: Remove unused #include <linux/trace_irq.h>
  tracing: Add an 'overwrite' trace_option.
  ...
2011-03-15 18:31:30 -07:00
Ingo Molnar
5e814dd597 perf probe: Clean up probe_point_lazy_walker() return value
Newer compilers (gcc 4.6) complains about:

        return ret < 0 ?: 0;

For the following reason:

  util/probe-finder.c: In function ‘probe_point_lazy_walker’:
  util/probe-finder.c:1331:18: error: the omitted middle operand in ?: will always be ‘true’, suggest explicit middle operand [-Werror=parentheses]

And indeed the return value is a somewhat obscure (but correct) value
of 'true', so return 'ret' instead - this is cleaner and unconfuses
GCC as well.

Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-15 20:51:09 +01:00
David Ahern
1424dc9680 perf script: Add support for H/W and S/W events
Custom fields set for each type by prepending field argument with type.
For file with multiple event types (e.g., trace and S/W) display of an
event type suppressed by setting output fields to "".

e.g.,
perf record -ga -e sched:sched_switch -e cpu-clock -c 10000000 -R -- sleep 1
perf script

openssl 11496 [000]  9711.807107: cpu-clock-msecs:
        ffffffff810c22dc arch_local_irq_restore ([kernel.kallsyms])
        ffffffff810c518c __alloc_pages_nodemask ([kernel.kallsyms])
        ffffffff810297b2 pte_alloc_one ([kernel.kallsyms])
        ffffffff810d8b98 __pte_alloc ([kernel.kallsyms])
        ffffffff810daf07 handle_mm_fault ([kernel.kallsyms])
        ffffffff8138763a do_page_fault ([kernel.kallsyms])
        ffffffff81384a65 page_fault ([kernel.kallsyms])
            7f6130507d70 asn1_check_tlen (/lib64/libcrypto.so.1.0.0c)
                       0  ()

         openssl 11496 [000]  9711.808042: sched_switch: prev_comm=openssl ...
     kworker/0:0     4 [000]  9711.808067: sched_switch: prev_comm=kworker/...
         swapper     0 [001]  9711.808090: sched_switch: prev_comm=kworker/...
            sshd 11451 [001]  9711.808185: sched_switch: prev_comm=sshd pre...
swapper     0 [001]  9711.816155: cpu-clock-msecs:
        ffffffff81023609 native_safe_halt ([kernel.kallsyms])
        ffffffff8100132a cpu_idle ([kernel.kallsyms])
        ffffffff8137cf9b start_secondary ([kernel.kallsyms])

openssl 11496 [000]  9711.817104: cpu-clock-msecs:
            7f61304ad723 AES_cbc_encrypt (/lib64/libcrypto.so.1.0.0c)
            7fff3402f950  ()
        12f0debc9a785634  ()

swapper     0 [001]  9711.826155: cpu-clock-msecs:
        ffffffff81023609 native_safe_halt ([kernel.kallsyms])
        ffffffff8100132a cpu_idle ([kernel.kallsyms])
        ffffffff8137cf9b start_secondary ([kernel.kallsyms])

To suppress trace events within the file and use default output for S/W events:
perf script -f trace:

or to suppress S/W events and do default display for trace events:
perf script -f sw:

Custom field selections:
perf script -f sw:comm,tid,time -f trace:time,trace

         openssl 11496  9711.797162:
         swapper     0  9711.807071:
         openssl 11496  9711.807107:
 9711.808042: prev_comm=openssl prev_pid=11496 prev_prio=120 prev_state=R ...
 9711.808067: prev_comm=kworker/0:0 prev_pid=4 prev_prio=120 prev_state=S ...
 9711.808090: prev_comm=kworker/0:0 prev_pid=0 prev_prio=120 prev_state=R ...
 9711.808185: prev_comm=sshd prev_pid=11451 prev_prio=120 prev_state=S ==>...
         swapper     0  9711.816155:
         openssl 11496  9711.817104:
         swapper     0  9711.826155:

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-7-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:07:20 -03:00
David Ahern
c0230b2bfb perf script: Add support for dumping symbols
Add option to dump symbols found in events.

e.g., perf script -f comm,pid,tid,time,trace,sym

swapper     0/0       537.037184: prev_comm=swapper prev_pid=0 prev_prio=120...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff8100134a cpu_idle ([kernel.kallsyms])
        ffffffff81370b39 rest_init ([kernel.kallsyms])
        ffffffff81696c23 start_kernel ([kernel.kallsyms].init.text)
        ffffffff816962af x86_64_start_reservations ([kernel.kallsyms].init.text)
        ffffffff816963b9 x86_64_start_kernel ([kernel.kallsyms].init.text)

sshd  1675/1675    537.037309: prev_comm=sshd prev_pid=1675 prev_prio=120...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff813837aa schedule_hrtimeout_range_clock ([kernel.kallsyms])
        ffffffff81383886 schedule_hrtimeout_range ([kernel.kallsyms])
        ffffffff8110c4f9 poll_schedule_timeout ([kernel.kallsyms])
        ffffffff8110cd20 do_select ([kernel.kallsyms])
        ffffffff8110ced8 core_sys_select ([kernel.kallsyms])
        ffffffff8110d00d sys_select ([kernel.kallsyms])
        ffffffff81002bc2 system_call ([kernel.kallsyms])
            7f1647e56e93 __GI_select (/lib64/libc-2.12.90.so)

netstat  1692/1692    537.038664: prev_comm=netstat prev_pid=1692 prev_prio=...
        ffffffff81030350 perf_trace_sched_switch ([kernel.kallsyms])
        ffffffff81382ac5 schedule ([kernel.kallsyms])
        ffffffff81002c3a sysret_careful ([kernel.kallsyms])
            7f7a6cd1b210 __GI___libc_read (/lib64/libc-2.12.90.so)

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-6-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:06:50 -03:00
David Ahern
c70c94b474 perf script: Move printing of 'common' data from print_event and rename
This change does impact output: latency data is trace specific and is
now printed after the common data - comm, tid, cpu, time and event name.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-4-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:05:55 -03:00
David Ahern
2ee7a49f93 perf tracing: Remove print_graph_cpu and print_graph_proc from trace-event-parse
Next patch moves printing of 'common' data into perf-script which
removes the need for these functions.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-3-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:05:51 -03:00
David Ahern
be6d842a65 perf script: Change process_event prototype
Prepare for handling of samples for any event type.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1299734608-5223-2-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-14 17:05:50 -03:00
Arnaldo Carvalho de Melo
171b3be9c4 perf symbol: Move sym_entry->skip to symbol->ignore
While going thru each of the sym_entry fields looking to reduce it to
the set of entries needed when in an active symbols list, 'skip' should
really be in symbol, as we set it when loading the symtab.

And the space used by the basic symbol allocation remains the same as
we had 5 bytes of padding.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-11 13:36:01 -03:00
Arnaldo Carvalho de Melo
878b439dcc perf symbols: Rename dso->origin to dso->symtab_type
And the DSO__ORIG_ enum to SYMTAB__, to clarify that this is about from
where the symtab was obtained.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-11 13:28:45 -03:00
Arnaldo Carvalho de Melo
8b8ba4a9a5 perf top: Remove redundant syme->origin field
We can get it from syme->map->dso->kernel (that should be renamed to
origin, but leave this for another patch).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-11 13:28:45 -03:00
Arnaldo Carvalho de Melo
ec52d9765a perf top: Remove redundant perf_top->sym_counter
We can get that counter index from perf_top->sym_evsel->idx instead.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-11 13:28:45 -03:00
Arnaldo Carvalho de Melo
1c0b04d10b perf header: Stop using 'self'
Stop using this python/OOP convention, doesn't really helps. Will do
more from time to time till we get it cleaned up in all of tools/perf.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-10 11:16:28 -03:00
Arnaldo Carvalho de Melo
a91e5431d5 perf session: Use evlist/evsel for managing perf.data attributes
So that we can reuse things like the id to attr lookup routine
(perf_evlist__id2evsel) that uses a hash table instead of the linear
lookup done in the older perf_header_attr routines, etc.

Also to make evsels/evlist more pervasive an API, simplyfing using the
emerging perf lib.

cc: Arun Sharma <arun@sharma-home.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-10 11:15:54 -03:00
Jiri Olsa
6547250381 perf top: Don't let events to eat up whole header line
Passing multiple events might force out information about pid/tid/cpu.
Attached patch leaves 30 characters for this info at the expense of the
events' names.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Han Pingtian <phan@redhat.com>
LKML-Reference: <1299528821-17521-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-10 10:55:00 -03:00
Jiri Olsa
b9a46bba88 perf top: Fix events overflow in top command
The snprintf function returns number of printed characters even if it
cross the size parameter. So passing enough events via '-e' parameter
will cause segmentation fault.

It's reproduced by following command:

perf top -e `perf list | grep Tracepoint | awk -F'[' '\
{gsub(/[[:space:]]+/,"",$1);array[FNR]=$1}END{outputs=array[1];\
for (i=2;i<=FNR;i++){ outputs=outputs "," array[i];};print outputs}'`

Attached patch is adding SNPRINTF macro that provides the overflow check
and returns actuall number of printed characters.

Reported-by: Han Pingtian <phan@redhat.com>
Cc: Han Pingtian <phan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299528821-17521-2-git-send-email-jolsa@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-10 10:54:13 -03:00
Lin Ming
7ee235efe5 perf symbols: Avoid resolving [kernel.kallsyms] to real path for buildid cache
kallsyms has a virtual file name [kernel.kallsyms].  Currently, it can't
be added to buildid cache successfully because the code
(build_id_cache__add_s) tries to resolve [kernel.kallsyms] to a real
absolute pathname and that fails.

Fixes it by not resolving it and just use the name [kernel.kallsyms].
So dir ~/.debug/[kernel.kallsyms] is created.

Original bug report at:
https://lkml.org/lkml/2011/3/1/524

Tested-by: Han Pingtian <phan@redhat.com>
Cc: Han Pingtian <phan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299165837-27817-1-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-09 13:44:10 -03:00
Arnaldo Carvalho de Melo
7f0030b211 perf report tui: Improve multi event session support
When multiple events were used in 'perf record', allow the user to
choose which one is wanted before showing the per event histograms.

Annotations will be performed on the chosen event.

Allow going back and forth from event to event quickly using just the
arrow keys and enter.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: William Cohen <wcohen@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06 13:14:53 -03:00
Arnaldo Carvalho de Melo
e248de331a perf tools: Improve support for sessions with multiple events
By creating an perf_evlist out of the attributes in the perf.data file
header, so that we can use evlists and evsels when reading recorded
sessions in addition to when we record sessions.

More work is needed to allow tools to allow the user to select which
events are wanted when browsing sessions, be it just one or a subset of
them, aggregated or showed at the same time but with different
indications on the UI to allow seeing workloads thru different views at
the same time.

But the overall goal/trend is to more uniformly use evsels and evlists.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06 13:13:40 -03:00
Arnaldo Carvalho de Melo
3d3b5e9599 perf evlist: Split perf_evlist__id_hash
The previous situation was to receive an fd from where to read the event
ID.

Spin off a routine for when we have the ID handy, not having to read it
from some fd.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06 13:10:45 -03:00
Arnaldo Carvalho de Melo
60098917c0 perf hists browser: Handle browsing empty hists tree
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-06 13:10:05 -03:00
Arnaldo Carvalho de Melo
d7603d5122 perf hists: Remove needless global col lenght calcs
To support multiple events we need to do these calcs per 'struct hists'
instance, and it turns out we already do that at:

	__hists__add_entry
		hists__inc_nr_entries
			hists__calc_col_len

for all the unfiltered hist_entry instances we stash in the rb tree, so
trow away the dead code.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-05 22:31:04 -03:00
Arnaldo Carvalho de Melo
a03f35ceeb perf report tui: Fix multi event switching
TAB/UNTAB were not hotkeys, so didn't exit hists__browse back to
hists__tui_browse_tree, allowing just the first event to be browsed.

Reported-by: William Cohen <wcohen@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: William Cohen <wcohen@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-05 22:31:04 -03:00
Ingo Molnar
888a8a3e9d Merge branch 'perf/urgent' into perf/core
Merge reason: Pick up updates before queueing up dependent patches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 10:40:25 +01:00
Frederic Weisbecker
cfff2d909c perf: Fix undefined PyVarObject_HEAD_INIT in python 2.5
PyVarObject_HEAD_INIT is undefined in python 2.5, resulting
in a build crash:

	util/python.c:81: attention : déclaration implicite de la fonction « «PyVarObject_HEAD_INIT» »
	util/python.c:82: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:117: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:146: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:177: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:290: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:359: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:532: erreur: request for member «tp_name» in something not a structure or union
	util/python.c:761: erreur: request for member «tp_name» in something not a structure or union
	error: command 'gcc' failed with exit status 1
	make: *** [python/perf.so] Erreur 1

We can fix that by defining PyVarObject_HEAD_INIT as a wrapper on
PyObject_HEAD_INIT, thanks to a trick found on biopython:
d4eaf57946

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-04 01:17:27 +01:00
Frederic Weisbecker
ff9ae1babd perf: Fix missing strndup declaration
<ctype.h> is included first without _GNU_SOURCE, so it ends up
including <string.h> without declaring strndup(). And further
<string.h> declarations, even with _GNU_SOURCE defined, are
of course without effect.

Therefore:

	util/strfilter.c: Dans la fonction «strfilter_node__new» :
	util/strfilter.c:134: attention : déclaration implicite de la fonction « «strndup» »
	util/strfilter.c:134: attention : incompatible implicit declaration of built-in function «strndup»
	make: *** [util/strfilter.o] Erreur 1

Just don't include ctype.h as it doesn't appear to be necessary
anyway.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-04 01:17:18 +01:00
Frederic Weisbecker
0a10247914 perf: Set filters before mmaping events
We currently set the filters after we mmap the events, this is a
race that let undesired events record themselves in the buffer before
we had the time to set the filters.

So set the filters before they can be recorded. That also librarizes
the filters setting so that filtering can be done more easily
from other tools than perf record later.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
2011-03-02 16:05:51 +01:00
Arnaldo Carvalho de Melo
5807806a92 perf top tui: Wait till the first sample to refresh the screen.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-01 10:43:03 -03:00
Arnaldo Carvalho de Melo
a1ceb741cf perf tui: Make ui__warning modal
By taking the ui__lock so that no other screen updates take place while
waiting for the user.

That was happening when handling an invalid --vmlinux parameter in 'perf
top --tui', with the screen refresh routine repainting the screen and
removing the warning window.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-01 10:24:43 -03:00
Arnaldo Carvalho de Melo
3166fc8fb6 perf top browser: Handle empty active symbols list
Fixing a SEGV. An empty list could happen when not being able to resolve
symbols, for instance when --vmlinux invalid-file is used.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-03-01 10:21:44 -03:00
Arnaldo Carvalho de Melo
a639dc64e5 perf symbols: Fix vmlinux path when not using --symfs
The ec5761e cset introduced the symfs feature with a bug for loading vmlinux
files that ended up causing this failure:

[root@emilia v2.6.38-rc5+]# strace -e trace=open perf top --vmlinux ./vmlinux 2>&1 | tail -3
open("/./vmlinux", O_RDONLY)            = -1 ENOENT (No such file or directory)
./vmlinux with build id b9266bf40e98dadb5d43a2f3e95d3c5d4aff46dc not found, continuing without symbols
The ./vmlinux file can't be used
[root@emilia v2.6.38-rc5+]#

Remove the extra slash, just like is done in the DSO__ORIG_DSO handling in
dso__load() and other parts of the ec5761e cset.

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-28 13:54:38 -03:00
Thomas Renninger
e853072055 perf timechart: Fix black idle boxes in the title
This fix is needed for eye of gnome and firefox svg viewers.
Only Inkscape can handle the broken case.

Compare with the other svg_legenda_box declarations, looks
like a typo slipped in at this place.

Signed-off-by: Thomas Renninger <trenn@suse.de>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: lenb@kernel.org
LKML-Reference: <1298842606-55712-5-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 08:56:14 +01:00
Arnaldo Carvalho de Melo
b210b3bb1b perf ui browser: Introduce ui_browser__show_title
Needed because we were only showing the title in ui_browser__show,
not in ui_browser__run, and in the run loop we may be calling other
browsers that would then change the title, when we go back to the
previous browser, we need to redraw the title.

We could have done this as the Newt help line, with pop, etc, but I
don't think its worth, doing it explicitely, when needed (some browsers
may not use the title area at all) seems enough/more flexible.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-25 11:33:31 -03:00
Arnaldo Carvalho de Melo
c16bfe9ac3 perf top browser: Fix up exit keys
The left key was exiting 'perf top --tui' when it really shouldn't, it
was too easy to leave the live annotation window and then press one too
many <- and get out of the tool altogether.

Do just like the report TUI does, ignore the left key for exit and also
ask the user when pressing ESC if that is really what is wanted.

Reported-by: Mike Galbraith <efault@gmx.de>
Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-25 09:30:29 -03:00
Arnaldo Carvalho de Melo
69cf0218d1 perf hists: Print number of samples, not the period sum
So that we match the header where we state the number of events with the
"Samples" column when using 'perf report -n/--show-nr-samples':

 [root@emilia ~]# perf record -a sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ]
 [root@emilia ~]# perf report --stdio --show-nr-samples
 # Events: 11  cycles
 #
 # Overhead  Samples        Command       Shared Object                        Symbol
 # ........ ..........  ...........  ..................  ............................
 #
     16.65%          1        sleep  [kernel.kallsyms]   [k] unmap_vmas
     16.10%          1         perf  libpthread-2.12.so  [.] __pthread_cleanup_push_defer
     15.79%          2         perf  [kernel.kallsyms]   [k] format_decode
     12.88%          1  kworker/1:2  [kernel.kallsyms]   [k] cache_reap
     10.69%          1      swapper  [kernel.kallsyms]   [k] _raw_spin_lock
      7.55%          1        sleep  [kernel.kallsyms]   [k] prepare_exec_creds
      6.00%          1         perf  [jbd2]              [k] start_this_handle
      5.29%          1         perf  [kernel.kallsyms]   [k] seq_read
      4.75%          1         perf  [kernel.kallsyms]   [k] get_pid_task
      4.30%          1         perf  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [root@emilia ~]#

Reported-by: Stephane Eranian <eranian@google.com>
Reported-by: Cliff Wickman <cpw@sgi.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: <stable@kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
[ cherry-picked it from perf/core, as it has been reported by others as well. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-25 10:55:03 +01:00
Arnaldo Carvalho de Melo
170ae6bc24 perf annotate: Show better message when no vmlinux is found
In both --tui and --stdio, in 'annotate', 'top', 'report' when trying to
annotate a kernel symbol having just access to a kallsyms file, that
doesn't have the DWARF info needed for annotation.

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-23 12:13:39 -03:00
Arnaldo Carvalho de Melo
6435a5e39d perf top browser: Adjust the browser indexes when refreshing
This is not a problem when we're not at the bottom of the active symbols
list, so was not noticed, but at the end of the screen it falls apart.

Fix it by adjusting the ui_browser indexes according to the new number
of entries in the rb_tree and by seeking from the start of the rb_tree
to find the new symbol at the top of the screen.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-23 07:25:02 -03:00
Arnaldo Carvalho de Melo
c97cf42219 perf top: Live TUI Annotation
Now one has just to press the right key, 'a' or Enter on the main 'perf
top --tui' screen to live annotate the symbol under the cursor.

The annotate window starts centered on the hottest line (the one with
most samples so far) then TAB and shift+TAB can be used to go to the
prev/next hot line.

Pressing 'H' at any point will center again the screen on the hottest
line.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-22 12:02:07 -03:00
Arnaldo Carvalho de Melo
8635bf6ea3 perf probe: Remove redundant checks
While fixing an error propagating problem in f809b25 I added two
redundant checks.

I did that because I didn't expect the checks to be on the while and for
loop condition expression, where they are tested before we run the loop,
where the 'ret' variable is set.

So remove it from there and leave it just after it is actually set,
eliminating unneded tests.

Reported-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-22 07:02:07 -03:00
Arnaldo Carvalho de Melo
e603dc1507 perf evsel: Fix inverted test for fixing up attr.inherit flag
The kernel refuses mmapping an event with the inherit flag set for
something that is systemwide (cpu == -1), and the evsel layer got this
reversed at some point, fix it.

The symtom was that the --pid and --tid parameters for 'perf record' and
'perf top' returned with -EINVAL, like:

 # /tmp/build-perf/perf record -v -fo/tmp/perf.data -p 1042
   Warning:  ... trying to fall back to cpu-clock-ticks

   Fatal: failed to mmap with 22 (Invalid argument)

Reported-by: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-21 22:27:59 -03:00
Arnaldo Carvalho de Melo
fbee632d0c perf probe: Fix error propagation leading to segfault
There are two hunks in this patch that stops probe processing as soon as one
error is found, breaking out of loops, the other fix an error propagation that
should return a negative error number but instead was returning the result of
"ret < 0", which is 1 and thus made several error checks fail because they test
agains < 0.

The problem could be triggered by asking for a variable that was optimized out,
fact that should stop the whole probe processing but instead was segfaulting
while installing broken probes:

[root@emilia ~]# probe perf_mmap:55 user_lock_limit
Failed to find the location of user_lock_limit at this address.
 Perhaps, it has been optimized out.
Failed to find 'user_lock_limit' in this function.
Add new events:
  probe:perf_mmap      (on perf_mmap:55 with user_lock_limit)
  probe:perf_mmap_1    (on perf_mmap:55 with user_lock_limit)
Segmentation fault (core dumped)
[root@emilia ~]# perf probe -l
  probe:perf_mmap      (on perf_mmap:55@git/linux/kernel/perf_event.c with user_lock_limit)
  probe:perf_mmap_1    (on perf_mmap:55@git/linux/kernel/perf_event.c with user_lock_limit)
[root@emilia ~]#

After the fix:

[root@emilia ~]# probe perf_mmap:55 user_lock_limit
Failed to find the location of user_lock_limit at this address.
 Perhaps, it has been optimized out.
Failed to find 'user_lock_limit' in this function.
  Error: Failed to add events. (-2)
[root@emilia ~]#

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-21 22:21:27 -03:00
Michael Witten
a3d1ee10d1 perf tools: Makefile: Remove various and sundry cruft
This commit squashes several commits that remove:

 unnecessary uname calls
 `sh -c'
 BUILT_INS and QUIET_BUILT_IN

    They have no effect, and the `fixup-builtins' and `check-builtins.sh'
    scripts don't even exist.

 RUNTIME_PREFIX

    It's currently never anything but unset, and it's apparently
    only meaningful when Microsoft Windows is the operating system
    (according to the source for git).

 TEST_PROGRAMS
 EXTRA_PROGRAMS
 unused SHELL_PATH_SQ portions
 unused test for V=2
 useless exports

    Only when `V' is undefined (that is, only when the value of `V'
    is empty) is `export V' performed, which just has the effect of
    placing the empty-valued variable `V' in the environment.

    The only other script to make use of `V' is `Documentation/Makefile',
    which only checks whether `V' is undefined (that is, whether the value
    of `V' is empty); hence, the `export V' has no effect whatsoever.

    Similarly, `export QUIET_GEN' is useless because it will only have
    a non-empty value when `V' has an empty-value, and when `V' has
    an empty-value, `QUIET_GEN' is always explicitly set in every
    script in which it is used.

    `DESTDIR' is only ever defined by the user via the environment
    or the command line, both of which are automatically exported
    to sub-make processes. Furthermore, no non-make sub-scripts
    make use of `DESTDIR' as an environment variable.

    No other scripts use `perfexec_instdir'.

 unused QUIET_SUBDIR{0,1}
 TAR and RPMBUILD
 PTHREAD_LIBS
 Maintainer's dist rules and commands
 distclean target
 Test suite coverage testing
 PRINT_DIR and NO_SUBDIR
 `configure' target
 NO_CURL
 @@PERF_VERSION@@ substitution

    Without the sed command, all of the rule's commands can be reduced
    to a single line that copies a file and sets the permissions properly
    in the process.

 `make test' echo line
 template_instdir
 PERF-BUILD-OPTIONS
 double-colon rules

    The use of double-colon rules seems misguided or vestigial git.

 Essentially hard-coded $(SCRIPTS) expansion

Signed-off-by: Michael Witten <mfwitten@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-18 07:43:06 -02:00
Michael Witten
8796cb9d7d perf tools: Makefile: Remove platform-specific cruft
While it makes sense that this tool could be used on
other platforms at least to parse data, there doesn't
appear to be any real support for such usage.

This commit squashes several commits that remove:

 SNPRINTF_RETURNS_BOGUS
 FREAD_READS_DIRECTORIES
 NO_D_{INO,TYPE}_IN_DIRENT
 NO_STRCASESTR
 NO_MEMMEM
 NO_STRTOUMAX and NO_STRTOULL
 NO_SETENV
 NO_UNSETENV
 NO_MKDTEMP
 NEEDS_LIBICONV
 NEEDS_SOCKET
 NO_MMAP
 NO_PTHREADS
 NO_PREAD
 NO_TRUSTABLE_FILEMODE
 NO_IPV6 and NO_SOCKADDR_STORAGE
 NO_ICONV and OLD_ICONV
 NO_NSEC, USE_NSEC, and USE_ST_TIMESPEC
 NO_ST_BLOCKS_IN_STRUCT_STAT
 NO_FINK and NO_DARWIN_PORTS
 NO_SYS_SELECT_H
 NO_HSTRERROR
 DIR_HAS_BSD_GROUP_SEMANTICS and FORCE_DIR_SET_GID
 NEEDS_NSL, NO_UINTMAX_T, NO_INET_{N,P}TON
 COMPAT_{CFLAGS,OBJS}
 Executable extension `X'

Signed-off-by: Michael Witten <mfwitten@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-18 07:42:07 -02:00
Arnaldo Carvalho de Melo
668b8788f4 perf list: Allow filtering list of events
The man page has the details, here are some examples:

[root@emilia ~]# perf list *fault*  *:*wait*

List of pre-defined events (to be used in -e):
  page-faults OR faults                      [Software event]
  minor-faults                               [Software event]
  major-faults                               [Software event]
  alignment-faults                           [Software event]
  emulation-faults                           [Software event]

  radeon:radeon_fence_wait_begin             [Tracepoint event]
  radeon:radeon_fence_wait_end               [Tracepoint event]
  writeback:wbc_writeback_wait               [Tracepoint event]
  writeback:wbc_balance_dirty_wait           [Tracepoint event]
  writeback:writeback_congestion_wait        [Tracepoint event]
  writeback:writeback_wait_iff_congested     [Tracepoint event]
  sched:sched_wait_task                      [Tracepoint event]
  sched:sched_process_wait                   [Tracepoint event]
  sched:sched_stat_wait                      [Tracepoint event]
  sched:sched_stat_iowait                    [Tracepoint event]
  syscalls:sys_enter_epoll_wait              [Tracepoint event]
  syscalls:sys_exit_epoll_wait               [Tracepoint event]
  syscalls:sys_enter_epoll_pwait             [Tracepoint event]
  syscalls:sys_exit_epoll_pwait              [Tracepoint event]
  syscalls:sys_enter_rt_sigtimedwait         [Tracepoint event]
  syscalls:sys_exit_rt_sigtimedwait          [Tracepoint event]
  syscalls:sys_enter_waitid                  [Tracepoint event]
  syscalls:sys_exit_waitid                   [Tracepoint event]
  syscalls:sys_enter_wait4                   [Tracepoint event]
  syscalls:sys_exit_wait4                    [Tracepoint event]
  syscalls:sys_enter_waitpid                 [Tracepoint event]
  syscalls:sys_exit_waitpid                  [Tracepoint event]
[root@emilia ~]#

Suggested-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17 15:38:58 -02:00
Arnaldo Carvalho de Melo
fec9cbd15b perf hists: Print number of samples, not the period sum
So that we match the header where we state the number of events with the
"Samples" column when using 'perf report -n/--show-nr-samples':

 [root@emilia ~]# perf record -a sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.111 MB perf.data (~4860 samples) ]
 [root@emilia ~]# perf report --stdio --show-nr-samples
 # Events: 11  cycles
 #
 # Overhead  Samples        Command       Shared Object                        Symbol
 # ........ ..........  ...........  ..................  ............................
 #
     16.65%          1        sleep  [kernel.kallsyms]   [k] unmap_vmas
     16.10%          1         perf  libpthread-2.12.so  [.] __pthread_cleanup_push_defer
     15.79%          2         perf  [kernel.kallsyms]   [k] format_decode
     12.88%          1  kworker/1:2  [kernel.kallsyms]   [k] cache_reap
     10.69%          1      swapper  [kernel.kallsyms]   [k] _raw_spin_lock
      7.55%          1        sleep  [kernel.kallsyms]   [k] prepare_exec_creds
      6.00%          1         perf  [jbd2]              [k] start_this_handle
      5.29%          1         perf  [kernel.kallsyms]   [k] seq_read
      4.75%          1         perf  [kernel.kallsyms]   [k] get_pid_task
      4.30%          1         perf  [kernel.kallsyms]   [k] _raw_spin_unlock_irqrestore

 #
 # (For a higher level overview, try: perf report --sort comm,dso)
 #
 [root@emilia ~]#

Reported-by: Stephane Eranian <eranian@google.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17 13:56:20 -02:00
Stephane Eranian
f0c55bcf4a perf: make perf stat print user provided full event names
This patch changes the way perf stat prints event names at the end of a
run. Until now, it was trying to reconstruct the event name from its
encoding. The problem is that it would only print generic events without
their modifiers (u, k, pp).

This patch saves the event name as passed by the user in the evsel
struct and uses it to print the final event name.

This would also work in case perf is linked with a library (such as
libpfm4) which provides full PMU event tables.

$ perf stat -e cycles:u,cycles:k date
Wed Feb 16 14:58:52 CET 2011

 Performance counter stats for 'date':

            568600 cycles:u
           2779715 cycles:k

        0.001908182  seconds time elapsed

Cc: Arun Sharma <arun@sharma-home.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@gmail.com>
LPU-Reference: <4d5bdc64.98a1df0a.7aa3.06c2@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
[ committer note: Fixed a merge problem with 023695d "Add cgroup support" ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17 10:29:19 -02:00
Arnaldo Carvalho de Melo
4498062e72 perf python: Add cgroup.c to setup.py to get it building again
The 023695d cset added a new file, util/cgroup.c, that is referenced from
util/evsel.c, so it needs to be present in util/setup.py so that the python
shared object binding works, fixing this:

[root@emilia linux]# export PYTHONPATH=~acme/git/build/perf/python/
[root@emilia linux]# ./tools/perf/python/twatch.py
Traceback (most recent call last):
  File "./tools/perf/python/twatch.py", line 16, in <module>
    import perf
ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: close_cgroup

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-17 10:07:42 -02:00
Masami Hiramatsu
8737ebdea0 perf probe: Show filename which contains target function
Show filename which contains a target function with the function name on
"--lines" mode, because perf-probe just shows the first function even if
there are many same-name functions.

Originally adopted by Franck Bui-Huu's patch which shows file name
instead of function name. I've just modified it to show both of function
name and file name, because of completeness of output.

 E.g.)
 $ perf probe -L t_show
 <t_show@/home/mhiramat/ksrc/linux-2.6-tip/kernel/trace/ftrace.c:0>
      0  static int t_show(struct seq_file *m, void *v)
      1  {
      2         struct ftrace_iterator *iter = m->private;
 ...

 $ perf probe -L t_show@trace/trace.c
 <t_show@/home/mhiramat/ksrc/linux-2.6-tip/kernel/trace/trace.c:0>
      0  static int t_show(struct seq_file *m, void *v)
      1  {
                struct tracer *t = v;
 ...

Original-patch-by: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <20110210090816.1809.43426.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-16 17:04:09 -02:00
Masami Hiramatsu
e116dfa1c3 perf probe: Support function@filename syntax for --line
Since "perf probe --add" supports function@filename syntax, --line
option should also support it.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: linux-kernel@vger.kernel.org
LKML-Reference: <20110210090810.1809.26913.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-16 17:03:23 -02:00
Arnaldo Carvalho de Melo
b99976e2d2 perf annotate browser: Use the percent color for the whole line
Not just for the percentage number, to see the hot lines more easily.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-16 14:47:55 -02:00
Arnaldo Carvalho de Melo
289c082044 perf annotate: Check if offset is less than symbol size
Just like done on symbol__inc_addr_samples to catch misparsed offsets
from objdump.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-16 14:47:55 -02:00
Arnaldo Carvalho de Melo
5c35d69fb6 perf ui: Serialize screen updates
The ui operations so far were used by just one thread, but 'perf top
--tui' now has two threads updating the screen, so we need to use a
mutex to avoid garbling the screen.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-16 14:47:55 -02:00
Stephane Eranian
023695d96e perf tool: Add cgroup support
This patch adds the ability to filter monitoring based on container groups
(cgroups) for both perf stat and perf record. It is possible to monitor
multiple cgroup in parallel. There is one cgroup per event. The cgroups to
monitor are passed via a new -G option followed by a comma separated list of
cgroup names.

The cgroup filesystem has to be mounted. Given a cgroup name, the perf tool
finds the corresponding directory in the cgroup filesystem and opens it. It
then passes that file descriptor to the kernel.

Example:

$ perf stat -B -a -e cycles:u,cycles:u,cycles:u -G test1,,test2 -- sleep 1
 Performance counter stats for 'sleep 1':

      2,368,667,414  cycles                   test1
      2,369,661,459  cycles
      <not counted>  cycles                   test2

        1.001856890  seconds time elapsed

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d590290.825bdf0a.7d0a.4890@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-16 13:30:48 +01:00
Arnaldo Carvalho de Melo
7c940c18c5 Merge remote branch 'acme/perf/urgent' into perf/core
Fixups due to rename of event_t routines from event__ to perf_event__
done in perf/core.

Conflicts:
	tools/perf/builtin-record.c
	tools/perf/builtin-top.c
	tools/perf/util/event.c
	tools/perf/util/event.h

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-11 11:45:54 -02:00
Arnaldo Carvalho de Melo
401b8e1317 perf tools: Fix thread_map event synthesizing in top and record
Jeff Moyer reported these messages:

  Warning:  ... trying to fall back to cpu-clock-ticks

couldn't open /proc/-1/status
couldn't open /proc/-1/maps
[ls output]
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.008 MB perf.data (~363 samples) ]

That lead me and David Ahern to see that something was fishy on the thread
synthesizing routines, at least for the case where the workload is started
from 'perf record', as -1 is the default for target_tid in 'perf record --tid'
parameter, so somehow we were trying to synthesize the PERF_RECORD_MMAP and
PERF_RECORD_COMM events for the thread -1, a bug.

So I investigated this and noticed that when we introduced support for
recording a process and its threads using --pid some bugs were introduced and
that the way to fix it was to instead of passing the target_tid to the event
synthesizing routines we should better pass the thread_map that has the list of
threads for a --pid or just the single thread for a --tid.

Checked in the following ways:

On a 8-way machine run cyclictest:

[root@emilia ~]# perf record cyclictest -a -t -n -p99 -i100 -d50
policy: fifo: loadavg: 0.00 0.13 0.31 2/139 28798

T: 0 (28791) P:99 I:100 C:  25072 Min:      4 Act:    5 Avg:    6 Max:     122
T: 1 (28792) P:98 I:150 C:  16715 Min:      4 Act:    6 Avg:    5 Max:      27
T: 2 (28793) P:97 I:200 C:  12534 Min:      4 Act:    5 Avg:    4 Max:       8
T: 3 (28794) P:96 I:250 C:  10028 Min:      4 Act:    5 Avg:    5 Max:      96
T: 4 (28795) P:95 I:300 C:   8357 Min:      5 Act:    6 Avg:    5 Max:      12
T: 5 (28796) P:94 I:350 C:   7163 Min:      5 Act:    6 Avg:    5 Max:      12
T: 6 (28797) P:93 I:400 C:   6267 Min:      4 Act:    5 Avg:    5 Max:       9
T: 7 (28798) P:92 I:450 C:   5571 Min:      4 Act:    5 Avg:    5 Max:       9
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.108 MB perf.data (~4719 samples) ]

[root@emilia ~]#

This will create one extra thread per CPU:

[root@emilia ~]# tuna -t cyclictest -CP
                      thread       ctxt_switches
    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
 28825   OTHER     0     0xff      2169          671      cyclictest
  28832   FIFO    93        6     52338            1      cyclictest
  28833   FIFO    92        7     46524            1      cyclictest
  28826   FIFO    99        0    209360            1      cyclictest
  28827   FIFO    98        1    139577            1      cyclictest
  28828   FIFO    97        2    104686            0      cyclictest
  28829   FIFO    96        3     83751            1      cyclictest
  28830   FIFO    95        4     69794            1      cyclictest
  28831   FIFO    94        5     59825            1      cyclictest
[root@emilia ~]#

So we should expect only samples for the above 9 threads when using the
--dump-raw-trace|-D perf report switch to look at the column with the tid:

[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
    629 28825
    110 28826
    491 28827
    308 28828
    198 28829
    621 28830
    225 28831
    203 28832
     89 28833
[root@emilia ~]#

So for workloads started by 'perf record' seems to work, now for existing workloads,
just run cyclictest first, without 'perf record':

[root@emilia ~]# tuna -t cyclictest -CP
                      thread       ctxt_switches
    pid SCHED_ rtpri affinity voluntary nonvoluntary             cmd
 28859   OTHER     0     0xff       594          200      cyclictest
  28864   FIFO    95        4     16587            1      cyclictest
  28865   FIFO    94        5     14219            1      cyclictest
  28866   FIFO    93        6     12443            0      cyclictest
  28867   FIFO    92        7     11062            1      cyclictest
  28860   FIFO    99        0     49779            1      cyclictest
  28861   FIFO    98        1     33190            1      cyclictest
  28862   FIFO    97        2     24895            1      cyclictest
  28863   FIFO    96        3     19918            1      cyclictest
[root@emilia ~]#

and then later did:

[root@emilia ~]# perf record --pid 28859 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.027 MB perf.data (~1195 samples) ]
[root@emilia ~]#

To collect 3 seconds worth of samples for pid 28859 and its children:

[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
     15 28859
     33 28860
     19 28861
     13 28862
     13 28863
     10 28864
     11 28865
      9 28866
    255 28867
[root@emilia ~]#

Works, last thing is to check if looking at just one of those threads also works:

[root@emilia ~]# perf record --tid 28866 sleep 3
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.006 MB perf.data (~242 samples) ]
[root@emilia ~]# perf report -D | grep RECORD_SAMPLE | cut -d/ -f2 | cut -d: -f1 | sort | uniq -c
      3 28866
[root@emilia ~]#

Works too.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-10 12:52:47 -02:00
Arnaldo Carvalho de Melo
d5e3d74700 perf annotate: Fix annotate context lines regression
The live annotation done in 'perf top' needs to limit the context before
lines that aren't filtered out by the min percent filter, if we don't do
that, the screen in a tty often is not enough for showing what is
interesting: lines with hits and a few source code lines before it.

Reported-by: Mike Galbraith <efault@gmx.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-08 15:29:25 -02:00
Arnaldo Carvalho de Melo
ce6f4fab40 perf annotate: Move locking to struct annotation
Since we'll need it when implementing the live annotate TUI browser.

This also simplifies things a bit by having the list head for the source
code to be in the dynamicly allocated part of struct annotation, that
way we don't have to pass it around, it can be found from the struct
symbol that is passed everywhere.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-08 15:03:36 -02:00
Arnaldo Carvalho de Melo
e3087b80aa perf annotate: Fix --stdio rendering
The checks for not using a max_lines parameter were b0rked, problem
introduced in 3653246.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-08 15:01:39 -02:00
Masami Hiramatsu
124bb83cd7 perf probe: Add bitfield member support
Add bitfield member accessing support to probe arguments.

Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110204125211.9507.60265.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ committer note: Fixed up '%lu' use for return of BYTES_TO_BITS ('%zd') ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 12:48:48 -02:00
Borislav Petkov
a222179625 perf annotate: Fix build error
A small fix for when NO_NEWT_SUPPORT is defined.

Add a missing "struct" to the function prototype.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20110207143218.GA31197@kryptos.osrc.amd.com>
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 12:41:55 -02:00
Kyle McMartin
fb7d0b3cef perf tool: Fix gcc 4.6.0 issues
GCC 4.6.0 in Fedora rawhide turned up some compile errors in tools/perf
due to the -Werror=unused-but-set-variable flag.

I've gone through and annotated some of the assignments that had side
effects (ie: return value from a function) with the __used annotation,
and in some cases, just removed unused code.

In a few cases, we were assigning something useful, but not using it in
later parts of the function.

kyle@dreadnought:~/src% gcc --version
gcc (GCC) 4.6.0 20110122 (Red Hat 4.6.0-0.3)

Cc: Ingo Molnar <mingo@redhat.com>
LKML-Reference: <20110124161304.GK27353@bombadil.infradead.org>
Signed-off-by: Kyle McMartin <kyle@redhat.com>
[ committer note: Fixed up the annotation fixes, as that code moved recently ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 12:41:41 -02:00
Franck Bui-Huu
f50c2169bd perf probe: Rewrite find_lazy_match_lines() by using getline(3)
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: lkml <linux-kernel@vger.kernel.org>
LKML-Reference: <m3d3o185u1.fsf@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 09:12:42 -02:00
Denis Kirjanov
ef4d001d79 perf top: Use pid_t for target_{pid|tid}
Use pid_t data type for target_{pid|tid} vars.

Cc: Ingo Molnar <mingo@elte.hu>
LKML-Reference: <20110205203938.GA15328@hera.kernel.org>
Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org>
[ committer note: those variables are now in struct perf_top, fixed ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-07 09:10:15 -02:00
Ingo Molnar
075de90c46 Merge branch 'perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core 2011-02-07 08:45:48 +01:00
Ingo Molnar
c7f9a6f377 Merge branch 'linus' into perf/core
Merge reason: Pick up perf fixes that are now upstream

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-07 08:44:26 +01:00
Arnaldo Carvalho de Melo
36532461a0 perf top: Ditch private annotation code, share perf annotate's
Next step: Live TUI annotation in perf top, just press enter on a symbol
line.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-06 16:08:50 -02:00
Arnaldo Carvalho de Melo
f1e2701de0 perf annotate: Separate objdump parsing from actual screen rendering
Because in 'perf top' we'll need to parse just once and then, as samples
come, render multiple times with evolving counter values.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-06 13:40:31 -02:00
Arnaldo Carvalho de Melo
d040bd3638 perf annotate: Config options for symbol__tty_annotate
Max line# that should be printed, minimum percentage filter, just like
'perf top', alas, due to it :-)

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 15:37:31 -02:00
Arnaldo Carvalho de Melo
2f525d0148 perf annotate: Support multiple histograms in annotation
The perf annotate tool continues aggregating everything on just one
histograms, but to support the top model add support for one histogram
perf evsel in the evlist.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 12:28:48 -02:00
Arnaldo Carvalho de Melo
78f7defedb perf annotate: Move annotate functions to util/
They will be used by perf top, so that we have just one set of routines
to do annotation.

Rename "struct sym_priv" to "struct annotation", etc, to clarify this
code a bit.

Rename "struct sym_ext" to "struct source_line", to give it a meaningful
name, that clarifies that it is a the result of an addr2line call, that
is sorted by percentage one particular source code line appeared in the
annotation.

And since we're moving things around also rename 'sym_hist->ip' to
'sym_hist->addr' as we want to do data structure annotation at some
point.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 12:28:21 -02:00
Arnaldo Carvalho de Melo
764328d320 perf top: Remove superfluous name_len field
From the sym_entry struct, struct symbol already has this field.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-05 12:26:40 -02:00
Arnaldo Carvalho de Melo
52bcd9947b perf stat: Fix aggreate counter reading accounting
Introduced in: c52b12ed, when this sequence:

  count[0] = count[1] = count[2] = 0;

Was replaced with:

  aggr->val = 0;

Which is equivalent to zeroing just the first entry in the 'count'
array.

Fix it by zeroing the three entries with:

  aggr->val = aggr->ena = aggr->run = 0;

Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-03 17:26:06 -02:00
Yinghai Lu
cdb0861c85 perf top: Fix TUI compilation
> +	slsmg_write_nstring(width >= syme->map->dso->long_name_len ?
> +				syme->map->dso->long_name :
> +				syme->map->dso->short_name, width);

need update macro for that calling

util/ui/browsers/top.c: In function ‘perf_top_browser__write’:
util/ui/browsers/top.c:60:2: error: cast to pointer from integer of different size
util/ui/browsers/top.c:60:2: error: comparison between pointer and integer
util/ui/browsers/top.c:60:2: error: passing argument 1 of ‘SLsmg_write_nstring’ discards qualifiers from pointer target type
/usr/include/slang.h:1728:16: note: expected ‘char *’ but argument is of type ‘const char *’
make: *** [util/ui/browsers/top.o] Error 1

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4D48562B.20006@kernel.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-02-01 17:33:06 -02:00
Arnaldo Carvalho de Melo
f6bbc1daac perf python: Fix build on 32-bit
Where there are lots of errors related to python methods receiving
'char *' for things like file open mode, which break the build, also
disable strict aliasing and fixup some other warnings. Now builds on
both 32-bit and 64-bit fedora systems.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31 20:56:27 -02:00
Arnaldo Carvalho de Melo
c0443df1b6 perf top: Introduce slang based TUI
Disabled by default as there are features found in the stdio based one
that aren't implemented, like live annotation, filtering knobs data
entry.

Annotation hopefully will get somehow merged with the 'perf annotate'
code.

To use it:

perf top --tui

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31 18:19:33 -02:00
Arnaldo Carvalho de Melo
229ade9ba3 perf tools: Don't fallback to setup_pager unconditionally
Because in tools like 'top' we don't want the pager.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31 18:08:39 -02:00
Arnaldo Carvalho de Melo
8c3e10eb19 perf top: Move display agnostic routines to util/top.[ch]
Paving the way for a slang browser a la 'perf report --tui'.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31 14:50:39 -02:00
Arnaldo Carvalho de Melo
7e2ed09753 perf evlist: Store pointer to the cpu and thread maps
So that we don't have to pass it around to the several methods that
needs it, simplifying usage.

There is one case where we don't have the thread/cpu map in advance,
which is in the parsing routines used by top, stat, record, that we have
to wait till all options are parsed to know if a cpu or thread list was
passed to then create those maps.

For that case consolidate the cpu and thread map creation via
perf_evlist__create_maps() out of the code in top and record, while also
providing a perf_evlist__set_maps() for cases where multiple evlists
share maps or for when maps that represent CPU sockets, for instance,
get crafted out of topology information or subsets of threads in a
particular application are to be monitored, providing more granularity
in specifying which cpus and threads to monitor.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-31 12:40:52 -02:00
Arnaldo Carvalho de Melo
f8a9530939 perf evlist: Move evlist methods to evlist.c
They were on evsel.c because they came from refactoring existing evsel
methods, so, to make reviewing the changes easier, I kept it there, now
its a plain move.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-30 11:41:13 -02:00
Arnaldo Carvalho de Melo
877108e42b perf tools: Initial python binding
First clarifying that this kind of binding is not a replacement or an
equivalent to the 'perf script' way of using python with perf.

The 'perf script' way is to process events and look at a given script
for some python function that matches the events to pass each event for
processing.

This is a python module, i.e. everything is driven from the python
script, that merely uses "import perf" or "from perf import".

perf script is focused on tracepoints, this binding is focused on profiling as
an initial target. More work is needed to make available tracepoint specific
variables as event variables accessible via this binding.

There is one example of such usage model, in
tools/perf/python/twatch.py, a tool to watch "cycles" events together
with task (fork, exit) and comm perf events.

For now, due to me not being able to grok how python distutils cope with
building C extensions outside the sources dir the install target just
builds it, I'm using it as:

[root@emilia linux]# export PYTHONPATH=~acme/git/build/perf/lib.linux-x86_64-2.6/
[root@emilia linux]# tools/perf/python/twatch.py
cpu:  4, pid: 30126, tid: 30126 { type: mmap, pid: 30126, tid: 30126, start: 0x4, length: 0x82e9ca03, offset: 0, filename:  }
cpu:  6, pid:   47, tid:   47 { type: mmap, pid: 47, tid: 47, start: 0x6, length: 0xbef87c36, offset: 0, filename:  }
cpu:  1, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x1, length: 0x775d1904, offset: 0, filename:  }
cpu:  7, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0x7, length: 0xc750aeb6, offset: 0, filename:  }
cpu:  5, pid: 2255, tid: 2255 { type: mmap, pid: 2255, tid: 2255, start: 0x5, length: 0x76669635, offset: 0, filename:  }
cpu:  0, pid:    0, tid:    0 { type: mmap, pid: 0, tid: 0, start: 0, length: 0x6422ef6b, offset: 0, filename:  }
cpu:  2, pid: 2255, tid: 2255 { type: mmap, pid: 2255, tid: 2255, start: 0x2, length: 0xe078757a, offset: 0, filename:  }
cpu:  1, pid: 5769, tid: 5769 { type: fork, pid: 30127, ppid: 5769, tid: 30127, ptid: 5769, time: 103893991270534}
cpu:  6, pid: 30127, tid: 30127 { type: comm, pid: 30127, tid: 30127, comm: ls }
cpu:  6, pid: 30127, tid: 30127 { type: exit, pid: 30127, ppid: 30127, tid: 30127, ptid: 30127, time: 103893993273024}

The first 8 mmap events in this 8 way machine are a mistery that is still being
investigated.

More of the tools/perf/util/ APIs will be exposed via this python binding as
the need arises. For now the focus is on creating events and processing them,
symbol resolution is an obvious next step, with tracepoint variables as a close
second step.

Cc: Clark Williams <williams@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-30 11:37:38 -02:00
Arnaldo Carvalho de Melo
8115d60c32 perf tools: Kill event_t typedef, use 'union perf_event' instead
And move the event_t methods to the perf_event__ too.

No code changes, just namespace consistency.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:25:37 -02:00
Arnaldo Carvalho de Melo
8d50e5b417 perf tools: Rename 'struct sample_data' to 'struct perf_sample'
Making the namespace more uniform.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:25:20 -02:00
Arnaldo Carvalho de Melo
7bb41152b9 perf evlist: Support non overwrite mode in perf_evlist__read_on_cpu
I.e. stash the overwrite mode in struct perf_evlist and act accordingly
in perf_evlist__read_on_cpu, not checking for overwrites and touching
the tail after consuming one event, like perf record does, for instance.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:24:40 -02:00
Arnaldo Carvalho de Melo
ef2bf6d043 perf events: Account PERF_RECORD_LOST events in event__process
Right now this function is only used by perf top, that uses PROT_READ
only, i.e. overwrite mode, so no PERF_RECORD_LOST events are generated,
but don't forget those events.

The patch that moved this out of perf top was made so that this routine
could be used by 'perf probe' in the uprobes patchset, so perhaps there
they need to check for LOST events and warn the user, as will be done in
the following patches that will switch 'perf top' to non overwrite mode
(mmap with PROT_READ|PROT_WRITE).

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-29 16:24:24 -02:00
Masami Hiramatsu
3c42258c9a perf probe: Add filters support for available functions
Add filters support for available function list.

Default filter is "!_*" for filtering out local-purpose symbols.

e.g.:
 # perf probe --filter="add*" -F
add_disk
add_disk_randomness
add_input_randomness
add_interrupt_randomness
add_memory
add_page_to_unevictable_list
add_page_wait_queue
...

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Chase Douglas <chase.douglas@canonical.com>
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110120141545.25915.85930.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-28 09:20:25 -02:00
Masami Hiramatsu
bd09d7b5ef perf probe: Add variable filter support
Add filters support for available variable list.

Default filter is "!__k???tab_*&!__crc_*" for filtering out
automatically generated symbols.

The format of filter rule is "[!]GLOBPATTERN", so you can use wild
cards. If the filter rule starts with '!', matched variables are filter
out.

e.g.:
 # perf probe -V schedule --externs --filter=cpu*
Available variables at schedule
        @<schedule+0>
                cpumask_var_t   cpu_callout_mask
                cpumask_var_t   cpu_core_map
                cpumask_var_t   cpu_isolated_map
                cpumask_var_t   cpu_sibling_map
                int     cpu_number
                long unsigned int*      cpu_bit_bitmap
		...

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Chase Douglas <chase.douglas@canonical.com>
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110120141539.25915.43401.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ committer note: Removed the elf.h include as it was fixed up in e80711c]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-28 09:20:01 -02:00
Masami Hiramatsu
68baa431ec perf tools: Add strfilter for general purpose string filter
Add strfilter for general purpose string filter.

Every filter rules are descrived by glob matching pattern and '!' prefix
which means Logical NOT.

A strfilter consists of those filter rules connected with '&' and '|'.

A set of rules can be folded by using '(' and ')'.

It also accepts spaces around rules and those operators.

Format:
<rule> ::= <glob-exp> | "!" <rule> | <rule> <op> <rule> | "(" <rule> ")"
<op> ::= "&" | "|"

e.g.:

 "(add* | del*) & *timer" filter rules pass strings which start with add
 or del and end with timer.

This will be used by perf probe --filter.

Changes in V2:
 - Fix to check result of strdup() and strfilter__alloc().
 - Encapsulate and simplify interfaces as like regex(3).

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110120141530.25915.12673.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-28 09:19:38 -02:00
Arnaldo Carvalho de Melo
ef1d1af28c perf evsel: Introduce perf_evsel__{in,ex}it
Out of the {con,des}structor, as in interpreted language bindings we will
need to go back from the wrapper object to the real thing. In that case
using container_of will save us to have an extra pointer in the perf_evsel
struct.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 13:18:05 -02:00
Arnaldo Carvalho de Melo
d0dd74e853 perf tools: Move event__parse_sample to evsel.c
To avoid linking more stuff in the python binding I'm working on, future
csets will make the sample type be taken from the evsel itself, but for
that we need to first have one file per cpu and per sample_type, not a
single perf.data file.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 13:17:56 -02:00
Arnaldo Carvalho de Melo
fd78260b53 perf threads: Move thread_map to separate file
To untangle it from struct thread handling, that is tied to symbols, etc.

Right now in the python bindings I'm working on I need just a subset of
the util/ files, untangling it allows me to do that.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 10:59:00 -02:00
Arnaldo Carvalho de Melo
17ea1b70a8 perf tools: Pass the struct opt to the wildcard parsing routine
It is needed because it will call parse_event for each tracepoint
name that matches, and we pass the perf_evlist via opt->value.

Problem introduced in 4503fdd where my assumption about opt being
always non NULL made me not look at callers of parse_events outside
builtin-*.c.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 10:58:39 -02:00
Masami Hiramatsu
e80711ca85 perf probe: Add --funcs to show available functions in symtab
Add --funcs to show available functions in symtab.

Originally this feature came from Srikar's uprobes patches
( http://lkml.org/lkml/2010/8/27/244 )

e.g.
...
__ablkcipher_walk_complete
__absent_pages_in_range
__account_scheduler_latency
__add_pages
__alloc_pages_nodemask
__alloc_percpu
__alloc_reserved_percpu
__alloc_skb
__alloc_workqueue_key
__any_online_cpu
__ata_ehi_push_desc
...

This also supports symbols in module, e.g.

...
cleanup_module
cpuid_maxphyaddr
emulate_clts
emulate_instruction
emulate_int_real
emulate_invlpg
emulator_get_dr
emulator_set_dr
emulator_task_switch
emulator_write_emulated
emulator_write_phys
fx_init
...

Original-patch-from: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110113124611.22426.10835.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ committer note: Add missing elf.h for STB_GLOBAL that broke a RHEL4 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 10:57:55 -02:00
Masami Hiramatsu
5069ed86be perf probe: Enable to put probe inline function call site
Enable to put probe inline function call site. This will increase line-based
probe-ability.

<Without this patch>
$ ./perf probe -L schedule:48
<schedule:48>
                pre_schedule(rq, prev);

     50         if (unlikely(!rq->nr_running))
                        idle_balance(cpu, rq);

                put_prev_task(rq, prev);
                next = pick_next_task(rq);

     56         if (likely(prev != next)) {
                        sched_info_switch(prev, next);
                        trace_sched_switch_out(prev, next);
                        perf_event_task_sched_out(prev, next);

<With this patch>
$ ./perf probe -L schedule:48
<schedule:48>
     48         pre_schedule(rq, prev);

     50         if (unlikely(!rq->nr_running))
     51                 idle_balance(cpu, rq);

     53         put_prev_task(rq, prev);
     54         next = pick_next_task(rq);

     56         if (likely(prev != next)) {
     57                 sched_info_switch(prev, next);
     58                 trace_sched_switch_out(prev, next);
     59                 perf_event_task_sched_out(prev, next);

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110113124604.22426.48873.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 10:57:03 -02:00
Masami Hiramatsu
4cc9cec636 perf probe: Introduce lines walker interface
Introduce die_walk_lines() for walking on the line list of given die, and use
it in line_range finder and probe point finder.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Franck Bui-Huu <fbuihuu@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20110113124558.22426.48170.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
[ committer note: s/%ld/%zd/ for a size_t nlines var that broke f14 x86 build]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-24 10:49:52 -02:00
Frederic Weisbecker
529363b769 perf callchain: Don't give arbitrary gender to callchain tree nodes
Some little callchain tree nodes shyly asked me if they can have
sisters.

How cute!

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-5-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Frederic Weisbecker
16537f1355 perf callchain: Rename register_callchain_param into callchain_register_param
To make the callchain API naming more consistent.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-4-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Frederic Weisbecker
f08c3154ac perf callchain: Rename cumul_hits into callchain_cumul_hits
That makes the callchain API naming more consistent and
reduce potential naming clashes.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-3-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Frederic Weisbecker
1b3a0e9592 perf callchain: Feed callchains into a cursor
The callchains are fed with an array of a fixed size.
As a result we iterate over each callchains three times:

- 1st to resolve symbols
- 2nd to filter out context boundaries
- 3rd for the insertion into the tree

This also involves some pairs of memory allocation/deallocation
everytime we insert a callchain, for the filtered out array of
addresses and for the array of symbols that comes along.

Instead, feed the callchains through a linked list with persistent
allocations. It brings several pros like:

- Merge the 1st and 2nd iterations in one. That was possible before
but in a way that would involve allocating an array slightly taller
than necessary because we don't know in advance the number of context
boundaries to filter out.

- Much lesser allocations/deallocations. The linked list keeps
persistent empty entries for the next usages and is extendable at
will.

- Makes it easier for multiple sources of callchains to feed a
stacktrace together. This is deemed to pave the way for cfi based
callchains wherein traditional frame pointer based kernel
stacktraces will precede cfi based user ones, producing an overall
callchain which size is hardly predictable. This requirement
makes the static array obsolete and makes a linked list based
iterator a much more flexible fit.

Basic testing on a big perf file containing callchains (~ 176 MB)
has shown a throughput gain of about 11% with perf report.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1294977121-5700-2-git-send-email-fweisbec@gmail.com>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:31 -02:00
Arnaldo Carvalho de Melo
04391debc3 perf evlist: Steal mmap reading routine from 'perf top'
Will be used in the upcoming 'perf test' entry for the evlist mmap
routines.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:30 -02:00
Arnaldo Carvalho de Melo
915fce20ec perf tools: Add missing cpu_map__delete()
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:30 -02:00
Arnaldo Carvalho de Melo
70db7533ca perf evlist: Move the mmap array from perf_evsel
Adopting the new model used in 'perf record', where we don't have a map
per thread per cpu, instead we have an mmap per cpu, established on the
first fd for that cpu and ask the kernel using the
PERF_EVENT_IOC_SET_OUTPUT ioctl to send events for the other fds on that
cpu for the one with the mmap.

The methods moved from perf_evsel to perf_evlist, but for easing review
they were modified in place, in evsel.c, the next patch will move the
migrated methods to evlist.c.

With this 'perf top' now uses the same mmap model used by 'perf record'
and the next patches will make 'perf record' use these new routines,
establishing a common codebase for both tools.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:29 -02:00
Arnaldo Carvalho de Melo
70082dd92c perf evsel: Introduce mmap support
Out of the code in 'perf top'. Record is next in line.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:29 -02:00
Arnaldo Carvalho de Melo
9d04f17817 perf evsel: Allow specifying if the inherit bit should be set
As this is a per-cpu attribute, we can't set it up in advance and use it
for all the calls.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:29 -02:00
Arnaldo Carvalho de Melo
f08199d314 perf evsel: Support event groups
The perf_evsel__open now have an extra boolean argument specifying if
event grouping is desired.

The first file descriptor created on a CPU becomes the group leader.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:28 -02:00
Arnaldo Carvalho de Melo
5c581041cf perf evlist: Adopt the pollfd array
Allocating just the space needed for nr_cpus * nr_threads * nr_evsels,
not the MAX_NR_CPUS and counters.

LKML-Reference: <new-submission>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:28 -02:00
Arnaldo Carvalho de Melo
361c99a661 perf evsel: Introduce perf_evlist
Killing two more perf wide global variables: nr_counters and evsel_list
as a list_head.

There are more operations that will need more fields in perf_evlist,
like the pollfd for polling all the fds in a list of evsel instances.

Use option->value to pass the evsel_list to parse_{events,filters}.

LKML-Reference: <new-submission>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:56:28 -02:00
Thomas Renninger
00e99a49f6 perf tools: Fix time function double declaration with glibc
It's enough to include the local "debug.h" file to trigger it.

man time reveals this is already declared in glibc:

time - get time in seconds
-> rename the variable.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: arjan@infradead.org
LPU-Reference: <1295620209-13859-2-git-send-email-trenn@suse.de>
Signed-off-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:53:00 -02:00
Arnaldo Carvalho de Melo
5c7a66822c perf tools: Fix build when using gcc 3.4.6
[acme@localhost linux]$ make O=~acme/git/build/perf -C tools/perf
make: Entering directory `/home/acme/git/linux/tools/perf'
Makefile:526: No libdw.h found or old libdw.h found or elfutils is older than 0.138, disables dwarf support. Please install new elfutils-devel/libdw-dev
Makefile:582: newt not found, disables TUI support. Please install newt-devel or libnewt-dev
    CC /home/acme/git/build/perf/builtin-annotate.o
In file included from builtin-annotate.c:23:
util/parse-events.h:26: warning: declaration of 'evsel_list' shadows a global declaration
util/parse-events.h:12: warning: shadowed declaration is here
make: *** [/home/acme/git/build/perf/builtin-annotate.o] Error 1
make: Leaving directory `/home/acme/git/linux/tools/perf'
[acme@localhost linux]$ gcc --version | head -1
gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11)
[acme@localhost linux]$

Fix it by renaming the parameter to evlist.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:15:39 -02:00
Arnaldo Carvalho de Melo
a860a60818 perf tools: Add missing header, fixes build
We need the definiton for __always_inline in bitops.h to fix the build
on distros where it isn't available or compiler.h doesn't get included
indirectly.

One of the fixes needed to build perf on RHEL4 systems, for instance.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 19:15:39 -02:00
Arnaldo Carvalho de Melo
9486aa3877 perf tools: Fix 64 bit integer format strings
Using %L[uxd] has issues in some architectures, like on ppc64.  Fix it
by making our 64 bit integers typedefs of stdint.h types and using
PRI[ux]64 like, for instance, git does.

Reported by Denis Kirjanov that provided a patch for one case, I went
and changed all cases.

Reported-by: Denis Kirjanov <dkirjanov@kernel.org>
Tested-by: Denis Kirjanov <dkirjanov@kernel.org>
LKML-Reference: <20110120093246.GA8031@hera.kernel.org>
Cc: Denis Kirjanov <dkirjanov@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pingtian Han <phan@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-22 23:41:57 -02:00
Dr. David Alan Gilbert
b2f8fb237e perf symbols: Fix annotation of thumb code
In ARM's Thumb mode the bottom bit of the symbol address is set to mark
the function as Thumb; the instructions are in reality 2 or 4 byte on 2
byte alignments, and when the +1 address is used in annotate it causes
objdump to disassemble invalid instructions.

The patch removes that bottom bit during symbol loading.

Many thinks to Dave Martin for comments on an initial version of the
patch.

(For reference this corresponds to this bug
https://bugs.launchpad.net/linux-linaro/+bug/677547 )

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Dave Martin <dave.martin@linaro.org>
LKML-Reference: <20110121163922.GA31398@davesworkthinkpad>
Signed-off-by: Dr. David Alan Gilbert <david.gilbert@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-21 16:32:18 -02:00
Arnaldo Carvalho de Melo
ad7f4e3f7b perf tools: Fix tracepoint id to string perf.data header table
It was broken by f006d25 that passed just the event name, not the complete
sys:event that it expected to open the /sys/.../sys/sys:event/id file to get
the id.

Fix it by moving it to after parse_events in cmd_record, as at that point
we can just traverse the evsel_list and use evsel->attr.config +
event_name(evsel) instead of re-opening the /id file.

Reported-by: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Han Pingtian <phan@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20110117202801.GG2085@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-17 18:28:13 -02:00
Arnaldo Carvalho de Melo
dd9a9ad5e1 perf tools: Fix handling of wildcards in tracepoint event selectors
It wasn't accounting the ':' when consuming bytes in the the event
selector string, so parse_events() would fail in this test:

                if (!(*str == 0 || *str == ',' || isspace(*str)))
                        return -1;

as *str would be pointing to '*', the last character in the '-e' arg in:

$ perf record -q -a -D -e sched:sched_* | perf script -i - -s perf-script.py

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-17 15:26:07 -02:00
Arnaldo Carvalho de Melo
3d03e2ea74 perf session: Fix infinite loop in __perf_session__process_events
In this if statement:

        if (head + event->header.size >= mmap_size) {
                if (mmaps[map_idx]) {
                        munmap(mmaps[map_idx], mmap_size);
                        mmaps[map_idx] = NULL;
                }

                page_offset = page_size * (head / page_size);
                file_offset += page_offset;
                head -= page_offset;
                goto remap;
        }

With, for instance, these values:

head=2992
event->header.size=48
mmap_size=3040

We end up endlessly looping back to remap. Off by one.

Problem introduced in 55b4462.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Ingo Molnar <mingo@elte.hu>
Reported-by: David Ahern <daahern@cisco.com>
Bisected-by: David Ahern <daahern@cisco.com>
Tested-by: David Ahern <daahern@cisco.com>
Cc: David Ahern <daahern@cisco.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10 22:23:08 -02:00
Arnaldo Carvalho de Melo
0252208eb5 perf evsel: Support perf_evsel__open(cpus > 1 && threads > 1)
And a test for it:

[acme@felicio linux]$ perf test
 1: vmlinux symtab matches kallsyms: Ok
 2: detect open syscall event: Ok
 3: detect open syscall event on all cpus: Ok
[acme@felicio linux]$

Translating C the test does:

1. generates different number of open syscalls on each CPU
   by using sched_setaffinity
2. Verifies that the expected number of events is generated
   on each CPU

It works as expected.

LKML-Reference: <new-submission>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-10 22:03:26 -02:00
Lin Ming
23a2f3ab46 perf tools: Pass whole attr to event selectors
Since commit 69aad6f1(perf tools: Introduce event selectors), only
perf_event_attr::type and ::config are passed to event selector, which
makes perf tool not work correctly.

For example, PEBS does not work because perf_event_attr::precise_ip is
not passed to the syscall.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <1294369869.20563.19.camel@minggr.sh.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-07 01:44:36 -02:00
Han Pingtian
f006d25a15 perf tools: Fix buffer overflow error when specifying all tracepoints
I found when specifying all tracepoints with -e to one of subcommand,
such as 'stat', the program will trigger a buffer overflow error, like
this:

*** buffer overflow detected ***: ./perf terminated
======= Backtrace: =========
/lib64/libc.so.6(__fortify_fail+0x37)[0x382cefb2c7]
....

The tracepoints are separated by comma, something like this:

$ perf stat -a -e `perf list |grep Tracepoint|awk -F'[' '{gsub(/[[:space:]]+/,"",$1);array[FNR]=$1}END{outputs=array[1];for (i=2;i<=FNR;i++){ outputs=outputs "," array[i];};print outputs}'`

The root reason of this problem is that store_event_type() is called for all
events, and will overflow the 'filename' at:

    strncat(filename, orgname, strlen(orgname));

This patch fixes it by calling store_event_type() only when the event name has
been found.

LKML-Reference: <20110106093922.GB6713@hpt.nay.redhat.com>
Signed-off-by: Han Pingtian <phan@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-06 18:04:46 -02:00
Arnaldo Carvalho de Melo
1109599458 perf session: Warn about errors when processing pipe events too
Just like we do at __perf_session__process_events

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-05 14:53:10 -02:00
Stephane Eranian
d030260ad3 perf tools: Fix perf_event.h header usage
This patch fixes the usage of the perf_event.h header file
between command modules and the supporting code in util.

It is necessary to ensure that ALL files use the SAME
perf_event.h header from the kernel source tree.

There were a couple of #include <linux/perf_event.h> mixed
with #include "../../perf_event.h".

This caused issues on some distros because of mismatch
in the layout of struct perf_event_attr. That eventually
led perf stat to segfault.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@gmail.com>
LKML-Reference: <4d233cf0.2308e30a.7b00.ffffc187@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-05 14:52:54 -02:00
Ingo Molnar
aef1b9cef7 Merge commit 'v2.6.37' into perf/core
Merge reason: Add the final .37 tree.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-05 14:22:10 +01:00
Thomas Renninger
20c457b858 perf timechart: Adjust perf timechart to the new power events
builtin-timechart must only pass -e power:xy events if they are supported by
the running kernel, otherwise try to fetch the old power:power{start,end}
events.

For this I added the tiny helper function:

   int is_valid_tracepoint(const char *event_string)

to parse-events.[hc], which could be more generic as an interface and support
hardware/software/... events, not only tracepoints, but someone else could
extend that if needed...

Signed-off-by: Thomas Renninger <trenn@suse.de>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Jean Pihet <j-pihet@ti.com>
LKML-Reference: <1294073445-14812-4-git-send-email-trenn@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 08:16:54 +01:00
Ingo Molnar
928585536f Merge branch 'perf/test' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/core 2011-01-04 08:10:28 +01:00
Ingo Molnar
cc22219699 Merge commit 'v2.6.37-rc8' into perf/core
Merge reason: pick up latest -rc.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-01-04 08:08:54 +01:00
Arnaldo Carvalho de Melo
4eed11d5e2 perf evsel: Auto allocate resources needed for some methods
While writing the first user of the routines created from the ad-hoc
routines in the existing builtins I noticed that the resulting set of
calls was too long, reduce it by doing some best effort allocations.

Tools that need to operate on multiple threads and cpus should pre-allocate
enough resources by explicitely calling the perf_evsel__alloc_{fd,counters}
methods.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:31:32 -02:00
Arnaldo Carvalho de Melo
86bd5e8603 perf evsel: Use {cpu,thread}_map to shorten list of parameters
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:24:36 -02:00
Arnaldo Carvalho de Melo
5c98d466e4 perf tools: Refactor all_tids to hold nr and the map
So that later, we can pass the thread_map instance instead of
(thread_num, thread_map) for things like perf_evsel__open and friends,
just like was done with cpu_map.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:24:16 -02:00
Arnaldo Carvalho de Melo
60d567e2d9 perf tools: Refactor cpumap to hold nr and the map
So that later, we can pass the cpu_map instance instead of (nr_cpus, cpu_map)
for things like perf_evsel__open and friends.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:23:55 -02:00
Arnaldo Carvalho de Melo
48290609c0 perf evsel: Introduce per cpu and per thread open helpers
Abstracting away the loops needed to create the various event fd handlers.

The users have to pass a confiruged perf->evsel.attr field, which is already
usable after perf_evsel__new (constructor) time, using defaults.

Comes out of the ad-hoc routines in builtin-stat, that now uses it.

Fixed a small silly bug where we were die()ing before killing our
children, dysfunctional family this one 8-)

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:23:27 -02:00
Arnaldo Carvalho de Melo
c52b12ed25 perf evsel: Steal the counter reading routines from stat
Making them hopefully generic enough to be used in 'perf test',
well see.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-04 00:22:55 -02:00
Arnaldo Carvalho de Melo
70d544d057 perf evsel: Delete the event selectors at exit
Freeing all the possibly allocated resources, reducing complexity
on each tool exit path.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-03 16:51:39 -02:00
Arnaldo Carvalho de Melo
1e7972cc5c perf util: Move do_read from session to util
Not really something to be exported from session.c. Rename it to
'readn' as others did in the past.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-03 16:50:55 -02:00
Arnaldo Carvalho de Melo
daec78a09d perf evsel: Adopt MATCH_EVENT macro from 'stat'
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-03 16:49:44 -02:00
Arnaldo Carvalho de Melo
69aad6f1ee perf tools: Introduce event selectors
Out of ad-hoc code and global arrays with hard coded sizes.

This is the first step on having a library that will be first
used on regression tests in the 'perf test' tool.

[acme@felicio linux]$ size /tmp/perf.before
   text	   data	    bss	    dec	    hex	filename
1273776	  97384	5104416	6475576	 62cf38	/tmp/perf.before
[acme@felicio linux]$ size /tmp/perf.new
   text	   data	    bss	    dec	    hex	filename
1275422	  97416	1392416	2765254	 2a31c6	/tmp/perf.new

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2011-01-03 16:39:04 -02:00
Frederic Weisbecker
d425de5436 perf: Fix callchain hit bad cast on ascii display
ipchain__fprintf_graph() casts the number of hits in a branch as an
int, which means we lose its highests bits.

This results in meaningless number of callchain hits in perf.data
that have a high number of hits recorded, typically those that have
callchain branches hits appearing more than INT_MAX. This happens
easily as those are pondered by the event period.

Reported-by: Nick Piggin <npiggin@kernel.dk>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
2011-01-03 16:13:11 +01:00
Franck Bui-Huu
32ae2ade46 perf probe: Fix short file name probe location reporting
After adding probes, perf-probe(1) reports the probes locations which include
filenames for certain cases.

But for short file names (whose length < 32), perf-probe didn't display the
name correctly. It actually skipped the first character.

Here's an example where 'icmp.c' was screwed:

   $ perf probe -n -a "icmp.c;sk=*"
   Add new events:
     probe:icmp_push_reply (on @cmp.c)
     probe:icmp_reply     (on @cmp.c)
     probe:icmp_reply_1   (on @cmp.c)
     probe:icmp_send      (on @cmp.c)
     probe:icmp_send_1    (on @cmp.c)
     probe:icmp_error     (on @cmp.c)
     probe:icmp_error_1   (on @cmp.c)
     probe:icmp_error_2   (on @cmp.c)
     probe:icmp_error_3   (on @cmp.c)

This patch fixes this bug in synthesize_perf_probe_point().

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <m31v588r9k.fsf@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-27 19:48:21 -02:00
Franck Bui-Huu
32b2b6ec57 perf probe: Fix wrong warning in __show_one_line() if read(1) errors happen
This was introduced by commit fde52dbd7f.

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <m3y67hsr0m.fsf@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-22 20:32:08 -02:00
Arnaldo Carvalho de Melo
3b01a413c1 perf symbols: Improve kallsyms symbol end addr calculation
For kallsyms we don't have the symbol address end, so we do an extra pass and
set the symbol end addr as being the start of the next minus one.

But this was being done just after we filtered the symbols of a
particular type (functions, variables), so the symbol end was sometimes
after what it really is.

Fixing up symbol end also was falling apart when we have symbol aliases,
then the end address of all but the last alias was being set to be
before its start.

Fix it up by checking for symbol aliases and making the kallsyms__parse
routine use the next symbol, whatever its type, as the limit for the
previous symbol, passing that end address to the callback.

This was detected by the 'perf test' synthetic paranoid regression
tests, fix it up so that even that case doesn't mislead us.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-22 20:31:45 -02:00
Masami Hiramatsu
3b4694de35 perf probe: Fix to support libdwfl older than 0.148
Since the libdwfl library before 0.148 fails to analyze live kernel debuginfo,
'perf probe --list' compiled with those old libdwfl sometimes crashes.

To avoid that bug, perf probe does not use libdwfl's live kernel analysis
routine when it is compiled with older libdwfl.

Side effect: perf with older libdwfl doesn't support listing probe in modules
with source code line. Those could be shown by symbol+offset.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20101217131218.24123.62424.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 19:24:57 -02:00
Masami Hiramatsu
ea187cfbb9 perf tools: Fix lazy wildcard matching
Fix lazy wildcard matching to ignore space after wild card.

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20101217131200.24123.8202.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 19:15:42 -02:00
Franck Bui-Huu
21dd9ae5a4 perf probe: Handle gracefully some stupid and buggy line syntaxes
Currently perf probe doesn't handle those incorrect syntaxes:

   $ perf probe -L sched.c:++13
   $ perf probe -L sched.c:-+13
   $ perf probe -L sched.c:10000000000000000000000000000+13

This patches rewrites parse_line_range_desc() to handle them.

As a bonus, it reports more useful error messages instead of: "Tailing
with invalid character...".

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-7-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 17:20:13 -02:00
Franck Bui-Huu
fde52dbd7f perf probe: Don't always consider EOF as an error when listing source code
When listing a whole file or a function which is located at the end,
perf-probe -L output wrongly: "Source file is shorter than expected.".

This is because show_one_line() always consider EOF as an error.

This patch fixes this by not considering EOF as an error when dumping
the trailing lines. Otherwise it's still an error and perf-probe still
outputs its warning.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-6-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 16:20:12 -02:00
Franck Bui-Huu
9d95b580a8 perf probe: Fix line range description since a single file is allowed
$ perf-probe -L sched.c

is currently allowed but not documented.

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-5-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 16:20:12 -02:00
Franck Bui-Huu
44b81e929b perf probe: Clean up redundant tests in show_line_range()
It also removes some superflous parentheses.

Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-4-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 16:20:12 -02:00
Franck Bui-Huu
befe341468 perf probe: Rewrite show_one_line() to make it simpler
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-3-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 16:20:11 -02:00
Franck Bui-Huu
62c15fc49b perf probe: Make -L display the absolute path of the dumped file
The actual file used by 'perf probe -L sched.c' is reported in the ouput
of the command.

But it's simply displayed as it has been given to the command (simply
sched.c) which is too ambiguous to be really usefull since several
sched.c files can be found into the same project and we also don't know
which search path has been used.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LKML-Reference: <1292854685-8230-2-git-send-email-fbuihuu@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 16:20:11 -02:00
Masami Hiramatsu
0e43e5d222 perf probe: Cleanup messages
Add new lines for error or debug messages, change dwarf related words to more
generic words (or just removed).

Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
LKML-Reference: <20101217131211.24123.40437.stgit@ltc236.sdl.hitachi.co.jp>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 20:20:58 -02:00
David Ahern
ec5761eab3 perf symbols: Add symfs option for off-box analysis using specified tree
The symfs argument allows analysis of perf.data file using a locally accessible
filesystem tree with debug symbols - e.g., tree created during image builds,
sshfs mount, loop mounted KVM disk images, USB keys, initrds, etc. Anything
with an OS tree can be analyzed from anywhere without the need to populate a
local data store with build-ids.

Commiter notes:

o Fixed up symfs="/" variants handling.

o prefixed DSO__ORIG_GUEST_KMODULE case with symfs too, avoiding use of files
  outside the symfs directory.

LKML-Reference: <1291926427-28846-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 20:17:51 -02:00
Ian Munsie
21ef97f05a perf session: Fallback to unordered processing if no sample_id_all
If we are running the new perf on an old kernel without support for
sample_id_all, we should fall back to the old unordered processing of
events. If we didn't than we would *always* process events without
timestamps out of order, whether or not we hit a reordering race. In
other words, instead of there being a chance of not attributing samples
correctly, we would guarantee that samples would not be attributed.

While processing all events without timestamps before events with
timestamps may seem like an intuitive solution, it falls down as
PERF_RECORD_EXIT events would also be processed before any samples.
Even with a workaround for that case, samples before/after an exec would
not be attributed correctly.

This patch allows commands to indicate whether they need to fall back to
unordered processing, so that commands that do not care about timestamps
on every event will not be affected. If we do fallback, this will print
out a warning if report -D was invoked.

This patch adds the test in perf_session__new so that we only need to
test once per session. Commands that do not use an event_ops (such as
record and top) can simply pass NULL in it's place.

Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <1291951882-sup-6069@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-21 20:17:51 -02:00
Franck Bui-Huu
68a7a771ad perf buildid-cache: Fix symbolic link handling
This was broken since link(2) doesn't dereference symbolic
links. Instead 'filename' becomes a symbolic link to the same file
that 'name' refers to.

This had the bad effect to create dangling symlinks in the case that
even can't be removed with perf-buildid-cache(1).

LKML-Reference: <m38vzxxrql.fsf@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-16 09:41:45 -02:00
Franck Bui-Huu
c3a34e06db perf symbols: Stop using vmlinux files with no symbols
Fail if the kernel image contains no symbol, allowing using other images
in the vmlinux search path that may have a usable symtab.

Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Francis Moreau <francis.moro@gmail.com>
Cc: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LPU-Reference: <m3d3p9ydx9.fsf_-_@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-16 09:41:45 -02:00
Franck Bui-Huu
fd930ff91e perf probe: Fix use of kernel image path given by 'k' option
Users were not being able to have the explicitely specified vmlinux
pathname used, instead a search on the vmlinux path was always being
made.

Reported-by: Francis Moreau <francis.moro@gmail.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: 2nddept-manager@sdl.hitachi.co.jp
Cc: Francis Moreau <francis.moro@gmail.com>
Cc: Franck Bui-Huu <vagabon.xyz@gmail.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
LPU-Reference: <m3hbelydz8.fsf_-_@gmail.com>
Signed-off-by: Franck Bui-Huu <fbuihuu@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-16 09:41:45 -02:00
Ingo Molnar
006b20fe4c Merge branch 'perf/urgent' into perf/core
Merge reason: We want to apply a dependent patch.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-12-16 11:22:27 +01:00
Arnaldo Carvalho de Melo
ddbc24b72c perf session: Remove unneeded dump_printf calls
Since we check at the beginning of the callers, no need to ask if
dump_trace is set multiple times.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:20:20 -02:00
Thomas Gleixner
ba74f0640d perf session: Split out user event processing
Simplify further.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124551.110956235@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:15:43 -02:00
Thomas Gleixner
3dfc2c0aee perf session: Split out sample preprocessing
Simplify the code a bit.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124551.014649793@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:13:13 -02:00
Thomas Gleixner
532e7269c0 perf session: Move dump code to event delivery path
Preparatory patch for ordered perf report -D

Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.918655066@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:10:53 -02:00
Thomas Gleixner
f74725dcf2 perf session: Add file_offset to event delivery function
Preparatory patch for ordered output of perf report -D

Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.818568607@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:09:51 -02:00
Thomas Gleixner
e4c2df132f perf session: Store file offset in sample_queue
Preparatory patch for ordered output of perf report -D.

Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.725128545@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 12:09:18 -02:00
Thomas Gleixner
9aefcab0de perf session: Consolidate the dump code
The dump code used by perf report -D is scattered all over the place.
Move it to separate functions.

Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.625434869@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 11:18:06 -02:00
Thomas Gleixner
79a14c1f45 perf session: Dont queue events w/o timestamps
If the event has no timestamp assigned then the parse code sets it to
~0ULL which causes the ordering code to enqueue it at the end.

Process it right away.

Reported-by: Ian Munsie <imunsie@au1.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.528788441@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 11:15:38 -02:00
Thomas Gleixner
3835bc00c5 perf event: Prevent unbound event__name array access
event__name[] is missing an entry for PERF_RECORD_FINISHED_ROUND, but we
happily access the array from the dump code.

Make event__name[] static and provide an accessor function, fix up all
callers and add the missing string.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101207124550.432593943@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 11:15:07 -02:00
David Ahern
b226a5a729 perf report: Allow user to specify path to kallsyms file
This is useful for analyzing a perf data file on a different system than
the one data was collected on and still include symbols from loaded
kernel modules in the output.

Commiter note: Updated the man page accordingly.

LKML-Reference: <1291775986-16475-1-git-send-email-daahern@cisco.com>
Signed-off-by: David Ahern <daahern@cisco.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-09 11:07:07 -02:00
Chris Samuel
ce47dc56a2 perf tools: Catch a few uncheck calloc/malloc's
There were a few stray calloc()'s and malloc()'s which were not having
their return values checked for success.

As the calling code either already coped with failure or didn't actually
care we just return -ENOMEM at that point.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Chris Samuel <chris@csamuel.org>
LKML-Reference: <4CDDF95A.1050400@csamuel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-06 12:52:35 -02:00
Thomas Gleixner
cbf41645f3 perf session: Sort all events if ordered_samples=true
Now that we have timestamps on FORK, EXIT, COMM, MMAP events we can
sort everything in time order. This fixes the following observed
problem:

mmap(file1) -> pagefault() -> munmap(file1)
mmap(file2) -> pagefault() -> munmap(file2)

Resulted in decoding both pagefaults in file2 because the file1 map
was already replaced by the file2 map when the map address was
identical.

With all events sorted we decode both pagefaults correctly.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <alpine.LFD.2.00.1012051220450.2653@localhost6.localdomain6>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-06 15:43:00 -02:00
Akihiro Nagai
e4e18d568b perf options: add OPT_CALLBACK_DEFAULT_NOOPT
Add new macro OPT_CALLBACK_DEFAULT_NOOPT for parse_options.

It enables to pass the default value (opt->defval) to the callback function
processing options require no argument.

Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
LKML-Reference: <20101203035853.7827.17502.stgit@localhost6.localdomain6>
Signed-off-by: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-06 15:33:29 -02:00
Ian Munsie
1437a30aae perf hist: Better displaying of unresolved DSOs and symbols
In the event that a DSO has not been identified, just print out [unknown]
instead of the instruction pointer as we previously were doing, which is pretty
meaningless for a shared object (at least to the users perspective).

The IP we print out is fairly meaningless in general anyway - it's just one
(the first) of the many addresses that were lumped together as unidentified,
and could span many shared objects and symbols. In reality if we see this
[unknown] output then the report -D output is going to be more useful anyway as
we can see all the different address that it represents.

If we are printing the symbols we are still going to see this IP in that column
anyway since they shouldn't resolve either.

This patch also changes the symbol address printouts so that they print out 0x
before the address, are left aligned, and changes the %L format string (which
relies on a glibc bug) to %ll.

Before:
    74.11%    :3259               4a6c  [k]     4a6c
After:
    74.11%    :3259  [unknown]          [k] 0x4a6c

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291603026-11785-2-git-send-email-imunsie@au1.ibm.com>
Signed-off-by: Ian Munsie <imunsie@au1.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-06 15:12:34 -02:00
Arnaldo Carvalho de Melo
9c90a61c7e perf tools: Ask for ID PERF_SAMPLE_ info on all PERF_RECORD_ events
So that we can use -T == --timestamp, asking for PERF_SAMPLE_TIME:

  $ perf record -aT
  $ perf report -D | grep PERF_RECORD_
  <SNIP>
   3   5951915425 0x47530 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff8138c1a2 period: 215979 cpu:3
   3   5952026879 0x47588 [0x90]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff810cb480 period: 215979 cpu:3
   3   5952059959 0x47618 [0x38]: PERF_RECORD_FORK(6853:6853):(16811:16811)
   3   5952138878 0x47650 [0x78]: PERF_RECORD_SAMPLE(IP, 1): 16811/16811: 0xffffffff811bac35 period: 431478 cpu:3
   3   5952375068 0x476c8 [0x30]: PERF_RECORD_COMM: find:6853
   3   5952395923 0x476f8 [0x50]: PERF_RECORD_MMAP 6853/6853: [0x400000(0x25000) @ 0]: /usr/bin/find
   3   5952413756 0x47748 [0xa0]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff810d080f period: 859332 cpu:3
   3   5952419837 0x477e8 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44600000(0x21d000) @ 0]: /lib64/ld-2.5.so
   3   5952437929 0x47840 [0x48]: PERF_RECORD_MMAP 6853/6853: [0x7fff7e1c9000(0x1000) @ 0x7fff7e1c9000]: [vdso]
   3   5952570127 0x47888 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f46200000(0x218000) @ 0]: /lib64/libselinux.so.1
   3   5952623637 0x478e0 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44a00000(0x356000) @ 0]: /lib64/libc-2.5.so
   3   5952675720 0x47938 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f44e00000(0x204000) @ 0]: /lib64/libdl-2.5.so
   3   5952710080 0x47990 [0x58]: PERF_RECORD_MMAP 6853/6853: [0x3f45a00000(0x246000) @ 0]: /lib64/libsepol.so.1
   3   5952847802 0x479e8 [0x58]: PERF_RECORD_SAMPLE(IP, 1): 6853/6853: 0xffffffff813897f0 period: 1142536 cpu:3
  <SNIP>

First column is the cpu and the second the timestamp.

That way we can investigate problems in the event stream.

If the new perf binary is run on an older kernel, it will disable this feature
automatically.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-5-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-04 23:08:40 -02:00
Arnaldo Carvalho de Melo
640c03ce83 perf session: Parse sample earlier
At perf_session__process_event, so that we reduce the number of lines in eache
tool sample processing routine that now receives a sample_data pointer already
parsed.

This will also be useful in the next patch, where we'll allow sample the
identity fields in MMAP, FORK, EXIT, etc, when it will be possible to see (cpu,
timestamp) just after before every event.

Also validate callchains in perf_session__process_event, i.e. as early as
possible, and keep a counter of the number of events discarded due to invalid
callchains, warning the user about it if it happens.

There is an assumption that was kept that all events have the same sample_type,
that will be dealt with in the future, when this preexisting limitation will be
removed.

Tested-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Ian Munsie <imunsie@au1.ibm.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ian Munsie <imunsie@au1.ibm.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <1291318772-30880-4-git-send-email-acme@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-12-04 23:05:19 -02:00
Ingo Molnar
b3d006c0e7 Merge branch 'perf/rename' into perf/core
Merge reason: This is an older commit under testing that was not pushed yet - merge it.

Also fix up the merge in command-list.txt.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Tom Zanussi <tzanussi@gmail.com>
2010-12-01 09:22:19 +01:00
Corey Ashford
4c635a4e04 perf tools: fix event parsing of comma-separated tracepoint events
There are number of issues that prevent the use of multiple tracepoint events
being specified in a -e/--event switch, separated by commas.

For example, perf stat -e irq:irq_handler_entry,irq:irq_handler_exit ...  fails
because the tracepoint event parsing code doesn't recognize the comma separator
properly.

This patch corrects those issues.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Julia Lawall <julia@diku.dk>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Michael Ellerman <michaele@au1.ibm.com>
LKML-Reference: <1291156021-17711-1-git-send-email-cjashfor@linux.vnet.ibm.com>
Signed-off-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 23:04:39 -02:00
Arnaldo Carvalho de Melo
5b1c144475 perf debug: Simplify trace_event
No need to check that many times if debug_trace is on.

Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 20:58:42 -02:00
Thomas Gleixner
5c891f3840 perf session: Allocate chunks of sample objects
The ordered sample code allocates singular reference objects struct
sample_queue which have 48byte size on 64bit and 20 bytes on 32bit. That's
silly. Allocate ~64k sized chunks and hand them out.

Performance gain: ~ 15%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.398713983@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 20:05:25 -02:00
Thomas Gleixner
020bb75a6d perf session: Cache sample objects
When the sample queue is flushed we free the sample reference objects. Though
we need to malloc new objects when we process further. Stop the malloc/free
orgy and cache the already allocated object for resuage. Only allocate when
the cache is empty.

Performance gain: ~ 10%

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.338488630@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 20:04:18 -02:00
Thomas Gleixner
fe17420784 perf session: Keep file mmaped instead of malloc/memcpy
Profiling perf with perf revealed that a large part of the processing time is
spent in malloc/memcpy/free in the sample ordering code. That code copies the
data from the mmap into malloc'ed memory. That's silly. We can keep the mmap
and just store the pointer in the queuing data structure. For 64 bit this is
not a problem as we map the whole file anyway. On 32bit we keep 8 maps around
and unmap the oldest before mmaping the next chunk of the file.

Performance gain: 2.95s -> 1.23s (Faktor 2.4)

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.278787719@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 20:01:08 -02:00
Thomas Gleixner
55b44629f5 perf session: Use sensible mmap size
On 64bit we can map the whole file in one go, on 32bit we can at least map
32MB and not map/unmap tiny chunks of the file.

Base the progress bar on 1/16 of the data size.

Preparatory patch to get rid of the malloc/memcpy/free of trace data.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.213687773@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:59:34 -02:00
Thomas Gleixner
d6513281c5 perf session: Simplify termination checks
No need to check twice.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.152886642@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:58:10 -02:00
Thomas Gleixner
85b99952cc perf session: Move ui_progress_update in __perf_session__process_events()
The progress bar is changed when the file offset changes. This happens only
when the next mmap is done. No need to call ui_progress_update() for every
event.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.094836523@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:57:13 -02:00
Thomas Gleixner
0331ee0cf4 perf session: Cleanup __perf_session__process_events()
Replace the pseudo C++ self argument with session and give the mmap related
variables a sensible name. shift is a complete misnomer - it took me several
rounds of cursing to figure out that it's not a shift value.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163820.029687218@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:57:01 -02:00
Thomas Gleixner
28990f75e6 perf session: Use appropriate pointer type instead of silly typecasting
There is no reason to use a struct sample_event pointer in struct sample_queue
and type cast it when flushing the queue.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.969462809@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:55:26 -02:00
Thomas Gleixner
a1225decc4 perf session: Fix list sort algorithm
The homebrewn sort algorithm fails to sort in time order. One of the problem
spots is that it fails to deal with equal timestamps correctly.

My first gut reaction was to replace the fancy list with an rbtree, but the
performance is 3 times worse.

Rewrite it so it works.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20101130163819.908482530@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-11-30 19:52:36 -02:00