aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/builtin-stat.c
AgeCommit message (Collapse)AuthorFilesLines
2013-03-15perf evlist: Add thread_map__nr() helperNamhyung Kim1-2/+2
Introduce and use the thread_map__nr() function to protect a possible NULL pointer dereference and cleanup the code a bit. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-03-15perf evlist: Remove cpus and threads arguments from perf_evlist__new()Namhyung Kim1-1/+1
It's almost always used with NULL for both arguments. Get rid of the arguments from the signature and use perf_evlist__set_maps() if needed. Signed-off-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ committer note: replaced spaces with tabs in some of the affected lines ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-02-06perf stat: Add per processor socket count aggregationStephane Eranian1-11/+115
This patch adds per-processor socket count aggregation for system-wide mode measurements. This is a useful mode to detect imbalance between sockets. To enable this mode, use --aggr-socket in addition to -a. (system-wide). The output includes the socket number and the number of online processors on that socket. This is useful to gauge the amount of aggregation. # ./perf stat -I 1000 -a --aggr-socket -e cycles sleep 2 # time socket cpus counts events 1.000097680 S0 4 5,788,785 cycles 2.000379943 S0 4 27,361,546 cycles 2.001167808 S0 4 818,275 cycles Signed-off-by: Stephane Eranian <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ committer note: Added missing man page entry based on above comments ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-01-30perf evsel: Fix memory leaks on evsel->countsNamhyung Kim1-0/+1
The ->counts field was never freed in the current code. Add perf_evsel__free_counts() function to free it properly. Signed-off-by: Namhyung Kim <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-01-30perf stat: Add interval printingStephane Eranian1-15/+142
This patch adds a new printing mode for perf stat. It allows interval printing. That means perf stat can now print event deltas at regular time interval. This is useful to detect phases in programs. The -I option enables interval printing. It expects an interval duration in milliseconds. Minimum is 100ms. Once, activated perf stat prints events deltas since last printout. All modes are supported. $ perf stat -I 1000 -e cycles noploop 10 noploop for 10 seconds # time counts events 1.000109853 2,388,560,546 cycles 2.000262846 2,393,332,358 cycles 3.000354131 2,393,176,537 cycles 4.000439503 2,393,203,790 cycles 5.000527075 2,393,167,675 cycles 6.000609052 2,393,203,670 cycles 7.000691082 2,393,175,678 cycles The output format makes it easy to feed into a plotting program such as gnuplot when the -I option is used in combination with the -x option: $ perf stat -x, -I 1000 -e cycles noploop 10 noploop for 10 seconds 1.000084113,2378775498,cycles 2.000245798,2391056897,cycles 3.000354445,2392089414,cycles 4.000459115,2390936603,cycles 5.000565341,2392108173,cycles Signed-off-by: Stephane Eranian <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-01-24perf evsel: Introduce perf_evsel__open_strerror methodArnaldo Carvalho de Melo1-11/+5
That consolidates the error messages in 'record', 'stat' and 'top', that now get a consistent set of messages and allow other tools to use the new method to report problems using whatever UI toolkit. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2013-01-24perf evsel: Do missing feature fallbacks in just one placeArnaldo Carvalho de Melo1-27/+3
Instead of doing it in stat, top, record or any other tool that opens event descriptors. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-12-09perf evsel: Convert to _is_group_leader methodNamhyung Kim1-1/+1
Convert perf_evsel__is_group_member to perf_evsel__is_group_leader. This is because the most usecases are using negative form to check whether the given evsel is a leader or not and it's IMHO somewhat ambiguous - leader also *is* a member of the group. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-11-14perf evsel: Introduce is_group_member methodArnaldo Carvalho de Melo1-1/+2
To clarify what is being tested, instead of assuming that evsel->leader == NULL means either an 'isolated' evsel or a 'group leader'. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-11-14perf tools: Fix attributes for '{}' defined event groupsJiri Olsa1-7/+4
Fixing events attributes for groups defined via '{}'. Currently 'enable_on_exec' attribute in record command and both 'disabled ' and 'enable_on_exec' attributes in stat command are set based on the 'group' option. This eliminates proper setup for '{}' defined groups as they don't set 'group' option. Making above attributes values based on the 'evsel->leader' as this is common to both group definition. Moving perf_evlist__set_leader call within builtin-record ahead perf_evlist__config_attrs call, because the latter needs possible group leader links in place. Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Corey Ashford <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-10-26perf stat: Add --pre and --post commandPeter Zijlstra1-6/+36
In order to measure kernel builds, one has to do some pre/post cleanup work in order to do the repeat build. So provide --pre and --post command hooks to allow doing just that. perf stat --repeat 10 --null --sync --pre 'make -s O=defconfig-build/clean' \ -- make -s -j64 O=defconfig-build/ bzImage Signed-off-by: Peter Zijlstra <[email protected]> Acked-by: Ingo Molnar <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/1350992414.13456.5.camel@twins [ committer note: Added respective entries in Documentation/perf-stat.txt ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-10-02perf stat: Don't use globals where not needed toArnaldo Carvalho de Melo1-169/+159
Some variables were global but used in just one function, so move it to where it belongs. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-09-26perf evlist: Renane set_filters method to apply_filtersArnaldo Carvalho de Melo1-1/+1
Because that is what it really does, i.e. it applies the filters that were parsed from the command line and stashed into the evsels they refer to. We'll need the set_filter method name to actually apply a filter to all the evsels in an evlist, for instance, to ask that a syswide tracer doesn't trace itself. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-09-17perf stat: Check PMU cpumask fileYan, Zheng1-10/+20
If user doesn't explicitly specify CPU list, perf-stat only collects events on CPUs listed in the PMU cpumask file. Signed-off-by: "Yah, Zheng" <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-09-17perf stat: Move stats related code to util/stat.cXiao Guangrong1-54/+2
Then, the code can be shared between kvm events and perf stat. Signed-off-by: Xiao Guangrong <[email protected]> [ Dong Hao <[email protected]>: rebase it on acme's git tree ] Signed-off-by: Dong Hao <[email protected]> Cc: Avi Kivity <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: Marcelo Tosatti <[email protected]> Cc: Runzhen Wang <[email protected]> Cc: Xiao Guangrong <[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-09-11perf tools: Use __maybe_used for unused variablesIrina Tirdea1-12/+28
perf defines both __used and __unused variables to use for marking unused variables. The variable __used is defined to __attribute__((__unused__)), which contradicts the kernel definition to __attribute__((__used__)) for new gcc versions. On Android, __used is also defined in system headers and this leads to warnings like: warning: '__used__' attribute ignored __unused is not defined in the kernel and is not a standard definition. If __unused is included everywhere instead of __used, this leads to conflicts with glibc headers, since glibc has a variables with this name in its headers. The best approach is to use __maybe_unused, the definition used in the kernel for __attribute__((unused)). In this way there is only one definition in perf sources (instead of 2 definitions that point to the same thing: __used and __unused) and it works on both Linux and Android. This patch simply replaces all instances of __used and __unused with __maybe_unused. Signed-off-by: Irina Tirdea <[email protected]> Acked-by: Pekka Enberg <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ committer note: fixed up conflict with a116e05 in builtin-sched.c ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-09-05perf stat: Remove use of die/exit and handle errorsDavid Ahern1-3/+4
Allows perf to clean up properly on program termination. Signed-off-by: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-08-15perf evlist: Introduce evsel list accessorsArnaldo Carvalho de Melo1-1/+1
To replace the longer list_entry constructs for things that are widely used: perf_evlist__{first,last}(evlist) perf_evsel__next(evsel) Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-08-15perf evlist: Rename __group method to __set_leaderArnaldo Carvalho de Melo1-1/+1
Just like was done for parse_events__set_leader. Also we need to have the list_entry set_leader method in evlist.c so that we don't grow another dep in the python binding: # ~acme/git/linux/tools/perf/python/twatch.py Traceback (most recent call last): File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module> import perf ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: parse_events__set_leader And also remove a pr_debug from evsel.c so that we avoid this one too: # ~acme/git/linux/tools/perf/python/twatch.py Traceback (most recent call last): File "/home/acme/git/linux/tools/perf/python/twatch.py", line 16, in <module> import perf ImportError: /home/acme/git/build/perf/python/perf.so: undefined symbol: eprintf Acked-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-08-14perf tools: Enable grouping logic for parsed eventsJiri Olsa1-8/+5
This patch adds a functionality that allows to create event groups based on the way they are specified on the command line. Adding functionality to the '{}' group syntax introduced in earlier patch. The current '--group/-g' option behaviour remains intact. If you specify it for record/stat/top command, all the specified events become members of a single group with the first event as a group leader. With the new '{}' group syntax you can create group like: # perf record -e '{cycles,faults}' ls resulting in single event group containing 'cycles' and 'faults' events, with cycles event as group leader. All groups are created with regards to threads and cpus. Thus recording an event group within a 2 threads on server with 4 CPUs will create 8 separate groups. Examples (first event in brackets is group leader): # 1 group (cpu-clock,task-clock) perf record --group -e cpu-clock,task-clock ls perf record -e '{cpu-clock,task-clock}' ls # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults) perf record -e '{cpu-clock,task-clock},{minor-faults,major-faults}' ls # 1 group (cpu-clock,task-clock,minor-faults,major-faults) perf record --group -e cpu-clock,task-clock -e minor-faults,major-faults ls perf record -e '{cpu-clock,task-clock,minor-faults,major-faults}' ls # 2 groups (cpu-clock,task-clock) (minor-faults,major-faults) perf record -e '{cpu-clock,task-clock} -e '{minor-faults,major-faults}' \ -e instructions ls # 1 group # (cpu-clock,task-clock,minor-faults,major-faults,instructions) perf record --group -e cpu-clock,task-clock \ -e minor-faults,major-faults -e instructions ls perf record -e '{cpu-clock,task-clock,minor-faults,major-faults,instructions}' ls It's possible to use standard event modifier for a group, which spans over all events in the group and updates each event modifier settings, for example: # perf record -r '{faults:k,cache-references}:p' resulting in ':kp' modifier being used for 'faults' and ':p' modifier being used for 'cache-references' event. Reviewed-by: Namhyung Kim <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Corey Ashford <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ulrich Drepper <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-06-20Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar1-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf improvements from Arnaldo Carvalho de Melo: * Replace event_name with perf_evsel__name, that handles the event modifiers and doesn't use static variables. * GTK browser improvements, from Namhyung Kim * Fix possible NULL pointer deref in the TUI annotate browser, from Samuel Liao * Add sort by source file:line number, using addr2line. * Allow printing histogram text snapshots at any point in top/report. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2012-06-19perf tools: Move all users of event_name to perf_evsel__nameArnaldo Carvalho de Melo1-6/+6
So that we don't use global variables that could make us misreport event names when having a multi window top, for instance. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-06-11perf stat: Fix default output fileStephane Eranian1-1/+7
The following commit: commit 56f3bae70638b33477a6015fd362ccfe354fd3ee Author: Jim Cromie <[email protected]> Date: Wed Sep 7 17:14:00 2011 -0600 perf stat: Add --log-fd <N> option to redirect stderr elsewhere introduced a bug in the way perf stat outputs the results by default, i.e., without the --log-fd or --output option. It would default to writing to file descriptor 0, i.e., stdin. Writing to stdin is allowed and is equivalent to writing to stdout. However, there is a major difference for any script that was already capturing the output of perf stat via redirection: perf stat >/tmp/log .... or perf stat 2>/tmp/log .... They would not capture anything anymore. They would have to do: perf stat 0>/tmp/log ... This breaks compatibility with existing scripts and does not look very natural. This patch fixes the problem by looking at output_fd only when it was modified by user (> 0). It also checks that the value if positive. Passing --log-fd 0 is ignored. I would also argue that defaulting to stderr for the results is not the right thing to do, though this patch does not address this specific issue. Signed-off-by: Stephane Eranian <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Jim Cromie <[email protected]> Link: http://lkml.kernel.org/r/20120515111111.GA9870@quad Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-30perf stat: Initialize default events wrt exclude_{guest,host}Arnaldo Carvalho de Melo1-4/+4
When no event is specified the tools use perf_evlist__add_default(), that will call event_attr_init to initialize the KVM exclusion bits. When the change was made to the tools so that by default guest samples would be excluded, the changes were made just to the parsing routines and to perf_evlist__add_default(), not to perf_evlist__add_attrs, that is used so far just by perf stat to add multiple events, according to the level of detail specified. Recently the tools were changed to reconstruct the event name from all the details in perf_event_attr, not just from .type and .config, but taking into account all the feature bits (.exclude_{guest,host,user,kernel,etc}, .precise_ip, etc). That is when we noticed that the default for perf stat wasn't the one for the rest of the tools, i.e. the .exclude_guest bit wasn't being set. I.e. the default, that doesn't call event_attr_init was showing the :HG modifier: $ perf stat usleep 1 Performance counter stats for 'usleep 1': 0.942119 task-clock # 0.454 CPUs utilized 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 K/sec 126 page-faults # 0.134 M/sec 693,193 cycles:HG # 0.736 GHz [40.11%] 407,461 stalled-cycles-frontend:HG # 58.78% frontend cycles idle [72.29%] 365,403 stalled-cycles-backend:HG # 52.71% backend cycles idle 465,982 instructions:HG # 0.67 insns per cycle # 0.87 stalled cycles per insn 89,760 branches:HG # 95.275 M/sec 6,178 branch-misses:HG # 6.88% of all branches 0.002077228 seconds time elapsed While if one explicitely specifies the same events, which will make the parsing code to be called and thus event_attr_init is called: $ perf stat -e task-clock,context-switches,migrations,page-faults,cycles,stalled-cycles-frontend,stalled-cycles-backend,instructions,branches,branch-misses usleep 1 Performance counter stats for 'usleep 1': 1.040349 task-clock # 0.500 CPUs utilized 2 context-switches # 0.002 M/sec 0 CPU-migrations # 0.000 K/sec 127 page-faults # 0.122 M/sec 587,966 cycles # 0.565 GHz [13.18%] 459,167 stalled-cycles-frontend # 78.09% frontend cycles idle 390,249 stalled-cycles-backend # 66.37% backend cycles idle 504,006 instructions # 0.86 insns per cycle # 0.91 stalled cycles per insn 96,455 branches # 92.714 M/sec 6,522 branch-misses # 6.76% of all branches [96.12%] 0.002078681 seconds time elapsed Fix it by introducing a perf_evlist__add_default_attrs method that will call evlist_attr_init in all the perf_event_attr entries before adding the events. Reported-by: Ingo Molnar <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-18Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo1-5/+30
Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit e7c72d8 perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-16perf target: Rename functions to avoid double negationNamhyung Kim1-7/+7
Rename perf_target__no_{cpu,task} to perf_target__has_{cpu,task} because it's more intuitive and easy to parse (for human beings) when used with negation. The names are came out from David Ahern. It is intended to be a mechanical substitution without any functional change. The perf_target__none remains unchanged since I couldn't find a right name and it is hardly used with negation. Signed-off-by: Namhyung Kim <[email protected]> Suggested-by: David Ahern <[email protected]> Suggested-by: Ingo Molnar <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-09perf stat: handle ENXIO error for perf_event_openDavid Ahern1-1/+6
perf stat on PPC currently fails to run: $ perf stat -- sleep 1 Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf stat -v -- sleep 1 cycles event is not supported by the kernel. stalled-cycles-frontend event is not supported by the kernel. stalled-cycles-backend event is not supported by the kernel. instructions event is not supported by the kernel. branches event is not supported by the kernel. branch-misses event is not supported by the kernel. ... Signed-off-by: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-09perf stat: handle ENXIO error for perf_event_openDavid Ahern1-1/+6
perf stat on PPC currently fails to run: $ perf stat -- sleep 1 Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf stat -v -- sleep 1 cycles event is not supported by the kernel. stalled-cycles-frontend event is not supported by the kernel. stalled-cycles-backend event is not supported by the kernel. instructions event is not supported by the kernel. branches event is not supported by the kernel. branch-misses event is not supported by the kernel. ... Signed-off-by: David Ahern <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-07perf stat: Use perf_evlist__create_mapsNamhyung Kim1-14/+8
Use same function with perf record and top to share the code checks combinations of different switches. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: David Ahern <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-07perf target: Consolidate target task/cpu checkingNamhyung Kim1-6/+6
There are places that check whether target task/cpu is given or not and some of them didn't check newly introduced uid or cpu list. Add and use three of helper functions to treat them properly. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: David Ahern <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-02perf tools: Introduce perf_target__validate() helperNamhyung Kim1-2/+1
The perf_target__validate function is used to check given PID/TID/UID/CPU target options and warn if some combination is impossible. Also this can make some arguments of parse_target_uid() function useless as it is checked before the call via our new helper. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: David Ahern <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-02perf stat: Convert to struct perf_targetNamhyung Kim1-25/+22
Use struct perf_target as it is introduced by previous patch. This is a preparation of further changes. Signed-off-by: Namhyung Kim <[email protected]> Reviewed-by: David Ahern <[email protected]> Cc: David Ahern <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-05-01perf stat: Fix case where guest/host monitoring is not supported by kernelStephane Eranian1-4/+29
By default, perf stat sets exclude_guest = 1. But when you run perf on a kernel which does not support host/guest filtering, then you get an error saying the event in unsupported. This comes from the fact that when the perf_event_attr struct passed by the user is larger than the one known to the kernel there is safety check which ensures that all unknown bits are zero. But here, exclude_guest is 1 (part of the unknown bits) and thus the perf_event_open() syscall return EINVAL. To my surprise, running perf record on the same kernel did not exhibit the problem. The reason is that perf record handles the problem by catching the error and retrying with guest/host excludes set to zero. For some reason, this was not done with perf stat. This patch fixes this problem. Signed-off-by: Stephane Eranian <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Robert Richter <[email protected]> Link: http://lkml.kernel.org/r/20120427124538.GA7230@quad Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-04-11perf stat: Declare some references staticRobert Richter1-13/+13
This references are not exported, use static declaration. Signed-off-by: Robert Richter <[email protected]> Cc: Ingo Molnar <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-03-16perf stat: Fix event grouping on forked taskNamhyung Kim1-1/+1
When event group is enabled for forked task (i.e. no target task was specified) all events were disabled and marked ->enable_on_exec. However they are not counted at all since only group leader will be enabled on exec actually. So the result looked like below: $ ./perf stat --group -- sleep 1 Performance counter stats for 'sleep 1': 0.554926 task-clock # 0.001 CPUs utilized <not counted> context-switches <not counted> CPU-migrations <not counted> page-faults <not counted> cycles <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend <not counted> instructions <not counted> branches <not counted> branch-misses 1.001228093 seconds time elapsed Fix it by disabling group leader only. Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-02-13perf tools: Allow multiple threads or processes in record, stat, topDavid Ahern1-15/+16
Allow a user to collect events for multiple threads or processes using a comma separated list. e.g., collect data on a VM and its vhost thread: perf top -p 21483,21485 perf stat -p 21483,21485 -ddd perf record -p 21483,21485 or monitoring vcpu threads perf top -t 21488,21489 perf stat -t 21488,21489 -ddd perf record -t 21488,21489 Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: David Ahern <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-02-06perf stat: Align scaled output of cpu-clockNamhyung Kim1-0/+2
The output of cpu-clock event is controlled in nsec_printout(), but its alignment was broken: Performance counter stats for 'sleep 1': 6,038,774 instructions # 0.00 insns per cycle 180 faults # 0.007 K/sec [99.95%] 1,282,201 branches # 0.053 M/sec [99.84%] 24126.221811 cpu-clock [99.62%] 24121.689540 task-clock # 24.098 CPUs utilized [99.52%] 1.001001017 seconds time elapsed This patch fixes this: Performance counter stats for 'sleep 1': 13,540,843 instructions # 0.00 insns per cycle 180 faults # 0.007 K/sec [99.94%] 2,875,386 branches # 0.119 M/sec [99.82%] 24144.221137 cpu-clock [99.61%] 24133.515366 task-clock # 24.109 CPUs utilized [99.52%] 1.001020946 seconds time elapsed Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-02-06perf stat: Adjust print unitNamhyung Kim1-1/+7
The default 'M/sec' unit is not useful if the result is small enough. Adjust it dynamically according to the value. Cc: Ingo Molnar <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-01-24perf tools: Introduce per user viewArnaldo Carvalho de Melo1-1/+1
The new --uid command line option will show only the tasks for a given user, using the proc interface to figure out the existing tasks. Kernel work is needed to close races at startup, but this should already be useful in many use cases. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2012-01-03perf stat: Introduce get_ratio_color() helperNamhyung Kim1-56/+35
The get_ratio_color() returns appropriate color string based on @ratio. It helps reducing code duplication. Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-12-20Merge commit 'v3.2-rc6' into perf/coreIngo Molnar1-1/+2
Merge reason: Update with the latest fixes. Signed-off-by: Ingo Molnar <[email protected]>
2011-12-05perf stat: Failure with "Operation not supported"Anton Blanchard1-1/+2
perf stat is failing on PowerPC: Error: open_counter returned with 95 (Operation not supported). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. commit 370faf1dd046 (perf stat: Fail softly on unsupported events) added a check for failure returning ENOENT, but the POWER backend returns EOPNOTSUPP. It looks like alpha, blackfin and mips do the same. With the patch applied, things work as expected: Performance counter stats for '/bin/true': 0.362176 task-clock # 0.623 CPUs utilized 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 28 page-faults # 0.077 M/sec 1,677,020 cycles # 4.630 GHz <not supported> stalled-cycles-frontend <not supported> stalled-cycles-backend 431,220 instructions # 0.26 insns per cycle 101,889 branches # 281.325 M/sec 4,145 branch-misses # 4.07% of all branches 0.000581361 seconds time elapsed Cc: <[email protected]> # 3.0+ Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20111202093833.5fef7226@kryten Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-11-29perf evlist: Always do automatic allocation of pollfd and mmap structuresArnaldo Carvalho de Melo1-2/+1
At first tools were required to do that, but while writing the python bindings to simplify the API I made them auto-allocate when needed. This just makes record, stat and top use that auto allocation, simplifying them a bit. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-11-28perf evlist: Introduce perf_evlist__add_attrsArnaldo Carvalho de Melo1-33/+7
Replacing the open coded equivalents in 'perf stat'. Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-10-26perf evlist: Fix grouping of multiple eventsArnaldo Carvalho de Melo1-6/+14
The __perf_evsel__open routing was grouping just the threads for that specific events per cpu when we want to group all threads in all events to the first fd opened on that cpu. So pass the xyarray with the first event, where the other events will be able to get that first per cpu fd. At some point top and record will switch to using perf_evlist__open that takes care of this detail and probably will also handle the fallback from hw to soft counters, etc. Reported-by: Deng-Cheng Zhu <[email protected]> Tested-by: Deng-Cheng Zhu <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-09-29perf tools: Make stat/record print fatal signals of the target programAndi Kleen1-0/+2
When a program crashes under perf there is no message about it, unlike when running it from bash. This can be confusing and lead to wrong actions during debugging. Print fatal signals in perf stat/record. Thanks to Furat Afram for finding the problem originally Link: http://lkml.kernel.org/r/[email protected] Cc: Frederic Weisbecker <[email protected]> Cc: Stephane Eranian <[email protected]> Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-09-29perf stat: Fix spelling in commentJim Cromie1-1/+1
Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-09-29perf stat: Allow tab as cvs delimiterJim Cromie1-2/+4
If option -x '\t' is given, convert '\t' to "\t". This makes cvs printing more flexible. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-09-29perf stat: Suppress printing std-dev when its 0Jim Cromie1-1/+1
For pretty output only (preserve column for cvs output), dont print std-deviation when its 0.00. Do this based upon value, instead of checking for --no-aggr, since the stats could conceivably be computed over the runs on each CPU, and theres no reason to preclude that. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-09-29perf stat: Fix +- nan% in --no-aggr runsJim Cromie1-2/+7
Without this patch, running: $ sudo ./perf stat -r20 --no-aggr -a perl -e '$i++ for 1..100000' I get computations like this: CPU0 12.488247 task-clock # 1.224 CPUs utilized ( +- -nan% ) CPU1 12.488909 task-clock # 1.225 CPUs utilized ( +- -nan% ) CPU2 12.500221 task-clock # 1.226 CPUs utilized ( +- -nan% ) CPU3 12.481713 task-clock # 1.224 CPUs utilized ( +- -nan% ) but with patch, I get: CPU0 8.233682 task-clock # 0.754 CPUs utilized ( +- 0.00% ) CPU1 8.226318 task-clock # 0.754 CPUs utilized ( +- 0.00% ) CPU2 8.210737 task-clock # 0.752 CPUs utilized ( +- 0.00% ) CPU3 8.201691 task-clock # 0.751 CPUs utilized ( +- 0.00% ) Note that without --no-aggr, I get non-0 statistics both before and after patch: 231.986022 task-clock # 4.030 CPUs utilized ( +- 0.97% ) 212 context-switches # 0.001 M/sec ( +- 12.07% ) 9 CPU-migrations # 0.000 M/sec ( +- 25.80% ) 466 page-faults # 0.002 M/sec ( +- 3.23% ) 174,318,593 cycles # 0.751 GHz ( +- 1.06% ) I couldnt see anything wrong in the caller, so fixed it in stddev_stats(). ISTM that 0.00 is better than nan, since perf stat was passed -A (--no-aggr) so no standard deviation should be expected, and nan is suggestive of a deeper error. When running with --no-aggr, perhaps we should suppress the statistics printing entirely, or do so when they are 0.00. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>