blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2011-04-26	perf stat: Add stalled cycles accounting, prettify the resulting output	Ingo Molnar	1	-13/+30
	Add stalled cycles accounting and use it to print the "cycles stalled per instruction" value. Also change the unit of the cycles output from M/sec to GHz - this is more intuitive. Prettify the output to: Performance counter stats for './loop_1b_instructions': 239.775036 task-clock # 0.997 CPUs utilized 761,903,912 cycles # 3.178 GHz 356,620,620 stalled-cycles # 46.81% of all cycles are idle 1,001,578,351 instructions # 1.31 insns per cycle # 0.36 stalled cycles per insn 14,782 cache-references # 0.062 M/sec 5,694 cache-misses # 38.520 % of all cache refs 0.240493656 seconds time elapsed Also adjust the --repeat output to make the percentages align vertically: Performance counter stats for './loop_1b_instructions' (10 runs): 236.096793 task-clock # 0.997 CPUs utilized ( +- 0.011% ) 756,553,086 cycles # 3.204 GHz ( +- 0.002% ) 354,942,692 stalled-cycles # 46.92% of all cycles are idle ( +- 0.008% ) 1,001,389,700 instructions # 1.32 insns per cycle # 0.35 stalled cycles per insn ( +- 0.000% ) 10,166 cache-references # 0.043 M/sec ( +- 0.742% ) 468 cache-misses # 4.608 % of all cache refs ( +- 13.385% ) 0.236874136 seconds time elapsed ( +- 0.01% ) Acked-by: Peter Zijlstra <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-04-26	perf stat: Factor our shadow stats	Ingo Molnar	1	-14/+19
	Create update_shadow_stats() which is then used in both read_counter_aggr() and read_counter(). This not only simplifies the code but also fixes a bug: HW_CACHE_REFERENCES was not updated in read_counter(). Acked-by: Peter Zijlstra <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-04-26	perf stat: Print cache misses as percentage	Ingo Molnar	1	-3/+15
	Before: 113,393,041 cache-references # 83.636 M/sec 7,052,454 cache-misses # 5.202 M/sec After: 112,589,441 cache-references # 87.925 M/sec 6,556,354 cache-misses # 5.823 % misses/hits percentages are more expressive than absolute numbers or rates. (Also prettify the CPUs printout line to not have a trailing whitespace.) Acked-by: Peter Zijlstra <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2011-04-26	perf stat: Print stalled cycles percentage	Ingo Molnar	1	-2/+9
	Print: 611,527 cycles 400,553 instructions # ( 0.71 instructions per cycle ) 77,809 stalled-cycles # ( 12.71% of all cycles ) 0.000610987 seconds time elapsed Acked-by: Peter Zijlstra <[email protected]> Acked-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> Link: http://lkml.kernel.org/n/[email protected]
2011-04-15	perf evsel: Fix use of inherit	Arnaldo Carvalho de Melo	1	-3/+4
	perf stat doesn't mmap and its perfectly fine for it to use task-bound counters with inheritance. So set the attr.inherit on the caller and leave the syscall itself to validate it. When the mmap fails perf_evlist__mmap will just emit a warning if this is the failure reason. Reported-by: Peter Zijlstra <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-03-14	perf stat: Provide support for filters	Frederic Weisbecker	1	-0/+8
	Now the --filter option is usable with perf stat too. Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Frederic Weisbecker <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-02-16	perf tool: Add cgroup support	Stephane Eranian	1	-8/+32
	This patch adds the ability to filter monitoring based on container groups (cgroups) for both perf stat and perf record. It is possible to monitor multiple cgroup in parallel. There is one cgroup per event. The cgroups to monitor are passed via a new -G option followed by a comma separated list of cgroup names. The cgroup filesystem has to be mounted. Given a cgroup name, the perf tool finds the corresponding directory in the cgroup filesystem and opens it. It then passes that file descriptor to the kernel. Example: $ perf stat -B -a -e cycles:u,cycles:u,cycles:u -G test1,,test2 -- sleep 1 Performance counter stats for 'sleep 1': 2,368,667,414 cycles test1 2,369,661,459 cycles <not counted> cycles test2 1.001856890 seconds time elapsed Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2011-02-01	perf stat: Fix up resource release order	Arnaldo Carvalho de Melo	1	-2/+2
	That was causing a SEGV on selected old distros. Problem introduced in 7e2ed09. Reported-by: Peter Zijlstra <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-31	perf evlist: Store pointer to the cpu and thread maps	Arnaldo Carvalho de Melo	1	-23/+22
	So that we don't have to pass it around to the several methods that needs it, simplifying usage. There is one case where we don't have the thread/cpu map in advance, which is in the parsing routines used by top, stat, record, that we have to wait till all options are parsed to know if a cpu or thread list was passed to then create those maps. For that case consolidate the cpu and thread map creation via perf_evlist__create_maps() out of the code in top and record, while also providing a perf_evlist__set_maps() for cases where multiple evlists share maps or for when maps that represent CPU sockets, for instance, get crafted out of topology information or subsets of threads in a particular application are to be monitored, providing more granularity in specifying which cpus and threads to monitor. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-24	perf threads: Move thread_map to separate file	Arnaldo Carvalho de Melo	1	-0/+1
	To untangle it from struct thread handling, that is tied to symbols, etc. Right now in the python bindings I'm working on I need just a subset of the util/ files, untangling it allows me to do that. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-22	perf evsel: Allow specifying if the inherit bit should be set	Arnaldo Carvalho de Melo	1	-2/+2
	As this is a per-cpu attribute, we can't set it up in advance and use it for all the calls. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-22	perf evsel: Support event groups	Arnaldo Carvalho de Melo	1	-2/+2
	The perf_evsel__open now have an extra boolean argument specifying if event grouping is desired. The first file descriptor created on a CPU becomes the group leader. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-22	perf evsel: Introduce perf_evlist	Arnaldo Carvalho de Melo	1	-15/+19
	Killing two more perf wide global variables: nr_counters and evsel_list as a list_head. There are more operations that will need more fields in perf_evlist, like the pollfd for polling all the fds in a list of evsel instances. Use option->value to pass the evsel_list to parse_{events,filters}. LKML-Reference: <new-submission> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-22	perf tools: Fix 64 bit integer format strings	Arnaldo Carvalho de Melo	1	-2/+2
	Using %L[uxd] has issues in some architectures, like on ppc64. Fix it by making our 64 bit integers typedefs of stdint.h types and using PRI[ux]64 like, for instance, git does. Reported by Denis Kirjanov that provided a patch for one case, I went and changed all cases. Reported-by: Denis Kirjanov <[email protected]> Tested-by: Denis Kirjanov <[email protected]> LKML-Reference: <[email protected]> Cc: Denis Kirjanov <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Pingtian Han <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-11	perf evsel: Fix order of event list deletion	Arnaldo Carvalho de Melo	1	-0/+1
	We need to defer calling perf_evsel_list__delete() till after atexit registered routines, because we need to traverse the events being recorded at that time at least on 'perf record'. This fixes the problem reported by Thomas Renninger where cmd_record called by cmd_timechart would not write the tracing data to the perf.data file header because the evsel_list at atexit (control+C on 'perf timechart record') time would be empty, being already deleted by run_builtin(), and thus 'perf timechart' when trying to process such perf.data file would die with: "no trace data in the file" Problem introduced in 70d544d. Reported-by: Thomas Renninger <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Renninger <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-10	perf stat: better error message for unsupported events	David Ahern	1	-0/+2
	For unsupported events (e.g., H/W events when running in a VM) perf stat currently fails with the error message: Error: open_counter returned with 2 (No such file or directory). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. dmesg is of no help and it is not clear as to why it fails to open the counter. This patch changes the error message to Error: cache-misses event is not supported. Fatal: Not all events could be opened. Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: [email protected] LPU-Reference: <[email protected]> Signed-off-by: David Ahern <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-07	perf tools: Pass whole attr to event selectors	Lin Ming	1	-2/+1
	Since commit 69aad6f1(perf tools: Introduce event selectors), only perf_event_attr::type and ::config are passed to event selector, which makes perf tool not work correctly. For example, PEBS does not work because perf_event_attr::precise_ip is not passed to the syscall. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Lin Ming <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-04	perf evsel: Use {cpu,thread}_map to shorten list of parameters	Arnaldo Carvalho de Melo	1	-2/+2
	Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-04	perf tools: Refactor all_tids to hold nr and the map	Arnaldo Carvalho de Melo	1	-24/+17
	So that later, we can pass the thread_map instance instead of (thread_num, thread_map) for things like perf_evsel__open and friends, just like was done with cpu_map. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-04	perf tools: Refactor cpumap to hold nr and the map	Arnaldo Carvalho de Melo	1	-18/+18
	So that later, we can pass the cpu_map instance instead of (nr_cpus, cpu_map) for things like perf_evsel__open and friends. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-04	perf evsel: Introduce per cpu and per thread open helpers	Arnaldo Carvalho de Melo	1	-58/+26
	Abstracting away the loops needed to create the various event fd handlers. The users have to pass a confiruged perf->evsel.attr field, which is already usable after perf_evsel__new (constructor) time, using defaults. Comes out of the ad-hoc routines in builtin-stat, that now uses it. Fixed a small silly bug where we were die()ing before killing our children, dysfunctional family this one 8-) Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-04	perf evsel: Steal the counter reading routines from stat	Arnaldo Carvalho de Melo	1	-92/+29
	Making them hopefully generic enough to be used in 'perf test', well see. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-03	perf evsel: Delete the event selectors at exit	Arnaldo Carvalho de Melo	1	-3/+1
	Freeing all the possibly allocated resources, reducing complexity on each tool exit path. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-03	perf evsel: Adopt MATCH_EVENT macro from 'stat'	Arnaldo Carvalho de Melo	1	-21/+16
	Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2011-01-03	perf tools: Introduce event selectors	Arnaldo Carvalho de Melo	1	-68/+107
	Out of ad-hoc code and global arrays with hard coded sizes. This is the first step on having a library that will be first used on regression tests in the 'perf test' tool. [acme@felicio linux]$ size /tmp/perf.before text data bss dec hex filename 1273776 97384 5104416 6475576 62cf38 /tmp/perf.before [acme@felicio linux]$ size /tmp/perf.new text data bss dec hex filename 1275422 97416 1392416 2765254 2a31c6 /tmp/perf.new Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-12-01	perf stat: Add csv-style output	Stephane Eranian	1	-40/+104
	This patch adds an option (-x/--field-separator) to print counts using a CSV-style output. The user can pass a custom separator. This makes it very easy to import counts directly into your favorite spreadsheet without having to write scripts. Example: $ perf stat --field-separator=, -a -- sleep 1 4009.961740,task-clock-msecs 13,context-switches 2,CPU-migrations 189,page-faults 9596385684,cycles 3493659441,instructions 872897069,branches 41562,branch-misses 22424,cache-references 1289,cache-misses Works also in non-aggregated mode: $ perf stat -x , -a -A -- sleep 1 CPU0,1002.526168,task-clock-msecs CPU1,1002.528365,task-clock-msecs CPU2,1002.523360,task-clock-msecs CPU3,1002.519878,task-clock-msecs CPU0,1,context-switches CPU1,5,context-switches CPU2,5,context-switches CPU3,6,context-switches CPU0,0,CPU-migrations CPU1,1,CPU-migrations CPU2,0,CPU-migrations CPU3,1,CPU-migrations CPU0,2,page-faults CPU1,6,page-faults CPU2,9,page-faults CPU3,174,page-faults CPU0,2399439771,cycles CPU1,2380369063,cycles CPU2,2399142710,cycles CPU3,2373161192,cycles CPU0,872900618,instructions CPU1,873030960,instructions CPU2,872714525,instructions CPU3,874460580,instructions CPU0,221556839,branches CPU1,218134342,branches CPU2,218161730,branches CPU3,218284093,branches CPU0,18556,branch-misses CPU1,1449,branch-misses CPU2,3447,branch-misses CPU3,12714,branch-misses CPU0,8330,cache-references CPU1,313844,cache-references CPU2,47993728,cache-references CPU3,826481,cache-references CPU0,272,cache-misses CPU1,5360,cache-misses CPU2,1342193,cache-misses CPU3,13992,cache-misses This second version adds the ability to name a separator and uses field-separator as the long option to be consistent with perf report. Commiter note: Since we enabled --big-num by default in 201e0b0 and -x can't be used with it, we need to notice if the user explicitely enabled or disabled -B, add code to disable big_num if the user didn't explicitely set --big_num when -x is used. Cc: David S. Miller <[email protected]> Cc: Frederik Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: Peter Zijlstra <[email protected]> Cc: Robert Richter <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-12-01	perf stat: Use --big-num format by default	Arnaldo Carvalho de Melo	1	-1/+1
	[acme@mica linux]$ perf stat ls > /dev/null Performance counter stats for 'ls': 1.512532 task-clock-msecs # 0.801 CPUs 2 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 M/sec 241 page-faults # 0.159 M/sec 2,973,331 cycles # 1965.797 M/sec 1,460,802 instructions # 0.491 IPC 314,642 branches # 208.023 M/sec 18,475 branch-misses # 5.872 % <not counted> cache-references <not counted> cache-misses 0.001887676 seconds time elapsed To get the previous behaviour just use --no-big-num: [acme@mica linux]$ perf stat --no-big-num ls > /dev/null Performance counter stats for 'ls': 1.468014 task-clock-msecs # 0.795 CPUs 1 context-switches # 0.001 M/sec 0 CPU-migrations # 0.000 M/sec 241 page-faults # 0.164 M/sec 2900254 cycles # 1975.631 M/sec 1437991 instructions # 0.496 IPC 310905 branches # 211.786 M/sec 17912 branch-misses # 5.761 % <not counted> cache-references <not counted> cache-misses 0.001845435 seconds time elapsed [acme@mica linux]$ Suggested-by: Ingo Molnar <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Stephane Eranian <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-11-20	perf stat: Change and clean up sys_perf_event_open error handling	Corey Ashford	1	-14/+24
	This patch makes several changes to "perf stat": - "perf stat" will no longer go ahead and run the application when one or more of the specified events could not be opened. - Use error() and die() instead of pr_err() so that the output is more consistent with "perf top" and "perf record". - Handle permission errors in a more robust way, and in a similar way to "perf record" and "perf top". In addition, the sys_perf_event_open() error handling of "perf top" and "perf record" is made more consistent and adds the following phrase when an event doesn't open (with something ther than an access or permission error): "/bin/dmesg may provide additional information." This is added because kernel code doesn't have a good way of expressing detailed errors to user space, so its only avenue is to use printk's. However, many users may not think of looking at dmesg to find out why an event is being rejected. Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ian Munsie <[email protected]> Cc: Michael Ellerman <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Corey Ashford <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-11-19	perf stat: Add no-aggregation mode to -a	Stephane Eranian	1	-25/+144
	This patch adds a new -A option to perf stat. If specified then perf stat does not aggregate counts across all monitored CPUs in system-wide mode, i.e., when using -a. This option is not supported in per-thread mode. Being able to get a per-cpu breakdown is useful to detect imbalances between CPUs when running a uniform workload than spans all monitored CPUs. The second version corrects the missing cpumap[] support, so that it works when the -C option is used. The third version fixes a missing cpumap[] in print_counter() and removes a stray patch in builtin-trace.c. Examples on a 4-way system: # perf stat -a -e cycles,instructions -- sleep 1 Performance counter stats for 'sleep 1': 9592808135 cycles 3490380006 instructions # 0.364 IPC 1.001584632 seconds time elapsed # perf stat -a -A -e cycles,instructions -- sleep 1 Performance counter stats for 'sleep 1': CPU0 2398163767 cycles CPU1 2398180817 cycles CPU2 2398217115 cycles CPU3 2398247483 cycles CPU0 872282046 instructions # 0.364 IPC CPU1 873481776 instructions # 0.364 IPC CPU2 872638127 instructions # 0.364 IPC CPU3 872437789 instructions # 0.364 IPC 1.001556052 seconds time elapsed Cc: David S. Miller <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Robert Richter <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-06-05	perf tools: Add the ability to specify list of cpus to monitor	Stephane Eranian	1	-4/+10
	This patch adds a -C option to stat, record, top to designate a list of CPUs to monitor. CPUs can be specified as a comma-separated list or ranges, no space allowed. Examples: $ perf record -a -C0-1,4-7 sleep 1 $ perf top -C0-4 $ perf stat -a -C1,2,3,4 sleep 1 With perf record in per-thread mode with inherit mode on, samples are collected only when the thread runs on the designated CPUs. The -C option does not turn on system-wide mode automatically. Cc: David S. Miller <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-18	perf stat: add perf stat -B to pretty print large numbers	Stephane Eranian	1	-4/+14
	It is hard to read very large numbers so provide an option to perf stat to separate thousands using a separator. The patch leverages the locale support of stdio. You need to set your LC_NUMERIC appropriately, for instance LC_NUMERIC=en_US.UTF8. You need to pass -B to activate this feature. This way existing scripts parsing the output do not need to be changed. Here is an example. $ perf stat noploop 2 noploop for 2 seconds Performance counter stats for 'noploop 2': 1998.347031 task-clock-msecs # 0.998 CPUs 61 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 118 page-faults # 0.000 M/sec 4,138,410,900 cycles # 2070.917 M/sec (scaled from 70.01%) 2,062,650,268 instructions # 0.498 IPC (scaled from 70.01%) 2,057,653,466 branches # 1029.678 M/sec (scaled from 70.01%) 40,267 branch-misses # 0.002 % (scaled from 30.04%) 2,055,961,348 cache-references # 1028.831 M/sec (scaled from 30.03%) 53,725 cache-misses # 0.027 M/sec (scaled from 30.02%) 2.001393933 seconds time elapsed $ perf stat -B noploop 2 noploop for 2 seconds Performance counter stats for 'noploop 2': 1998.297883 task-clock-msecs # 0.998 CPUs 59 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 119 page-faults # 0.000 M/sec 4,131,380,160 cycles # 2067.450 M/sec (scaled from 70.01%) 2,059,096,507 instructions # 0.498 IPC (scaled from 70.01%) 2,054,681,303 branches # 1028.216 M/sec (scaled from 70.01%) 25,650 branch-misses # 0.001 % (scaled from 30.05%) 2,056,283,014 cache-references # 1029.017 M/sec (scaled from 30.03%) 47,097 cache-misses # 0.024 M/sec (scaled from 30.02%) 2.001391016 seconds time elapsed Cc: David S. Miller <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Tom Zanussi <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-05-13	perf tools: change event inheritance logic in stat and record	Stephane Eranian	1	-5/+5
	By default, event inheritance across fork and pthread_create was on but the -i option of stat and record, which enabled inheritance, led to believe it was off by default. This patch fixes this logic by inverting the meaning of the -i option. By default inheritance is on whether you attach to a process (-p), a thread (-t) or start a process. If you pass -i, then you turn off inheritance. Turning off inheritance if you don't need it, helps limit perf resource usage as well. The patch also fixes perf stat -t xxxx and perf record -t xxxx which did not start the counters. Acked-by: Frederic Weisbecker <[email protected]> Cc: David S. Miller <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Stephane Eranian <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2010-04-14	perf: Fix endianness argument compatibility with OPT_BOOLEAN() and introduce ↵	Ian Munsie	1	-5/+5
	OPT_INCR() Parsing an option from the command line with OPT_BOOLEAN on a bool data type would not work on a big-endian machine due to the manner in which the boolean was being cast into an int and incremented. For example, running 'perf probe --list' on a PowerPC machine would fail to properly set the list_events bool and would therefore print out the usage information and terminate. This patch makes OPT_BOOLEAN work as expected with a bool datatype. For cases where the original OPT_BOOLEAN was intentionally being used to increment an int each time it was passed in on the command line, this patch introduces OPT_INCR with the old behaviour of OPT_BOOLEAN (the verbose variable is currently the only such example of this). I have reviewed every use of OPT_BOOLEAN to verify that a true C99 bool was passed. Where integers were used, I verified that they were only being used for boolean logic and changed them to bools to ensure that they would not be mistakenly used as ints. The major exception was the verbose variable which now uses OPT_INCR instead of OPT_BOOLEAN. Signed-off-by: Ian Munsie <[email protected]> Acked-by: David S. Miller <[email protected]> Cc: <[email protected]> # NOTE: wont apply to .3[34].x cleanly, please backport Cc: Git development list <[email protected]> Cc: Ian Munsie <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: KOSAKI Motohiro <[email protected]> Cc: Hitoshi Mitake <[email protected]> Cc: Rusty Russell <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Eric B Munson <[email protected]> Cc: [email protected] Cc: WANG Cong <[email protected]> Cc: Thiago Farina <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Xiao Guangrong <[email protected]> Cc: Jaswinder Singh Rajput <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: OGAWA Hirofumi <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Tom Zanussi <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: John Kacur <[email protected]> Cc: Li Zefan <[email protected]> Cc: Steven Rostedt <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-22	perf stat: Better report failure to collect system wide stats	Arnaldo Carvalho de Melo	1	-11/+28
	Before: [acme@doppio linux-2.6-tip]$ perf stat -a sleep 1s Performance counter stats for 'sleep 1s': <not counted> task-clock-msecs <not counted> context-switches <not counted> CPU-migrations <not counted> page-faults <not counted> cycles <not counted> instructions <not counted> branches <not counted> branch-misses <not counted> cache-references <not counted> cache-misses 1.016998463 seconds time elapsed [acme@doppio linux-2.6-tip]$ Now: [acme@doppio linux-2.6-tip]$ perf stat -a sleep 1s No permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid. [acme@doppio linux-2.6-tip]$ Reported-by: Ingo Molnar <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Frédéric Weisbecker <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Paul Mackerras <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18	perf events: Change perf parameter --pid to process-wide collection instead ↵	Zhang, Yanmin	1	-35/+75
	of thread-wide Parameter --pid (or -p) of perf currently means a thread-wide collection. For exmaple, if a process whose id is 8888 has 10 threads, 'perf top -p 8888' just collects the main thread statistics. That's misleading. Users are used to attach a whole process when debugging a process by gdb. To follow normal usage style, the patch change --pid to process-wide collection and add --tid (-t) to mean a thread-wide collection. Usage example is: # perf top -p 8888 # perf record -p 8888 -f sleep 10 # perf stat -p 8888 -f sleep 10 Above commands collect the statistics of all threads of process 8888. Signed-off-by: Zhang Yanmin <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sheng Yang <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: [email protected] Cc: Zachary Amsden <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-18	perf stat: Enable counters when collecting process-wide or system-wide data	Zhang, Yanmin	1	-10/+14
	Command 'perf stat' doesn't enable counters when collecting an existing (by -p) process or system-wide statistics. Fix the issue. Change the condition of fork/exec subcommand. If there is a subcommand parameter, perf always forks/execs it. The usage example is: # perf stat -a sleep 10 So this command could collect statistics for 10 seconds precisely. User still could stop it by CTRL+C. Without the new capability, user could only use CTRL+C to stop it without precise time clock. Another issue is 'perf stat -a' consumes 100% time of a full single logical cpu. It has a bad impact on running workload. Fix it by adding a sleep(1) in the while(!done) loop in function run_perf_stat. Signed-off-by: Zhang Yanmin <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Avi Kivity <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sheng Yang <[email protected]> Cc: Marcelo Tosatti <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Jes Sorensen <[email protected]> Cc: Gleb Natapov <[email protected]> Cc: Zachary Amsden <[email protected]> Cc: <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-03-11	perf tools: Fix sparse CPU numbering related bugs	Paul Mackerras	1	-4/+6
	At present, the perf subcommands that do system-wide monitoring (perf stat, perf record and perf top) don't work properly unless the online cpus are numbered 0, 1, ..., N-1. These tools ask for the number of online cpus with sysconf(_SC_NPROCESSORS_ONLN) and then try to create events for cpus 0, 1, ..., N-1. This creates problems for systems where the online cpus are numbered sparsely. For example, a POWER6 system in single-threaded mode (i.e. only running 1 hardware thread per core) will have only even-numbered cpus online. This fixes the problem by reading the /sys/devices/system/cpu/online file to find out which cpus are online. The code that does that is in tools/perf/util/cpumap.[ch], and consists of a read_cpu_map() function that sets up a cpumap[] array and returns the number of online cpus. If /sys/devices/system/cpu/online can't be read or can't be parsed successfully, it falls back to using sysconf to ask how many cpus are online and sets up an identity map in cpumap[]. The perf record, perf stat and perf top code then calls read_cpu_map() in the system-wide monitoring case (instead of sysconf) and uses cpumap[] to get the cpu numbers to pass to perf_event_open. Signed-off-by: Paul Mackerras <[email protected]> Cc: Anton Blanchard <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2010-01-13	perf tools: Fix --pid option for stat	Liming Wang	1	-45/+61
	current pid option doesn't work for perf stat. Change it to what perf record --pid acts as. Signed-off-by: Liming Wang <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-11-15	perf stat: Do not print ratio when task-clock event is not counted	Lucas De Marchi	1	-2/+3
	The ratio between the number of events and the time elapsed makes sense only if task-clock event is counted. Otherwise it will be simply a (confusing) # 0.000 M/sec This patch outputs the ratio only if task-clock event is counted. Some test examples of before and after: Before: [lucas@skywalker linux.trees.git]$ sudo perf stat -e branch-misses -a -- sleep 1 Performance counter stats for 'sleep 1': 1367818 branch-misses # 0.000 M/sec 1.001494325 seconds time elapsed After (without task-clock): [lucas@skywalker perf]$ sudo ./perf stat -e branch-misses -a -- sleep 1 Performance counter stats for 'sleep 1': 1135044 branch-misses 1.001370775 seconds time elapsed After (with task-clock): [lucas@skywalker perf]$ sudo ./perf stat -e branch-misses -e task-clock -a -- sleep 1 Performance counter stats for 'sleep 1': 1070111 branch-misses # 0.534 M/sec 2002.730893 task-clock-msecs # 1.999 CPUs 1.001640292 seconds time elapsed Signed-off-by: Lucas De Marchi <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-10-19	perf stat: Count branches first	Ingo Molnar	1	-3/+3
	Count branches first, cache-misses second. The reason is that on x86 branches are not counted by all counters on all CPUs. Before: Performance counter stats for 'ls': 0.756653 task-clock-msecs # 0.802 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 250 page-faults # 0.330 M/sec 2375725 cycles # 3139.781 M/sec 1628129 instructions # 0.685 IPC 19643 cache-references # 25.960 M/sec 4608 cache-misses # 6.090 M/sec 342532 branches # 452.694 M/sec <not counted> branch-misses 0.000943356 seconds time elapsed After: Performance counter stats for 'ls': 1.056734 task-clock-msecs # 0.859 CPUs 0 context-switches # 0.000 M/sec 0 CPU-migrations # 0.000 M/sec 259 page-faults # 0.245 M/sec 3345932 cycles # 3166.295 M/sec 3074090 instructions # 0.919 IPC 616928 branches # 583.806 M/sec 39279 branch-misses # 6.367 % 21312 cache-references # 20.168 M/sec 3661 cache-misses # 3.464 M/sec 0.001230551 seconds time elapsed (also prettify the printout of branch misses, in case it's getting scaled.) Cc: Tim Blechmann <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> --- tools/perf/builtin-stat.c \| 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c373683..95a55ea 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = { { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, }; --- tools/perf/builtin-stat.c \| 20 ++++++++++---------- 1 files changed, 10 insertions(+), 10 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 95a55ea..90e0a26 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -50,17 +50,17 @@ static struct perf_event_attr default_attrs[] = { - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK }, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES}, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS }, - { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS }, - - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, - { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_TASK_CLOCK }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CONTEXT_SWITCHES }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_CPU_MIGRATIONS }, + { .type = PERF_TYPE_SOFTWARE, .config = PERF_COUNT_SW_PAGE_FAULTS }, + + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CPU_CYCLES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, };
2009-10-19	perf stat: Re-align the default_attrs[] array	Ingo Molnar	1	-11/+11
	Clean up the array definition to be vertically aligned. No functional effects. Cc: Tim Blechmann <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]> --- tools/perf/builtin-stat.c \| 2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index c373683..95a55ea 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -59,6 +59,8 @@ static struct perf_event_attr default_attrs[] = { { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_INSTRUCTIONS }, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_REFERENCES}, { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_CACHE_MISSES }, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS}, + { .type = PERF_TYPE_HARDWARE, .config = PERF_COUNT_HW_BRANCH_MISSES }, };
2009-10-19	perf stat: Add branch performance events to default output	Tim Blechmann	1	-0/+2
	Adds performance event information about branches and branch misses to the default output of perf stat. Signed-off-by: Tim Blechmann <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-10-19	perf stat: Add branch performance metric	Anton Blanchard	1	-0/+11
	When we count both branches and branch-misses it is useful to print out the percentage of branch-misses: # perf stat -e branches -e branch-misses /bin/true Performance counter stats for '/bin/true': 401684 branches # 0.000 M/sec 23301 branch-misses # 5.801 % Signed-off-by: Anton Blanchard <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <20091018112923.GQ4808@kryten> Signed-off-by: Ingo Molnar <[email protected]>
2009-10-04	perf: Propagate term signal to child	Chris Wilson	1	-1/+7
	If we launch the child on behalf of the user, ensure that it dies along with ourselves when we are interrupted. Signed-off-by: Chris Wilson <[email protected]> Cc: Chris Wilson <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-22	perf stat: Fix zero total printouts	Ingo Molnar	1	-4/+14
	Before: 0 sched:sched_switch # nan M/sec After: 0 sched:sched_switch # 0.000 M/sec Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-21	perf: Do the big rename: Performance Counters -> Performance Events	Ingo Molnar	1	-5/+5
	Bye-bye Performance Counters, welcome Performance Events! In the past few months the perfcounters subsystem has grown out its initial role of counting hardware events, and has become (and is becoming) a much broader generic event enumeration, reporting, logging, monitoring, analysis facility. Naming its core object 'perf_counter' and naming the subsystem 'perfcounters' has become more and more of a misnomer. With pending code like hw-breakpoints support the 'counter' name is less and less appropriate. All in one, we've decided to rename the subsystem to 'performance events' and to propagate this rename through all fields, variables and API names. (in an ABI compatible fashion) The word 'event' is also a bit shorter than 'counter' - which makes it slightly more convenient to write/handle as well. Thanks goes to Stephane Eranian who first observed this misnomer and suggested a rename. User-space tooling and ABI compatibility is not affected - this patch should be function-invariant. (Also, defconfigs were not touched to keep the size down.) This patch has been generated via the following script: FILES=$(find * -type f \| grep -vE 'oprofile\|[^K]config') sed -i \ -e 's/PERF_EVENT_/PERF_RECORD_/g' \ -e 's/PERF_COUNTER/PERF_EVENT/g' \ -e 's/perf_counter/perf_event/g' \ -e 's/nb_counters/nb_events/g' \ -e 's/swcounter/swevent/g' \ -e 's/tpcounter_event/tp_event/g' \ $FILES for N in $(find . -name perf_counter.[ch]); do M=$(echo $N \| sed 's/perf_counter/perf_event/g') mv $N $M done FILES=$(find . -name perf_event.*) sed -i \ -e 's/COUNTER_MASK/REG_MASK/g' \ -e 's/COUNTER/EVENT/g' \ -e 's/\<event\>/event_id/g' \ -e 's/counter/event/g' \ -e 's/Counter/Event/g' \ $FILES ... to keep it as correct as possible. This script can also be used by anyone who has pending perfcounters patches - it converts a Linux kernel tree over to the new naming. We tried to time this change to the point in time where the amount of pending patches is the smallest: the end of the merge window. Namespace clashes were fixed up in a preparatory patch - and some stylistic fallout will be fixed up in a subsequent patch. ( NOTE: 'counters' are still the proper terminology when we deal with hardware registers - and these sed scripts are a bit over-eager in renaming them. I've undone some of that, but in case there's something left where 'counter' would be better than 'event' we can undo that on an individual basis instead of touching an otherwise nicely automated patch. ) Suggested-by: Stephane Eranian <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Acked-by: Paul Mackerras <[email protected]> Reviewed-by: Arjan van de Ven <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: David Howells <[email protected]> Cc: Kyle McMartin <[email protected]> Cc: Martin Schwidefsky <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-04	perf stat: Clean up statistics calculations a bit more	Peter Zijlstra	1	-17/+14
	Remove some, now useless, global storage. Don't calculate the stddev when not needed. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-04	perf stat: More advanced variance computation	Peter Zijlstra	1	-12/+12
	Use the more advanced single pass variance algorithm outlined on the wikipedia page. This is numerically more stable for larger sample sets. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-04	perf stat: Use stddev_mean in stead of stddev	Peter Zijlstra	1	-6/+19
	When we're computing the mean by sampling the distribution, then the std dev of the mean is related to the std dev of the sample set by: stddev_mean = std_dev / sqrt(N) Which is exactly what we want. This results in the error on the mean decreasing with increasing number of samples. Also fix the scaled == -1, aka not counted case. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-09-04	perf stat: Remove the limit on repeat	Peter Zijlstra	1	-57/+27
	Since we don't need all the individual samples to calculate the error remove both the limit and the storage overhead associated with that. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>