aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/builtin-report.c
AgeCommit message (Collapse)AuthorFilesLines
2009-06-25perf-report: Add bare minimum PERF_EVENT_READ parsingPeter Zijlstra1-0/+24
Provide the basic infrastructure to provide per task stats. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-25perf_counter: Rework the sample ABIPeter Zijlstra1-13/+19
The PERF_EVENT_READ implementation made me realize we don't actually need the sample_type int the output sample, since we already have that in the perf_counter_attr information. Therefore, remove the PERF_EVENT_MISC_OVERFLOW bit and the event->type overloading, and imply put counter overflow samples in a PERF_EVENT_SAMPLE type. This also fixes the issue that event->type was only 32-bit and sample_type had 64 usable bits. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-25perf_counter tools: Rework the file formatPeter Zijlstra1-9/+28
Create a structured file format that includes the full perf_counter_attr and all its relevant counter IDs so that the reporting program has full information. Signed-off-by: Peter Zijlstra <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-23perf_counter tools: Handle overlapping MMAP eventsPeter Zijlstra1-3/+21
Martin Schwidefsky reported "perf report" symbol resolution problems on S390. Since we only report MMAP, not MUNMAP, we have to deal with overlapping maps. We used to simply throw out the old map on the assumption whole maps got unmapped. This obviously doesn't deal with partial unmaps. However it appears some dynamic linkers do fancy partial unmaps (s390), so do something more elaborate and truncate the old maps, only removing them when they've been fully covered. This resolves (part of) the S390 symbol resolution problems. Reported-by: Martin Schwidefsky <[email protected]> Tested-by: Martin Schwidefsky <[email protected]> Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-22perf report: Output more symbol related debug dataPeter Zijlstra1-2/+3
Print more symbol relocation related info under -vv. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-20perfcounter: Handle some IO return valuesFrederic Weisbecker1-1/+4
Building perfcounter tools raises the following warnings: builtin-record.c: In function ‘atexit_header’: builtin-record.c:464: erreur: ignoring return value of ‘pwrite’, declared with attribute warn_unused_result builtin-record.c: In function ‘__cmd_record’: builtin-record.c:503: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result builtin-report.c: In function ‘__cmd_report’: builtin-report.c:1403: erreur: ignoring return value of ‘read’, declared with attribute warn_unused_result This patch handles these IO return values. Signed-off-by: Frederic Weisbecker <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Frederic Weisbecker <[email protected]> LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-19perf_counter tools: Define and use our own u64, s64 etc. definitionsPaul Mackerras1-42/+42
On 64-bit powerpc, __u64 is defined to be unsigned long rather than unsigned long long. This causes compiler warnings every time we print a __u64 value with %Lx. Rather than changing __u64, we define our own u64 to be unsigned long long on all architectures, and similarly s64 as signed long long. For consistency we also define u32, s32, u16, s16, u8 and s8. These definitions are put in a new header, types.h, because these definitions are needed in util/string.h and util/symbol.h. The main change here is the mechanical change of __[us]{64,32,16,8} to remove the "__". The other changes are: * Create types.h * Include types.h in perf.h, util/string.h and util/symbol.h * Add types.h to the LIB_H definition in Makefile * Added (u64) casts in process_overflow_event() and print_sym_table() to kill two remaining warnings. Signed-off-by: Paul Mackerras <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-19perf_counter tools: Add a data file headerPeter Zijlstra1-1/+15
Add a data file header so we can transfer data between record and report. LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-19perf_counter: Update userspace callchain sampling usesPeter Zijlstra1-47/+39
Update the tools to reflect the new callchain sampling format. LKML-Reference: <new-submission> Signed-off-by: Peter Zijlstra <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-18perf report: Filter to parent set by defaultIngo Molnar1-3/+27
Make it easier to use parent filtering - default to a filtered output. Also add the parent column so that we get collapsing but dont display it by default. add --no-exclude-other to override this. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-18perf_counter tools: Handle lost eventsPeter Zijlstra1-1/+28
Make use of the new ->data_tail mechanism to tell kernel-space about user-space draining the data stream. Emit lost events (and display them) if they happen. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-18perf_counter tools: Add and use isprint()Peter Zijlstra1-1/+1
Introduce isprint() to print out raw event dumps to ASCII, etc. (This is an extension to upstream Git's ctype.c.) Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> [ removed openssl.h inclusion from util.h - it leaked ctype.h ] Signed-off-by: Ingo Molnar <[email protected]>
2009-06-18perf report: Add validation of call-chain entriesIngo Molnar1-28/+46
Add boundary checks for call-chain events. In case of corrupted entries we could crash otherwise. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-18perf report: Tidy up the "--parent <regex>" and "--sort parent" call-chain ↵Ingo Molnar1-33/+34
features Instead of the ambigious 'call' naming use the much more specific 'parent' naming: - rename --call <regex> to --parent <regex> - rename --sort call to --sort parent - rename [unmatched] to [other] - to signal that this is not an error but the inverse set Also add pagefaults to the default parent-symbol pattern too, as it's a 'syscall overhead category' in a sense. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-17perf_counter tools: Replace isprint() with issane()Peter Zijlstra1-1/+1
The Git utils came with a ctype replacement that doesn't provide isprint(). Add a replacement. Solves a build bug on certain distros. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-17perf report: Add --sort <call> --call <$regex>Peter Zijlstra1-51/+158
Implement sorting by callchain symbols, --sort <call>. It will create a new column which will show a match to --call $regex or "[unmatched]". Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-15perf report: Fix 32-bit printf formatIngo Molnar1-1/+1
Yong Wang reported the following compiler warning: builtin-report.c: In function 'process_overflow_event': builtin-report.c:984: error: cast to pointer from integer of different size Which happens because we try to print ->ips[] out with a limited format, losing the high 32 bits. Print it out using %016Lx instead. Reported-by: Yong Wang <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-15perf report: Add per system call overhead histogramIngo Molnar1-0/+12
Take advantage of call-graph percounter sampling/recording to display a non-trivial histogram: the true, collapsed/summarized cost measurement, on a per system call total overhead basis: aldebaran:~/linux/linux/tools/perf> ./perf record -g -a -f ~/hackbench 10 aldebaran:~/linux/linux/tools/perf> ./perf report -s symbol --syscalls | head -10 # # (3536 samples) # # Overhead Symbol # ........ ...... # 40.75% [k] sys_write 40.21% [k] sys_read 4.44% [k] do_nmi ... This is done by accounting each (reliable) call-chain that chains back to a given system call to that system call function. [ So in the above example we can see that hackbench spends about 40% of its total time somewhere in sys_write() and 40% somewhere in sys_read(), the rest of the time is spent in user-space. The time is not spent in sys_write() _itself_ but in one of its many child functions. ] Or, a recording of a (source files are already in the page-cache) kernel build: $ perf record -g -m 512 -f -- make -j32 kernel $ perf report -s s --syscalls | grep '\[k\]' | grep -v nmi 4.14% [k] do_page_fault 1.20% [k] sys_write 1.10% [k] sys_open 0.63% [k] sys_exit_group 0.48% [k] smp_apic_timer_interrupt 0.37% [k] sys_read 0.37% [k] sys_execve 0.20% [k] sys_mmap 0.18% [k] sys_close 0.14% [k] sys_munmap 0.13% [k] sys_poll 0.09% [k] sys_newstat 0.07% [k] sys_clone 0.06% [k] sys_newfstat 0.05% [k] sys_access 0.05% [k] schedule Shows the true total cost of each syscall variant that gets used during a kernel build. This profile reveals it that pagefaults are the costliest, followed by read()/write(). An interesting detail: timer interrupts cost 0.5% - or 0.5 seconds per 100 seconds of kernel build-time. (this was done with HZ=1000) The summary is done in 'perf report', i.e. in the post-processing stage - so once we have a good call-graph recording, this type of non-trivial high-level analysis becomes possible. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: Pekka Enberg <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-14perf record/report: Add call graph / call chain profilingIngo Molnar1-12/+45
Add the first steps of call-graph profiling: - add the -c (--call-graph) option to perf record - parse the call-graph record and printout out under -D (--dump-trace) The call-graph data is not put into the histogram yet, but it can be seen that it's being processed correctly: 0x3ce0 [0x38]: event: 35 . . ... raw event: size 56 bytes . 0000: 23 00 00 00 05 00 38 00 d4 df 0e 81 ff ff ff ff #.....8........ . 0010: 60 0b 00 00 60 0b 00 00 03 00 00 00 01 00 02 00 `...`.......... . 0020: d4 df 0e 81 ff ff ff ff a0 61 ed 41 36 00 00 00 .........a.A6.. . 0030: 04 92 e6 41 36 00 00 00 .a.A6.. . 0x3ce0 [0x38]: PERF_EVENT (IP, 5): 2912: 0xffffffff810edfd4 period: 1 ... chain: u:2, k:1, nr:3 ..... 0: 0xffffffff810edfd4 ..... 1: 0x3641ed61a0 ..... 2: 0x3641e69204 ... thread: perf:2912 ...... dso: [kernel] This shows a 3-entry call-graph: with 1 kernel-space and two user-space entries Cc: Frederic Weisbecker <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Arjan van de Ven <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-14perf report: Print out raw events in hexaIngo Molnar1-1/+35
Print out events in hexa dump format, when -D is specified: 0x4868 [0x48]: event: 1 . . ... raw event: size 72 bytes . 0000: 01 00 00 00 00 00 48 00 d4 72 00 00 d4 72 00 00 ......H..r...r. . 0010: 00 00 40 f2 3e 00 00 00 00 30 01 00 00 00 00 00 ..@.>....0..... . 0020: 00 00 00 00 00 00 00 00 2f 75 73 72 2f 6c 69 62 ......../usr/li . 0030: 36 34 2f 6c 69 62 65 6c 66 2d 30 2e 31 34 31 2e 64/libelf-0.141 . 0040: 73 6f 00 00 00 00 00 00 f-0.141 . 0x4868 [0x48]: PERF_EVENT_MMAP 29396: [0x3ef2400000(0x13000) @ (nil)]: /usr/lib64/libelf-0.141.so This helps the debugging of mis-parsing of data files, and helps the addition of new sample/trace formats. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-11perf_counter tools: Clean up u64 usageIngo Molnar1-18/+18
A build error slipped in: builtin-report.c: In function ‘hist_entry__fprintf’: builtin-report.c:711: error: format ‘%12d’ expects type ‘int’, but argument 3 has type ‘uint64_t’ Because we got a bit sloppy with those types. uint64_t really sucks, because there's no printf format for it. So standardize on __u64 instead - for all types that go to or come from the ABI (which is __u64), or for values that need to be large enough even on 32-bit. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-11perf_counter tools: Normalize data using per sample period dataPeter Zijlstra1-7/+11
When we use variable period sampling, add the period to the sample data and use that to normalize the samples. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-10perf_counter tools: Small frequency related fixesPeter Zijlstra1-2/+4
Create the counter in a disabled state and only enable it after we mmap() the buffer, this allows us to see the first few samples (and observe the frequency ramp). Furthermore, print the period in the verbose report. Signed-off-by: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-08perf_counter tools: Standardize color printingIngo Molnar1-5/+8
The rule is: - high overhead: red - mid overhead: green - low overhead: normal (white/black) Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-08perf report: Add support for profiling JIT generated codePekka Enberg1-1/+14
This patch adds support for profiling JIT generated code to 'perf report'. A JIT compiler is required to generate a "/tmp/perf-$PID.map" symbols map that is parsed when looking and displaying symbols. Thanks to Peter Zijlstra for his help with this patch! Example "perf report" output with the Jato JIT: # # (40311 samples) # # Overhead Command Shared Object Symbol # ........ ................ ......................... ...... # 97.80% jato /tmp/perf-11915.map [.] Fibonacci.fib(I)I 0.56% jato 00000000b7fa023b 0x000000b7fa023b 0.45% jato /tmp/perf-11915.map [.] Fibonacci.main([Ljava/lang/String;)V 0.38% jato [kernel] [k] get_page_from_freelist 0.06% jato [kernel] [k] kunmap_atomic 0.05% jato ./jato [.] utf8Hash 0.04% jato ./jato [.] executeJava 0.04% jato ./jato [.] defineClass Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Pekka Enberg <[email protected]> Cc: [email protected] Cc: [email protected] LKML-Reference: <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-07perf report: Print more expressive message in case of file open errorIngo Molnar1-1/+4
Before: $ perf report failed to open file: No such file or directory After: $ perf report failed to open file: perf.data (try 'perf record' first) Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>
2009-06-06perf_counter tools: Move from Documentation/perf_counter/ to tools/perf/Ingo Molnar1-0/+1291
Several people have suggested that 'perf' has become a full-fledged tool that should be moved out of Documentation/. Move it to the (new) tools/ directory. Cc: Peter Zijlstra <[email protected]> Cc: Mike Galbraith <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <[email protected]>