aboutsummaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)AuthorFilesLines
2018-12-17perf trace: We need to consider "nr" if "__syscall_nr" is not thereArnaldo Carvalho de Melo1-1/+2
To cope with older kernels that don't have this patch backported: 026842d148b9 ("tracing/syscalls: Rename "/format" tracepoint field name "nr" to "__syscall_nr:") This makes 'perf trace' work again in RHEL7 kernels. Cc: Taeung Song <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools: Allow specifying proc-map-timeout in config fileMark Drayton14-50/+39
The default timeout of 500ms for parsing /proc/<pid>/maps files is too short for profiling many of our services. This can be overridden by passing --proc-map-timeout to the relevant command but it'd be nice to globally increase our default value. This patch permits setting a different default with the core.proc-map-timeout config file parameter. Signed-off-by: Mark Drayton <[email protected]> Acked-by: Song Liu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib subcmd: Fix a few source code comment typosIngo Molnar1-2/+2
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry-picking and/or backporting, split this into multiple patches. Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools: Fix diverse comment typosIngo Molnar11-12/+12
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry-picking and/or backporting, split this into multiple patches. Just typos in comments, no need to backport, reducing the possibility of possible backporting artifacts. Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf bpf-loader: Fix debugging message typoIngo Molnar1-1/+1
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry picking and/or backporting, split this into multiple patches. This one has information that is presented to the user, albeit in debug mode. Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools Documentation: Fix diverse typosIngo Molnar3-4/+4
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry picking and/or backporting, split this into multiple patches. In this particular case, it affects documentation, so may be interesting to cherry pick as it is information that is presented to the user. Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Fix diverse typos in commentsIngo Molnar2-7/+7
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry picking and/or backporting, split this into multiple patches. Signed-off-by: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Tzvetomir Stoyanov <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf vendor events intel: Fix diverse typosIngo Molnar12-36/+36
Go over the tools/ files that are maintained in Arnaldo's tree and fix common typos: half of them were in comments, the other half in JSON files. ( Care should be taken not to re-import these typos in the future, if the JSON files get updated by the vendor without fixing the typos. ) No change in functionality intended. Committer notes: This was split from a larger patch as there are code that is, additionally, maintained outside the kernel tree, so to ease cherry picking and/or backporting, split this into multiple patches. Signed-off-by: Ingo Molnar <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tests ARM: Disable breakpoint tests 32-bitFlorian Fainelli1-6/+14
The breakpoint tests on the ARM 32-bit kernel are broken in several ways. The breakpoint length requested does not necessarily match whether the function address has the Thumb bit (bit 0) set or not, and this does matter to the ARM kernel hw_breakpoint infrastructure. See [1] for background. [1]: https://lkml.org/lkml/2018/11/15/205 As Will indicated, the overflow handling would require single-stepping which is not supported at the moment. Just disable those tests for the ARM 32-bit platforms and update the comment above to explain these limitations. Co-developed-by: Will Deacon <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: Will Deacon <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf cs-etm: Support for ARM A32/T32 instruction sets in CoreSight traceRobert Walker4-39/+78
This patch adds support for generating instruction samples from trace of AArch32 programs using the A32 and T32 instruction sets. T32 has variable 2 or 4 byte instruction size, so the conversion between addresses and instruction counts requires extra information from the trace decoder, requiring version 0.10.0 of OpenCSD. A check for the OpenCSD library version has been added to the feature check for OpenCSD. Signed-off-by: Robert Walker <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf beauty mmap_flags: Fixed syntax error Fixed missing ']' errorSihyeon Jang1-2/+2
Committer testing: Before: # tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", tools/perf/trace/beauty/mmap_flags.sh: line 23: [: missing `]' [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", tools/perf/trace/beauty/mmap_flags.sh: line 28: [: missing `]' [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", [ilog2(0x8000) + 1] = "POPULATE", [ilog2(0x10000) + 1] = "NONBLOCK", [ilog2(0x20000) + 1] = "STACK", [ilog2(0x40000) + 1] = "HUGETLB", [ilog2(0x80000) + 1] = "SYNC", }; # After: $ tools/perf/trace/beauty/mmap_flags.sh static const char *mmap_flags[] = { [ilog2(0x40) + 1] = "32BIT", [ilog2(0x01) + 1] = "SHARED", [ilog2(0x02) + 1] = "PRIVATE", [ilog2(0x10) + 1] = "FIXED", [ilog2(0x20) + 1] = "ANONYMOUS", [ilog2(0x100000) + 1] = "FIXED_NOREPLACE", [ilog2(0x0100) + 1] = "GROWSDOWN", [ilog2(0x0800) + 1] = "DENYWRITE", [ilog2(0x1000) + 1] = "EXECUTABLE", [ilog2(0x2000) + 1] = "LOCKED", [ilog2(0x4000) + 1] = "NORESERVE", [ilog2(0x8000) + 1] = "POPULATE", [ilog2(0x10000) + 1] = "NONBLOCK", [ilog2(0x20000) + 1] = "STACK", [ilog2(0x40000) + 1] = "HUGETLB", [ilog2(0x80000) + 1] = "SYNC", }; $ Signed-off-by: Sihyeon Jang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Fixes: aca70cc40a0b ("perf beauty mmap_flags: Check if the arch has a mmap.h file") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: traceevent API cleanupTzvetomir Stoyanov4-24/+18
In order to make libtraceevent into a proper library, its API should be straightforward. This patch hides few API functions, intended for internal usage only: tep_free_event(), tep_free_format_field(), __tep_data2host2(), __tep_data2host4() and __tep_data2host8(). The patch also alignes the libtraceevent summary man page with these API changes. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools: traceevent API cleanup, remove __tep_data2host*()Tzvetomir Stoyanov1-2/+2
In order to make libtraceevent into a proper library, its API should be straightforward. The __tep_data2host*() functions are going to no longer be available as a libtraceevent API, tep_read_number() should be used instead. This patch replaces __tep_data2host*() usage with tep_read_number() in perf. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Rename tep_free_format() to tep_free_event()Tzvetomir Stoyanov2-4/+4
In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. This renames tep_free_format() to tep_free_event(), which describes more closely the purpose of the function. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent, perf tools: Rename 'struct tep_event_format' to ↵Tzvetomir Stoyanov20-198/+198
'struct tep_event' In order to make libtraceevent into a proper library, variables, data structures and functions require a unique prefix to prevent name space conflicts. This renames 'struct tep_event_format' to 'struct tep_event', which describes more closely the purpose of the struct. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> [ Fixup conflict with 6e33c250a88f ("tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c") ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Install trace-seq.h API header fileTzvetomir Stoyanov1-1/+2
This patch installs trace-seq.h header file on "make install". Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Added support for pkg-configTzvetomir Stoyanov2-3/+33
This patch implements integration with pkg-config framework. pkg-config can be used by the library users to determine required CFLAGS and LDFLAGS in order to use the library Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Implement new API tep_get_ref()Tzvetomir Stoyanov2-0/+8
This patch implements a new API of the tracevent library: int tep_get_ref(struct tep_handle *tep); The API returns the reference counter "ref_count" of the tep handler. As "struct tep_handle" is internal only, its members cannot be accessed by the library users, the API is used to get the reference counter. Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf report: Documentation average IPC and IPC coverageJin Yao1-0/+8
Add explanations for new columns "IPC" and "IPC coverage" in perf documentation. v5: --- Update the description according to Ingo's comments. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf report: Display average IPC and IPC coverage per symbolJin Yao4-3/+87
Support displaying the average IPC and IPC coverage per symbol in 'perf report' --tui and --stdio modes. For example, $ perf record -b ... $ perf report -s symbol Overhead Symbol IPC [IPC Coverage] 39.60% [.] __random 2.30 [ 54.8%] 18.02% [.] main 0.43 [ 54.3%] 14.21% [.] compute_flag 2.29 [100.0%] 14.16% [.] rand 0.36 [100.0%] 7.06% [.] __random_r 2.57 [ 70.5%] 6.85% [.] rand@plt 0.00 [ 0.0%] Jiri Olsa <[email protected]> provided the patch to support the --stdio mode. I merged Jiri's code in this patch. $ perf report -s symbol --stdio # Overhead Symbol IPC [IPC Coverage] # ........ ........................... .................... # 39.60% [.] __random 2.30 [ 54.8%] 18.02% [.] main 0.43 [ 54.3%] 14.21% [.] compute_flag 2.29 [100.0%] 14.16% [.] rand 0.36 [100.0%] 7.06% [.] __random_r 2.57 [ 70.5%] 6.85% [.] rand@plt 0.00 [ 0.0%] 0.02% [k] run_timer_softirq 1.60 [ 57.2%] The columns "IPC" and "[IPC Coverage]" are automatically enabled when the sort-key "symbol" is specified. If the perf.data file doesn't contain timed LBR information, columns are filled with "-". For example, # Overhead Symbol IPC [IPC Coverage] # ........ ........................... .................... # 46.57% [.] main - - 17.60% [.] rand - - 15.84% [.] __random_r - - 11.90% [.] __random - - 6.50% [.] compute_flag - - 1.59% [.] rand@plt - - 0.00% [.] _dl_relocate_object - - 0.00% [k] tlb_flush_mmu - - 0.00% [k] perf_event_mmap - - 0.00% [k] native_sched_clock - - 0.00% [k] intel_pmu_handle_irq_v4 - - 0.00% [k] native_write_msr - - v3: --- Removed the sortkey 'ipc' from command-line. The columns "IPC" and "[IPC Coverage]" are automatically enabled when "symbol" is specified. v2: --- Merge in Jiri's patch to support stdio mode Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf annotate: Create a annotate2 flag in struct symbolJin Yao2-0/+2
We often use the symbol__annotate2() to annotate a specified symbol. While annotating may take some time, so in order to avoid annotating the same symbol repeatedly, the patch creates a new flag to indicate the symbol has been annotated. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf annotate: Compute average IPC and IPC coverage per symbolJin Yao2-3/+43
Add support to 'perf report' annotate view or 'perf annotate --stdio2' to aggregate the IPC derived from timed LBRs per symbol. We compute the average IPC and the IPC coverage percentage. For example: $ perf annotate --stdio2 Percent IPC Cycle (Average IPC: 2.30, IPC Coverage: 54.8%) Disassembly of section .text: 000000000003aac0 <random@@GLIBC_2.2.5>: 8.32 3.28 sub $0x18,%rsp 3.28 mov $0x1,%esi 3.28 xor %eax,%eax 3.28 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0 11.57 3.28 1 ↓ je 20 lock cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 ↓ jne 29 ↓ jmp 43 11.57 1.10 20: cmpxchg %esi,__abort_msg@@GLIBC_PRIVATE+0x8a0 0.00 1.10 1 ↓ je 43 29: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi sub $0x80,%rsp → callq __lll_lock_wait_private add $0x80,%rsp 0.00 3.00 43: lea __ctype_b@GLIBC_2.2.5+0x38,%rdi 3.00 lea 0xc(%rsp),%rsi 8.49 3.00 1 → callq __random_r 7.91 1.94 cmpl $0x0,argp_program_version_hook@@GLIBC_2.2.5+0x1e0 0.00 1.94 1 ↓ je 68 lock decl __abort_msg@@GLIBC_PRIVATE+0x8a0 ↓ jne 70 ↓ jmp 8a 0.00 2.00 68: decl __abort_msg@@GLIBC_PRIVATE+0x8a0 21.56 2.00 1 ↓ je 8a 70: lea __abort_msg@@GLIBC_PRIVATE+0x8a0,%rdi sub $0x80,%rsp → callq __lll_unlock_wake_private add $0x80,%rsp 21.56 2.90 8a: movslq 0xc(%rsp),%rax 2.90 add $0x18,%rsp 9.03 2.90 1 ← retq It shows for this symbol the average IPC is 2.30 and the IPC coverage is 54.8%. Signed-off-by: Jin Yao <[email protected]> Reviewed-by: Ingo Molnar <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Add sanity check to is_timestamp_in_us()Tzvetomir Stoyanov1-1/+1
This patch adds a sanity check to is_timestamp_in_us() input parameter trace_clock. It avoids a potential segfault in this function for the case trace_clock is NULL. Reported-by: Slavomir Kaslev <[email protected]> Signed-off-by: Tzvetomir Stoyanov <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Steven Rostedt (VMware) <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf beauty mmap_flags: Check if the arch has a mmap.h fileArnaldo Carvalho de Melo2-3/+3
If not, then just use what is in asm-generic. This fixes the build for my sh4, m68k and riscv64 perf test build containers that were failing due to 80ee5668b8a7 ("perf beauty: Add a generator for MAP_ mmap's flag constants"), that were not covered in the cset introducing those tools/arch/*/include/uapi/asm/mman.h files. f3539c12d819 ("tools include: Add uapi mman.h for each architecture") Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 80ee5668b8a7 ("perf beauty: Add a generator for MAP_ mmap's flag constants") Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf record: Extend trace writing to multi AIOAlexey Budankov4-42/+102
Multi AIO trace writing allows caching more kernel data into userspace memory postponing trace writing for the sake of overall profiling data thruput increase. It could be seen as kernel data buffer extension into userspace memory. With an --aio option value different from 0 (default value is 1) the tool has capability to cache more and more data into user space along with delegating spill to AIO. That allows avoiding to suspend at record__aio_sync() between calls of record__mmap_read_evlist() and increases profiling data thruput at the cost of userspace memory. Signed-off-by: Alexey Budankov <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf record: Enable asynchronous trace writingAlexey Budankov7-9/+314
The trace file offset is read once before mmaps iterating loop and written back after all performance data is enqueued for aio writing. The trace file offset is incremented linearly after every successful aio write operation. record__aio_sync() blocks till completion of the started AIO operation and then proceeds. record__aio_mmap_read_sync() implements a barrier for all incomplete aio write requests. Signed-off-by: Alexey Budankov <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf mmap: Map data buffer for preserving collected dataAlexey Budankov3-3/+59
The map->data buffer is used to preserve map->base profiling data for writing to disk. AIO map->cblock is used to queue corresponding map->data buffer for asynchronous writing. Signed-off-by: Alexey Budankov <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools build feature: Check if libaio is availableAlexey Budankov6-4/+42
This will be used by 'perf record' to speed up reading the perf ring buffer. Committer testing: $ make -C tools/perf O=/tmp/build/perf make: Entering directory '/home/acme/git/perf/tools/perf' BUILD: Doing 'make -j8' parallel build Auto-detecting system features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ on ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] $ ls -la /tmp/build/perf/feature/test-libaio.* -rwxrwxr-x. 1 acme acme 18296 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.bin -rw-rw-r--. 1 acme acme 1165 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.d -rw-rw-r--. 1 acme acme 0 Nov 26 08:49 /tmp/build/perf/feature/test-libaio.make.output $ $ grep -i aio /tmp/build/perf/FEATURE-DUMP feature-libaio=1 $ Signed-off-by: Alexey Budankov <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf intel-pt: Fix error with config term "pt=0"Adrian Hunter1-0/+11
Users should never use 'pt=0', but if they do it may give a meaningless error: $ perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). Fix that by forcing 'pt=1'. Committer testing: # perf record -e intel_pt/pt=0/u uname Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u). /bin/dmesg | grep -i perf may provide additional information. # perf record -e intel_pt/pt=0/u uname pt=0 doesn't make sense, forcing pt=1 Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.020 MB perf.data ] # Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf top: Allow passing a kallsyms fileArnaldo Carvalho de Melo2-0/+5
This basically replicates what was done for 'perf report' in: b226a5a72901 ("perf report: Allow user to specify path to kallsyms file") This should help with resolving eBPF symbols, that are in kallsyms but, of course, not in vmlinux. Reported-by: Ivan Babrou <[email protected]> Tested-by: Ivan Babrou <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: David Ahern <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf bpf: Use ERR_CAST instead of ERR_PTR(PTR_ERR())Wen Yang1-1/+1
Use ERR_CAST inlined function instead of ERR_PTR(PTR_ERR(...)). This makes it more readable and also fix this warning detected by err_cast.cocci: tools/perf/util/bpf-loader.c:1606:11-18: WARNING: ERR_CAST can be used with op Signed-off-by: Wen Yang <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Julia Lawall <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wen Yang <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools include: Adopt ERR_CAST() from the kernel err.h headerArnaldo Carvalho de Melo1-0/+13
Add ERR_CAST(), so that tools can use it, just like the kernel. This addresses coccinelle checks that are being performed to tools/ in addition to kernel sources, so lets add this to cover that and to get tools code closer to kernel coding standards. This originally was introduced in the kernel headers in this cset: d1bc8e954452 ("Add an ERR_CAST() function to complement ERR_PTR and co.") Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: David Howells <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Julia Lawall <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Wang Nan <[email protected]> Cc: Wen Yang <[email protected]> Cc: [email protected] Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf test: Fix perf_event_attr test failureAdrian Hunter1-1/+1
Fix inconsistent use of tabs and spaces error: # perf test 16 -v 16: Setup struct perf_event_attr : --- start --- test child forked, pid 20224 File "/usr/libexec/perf-core/tests/attr.py", line 119 log.warning("expected %s=%s, got %s" % (t, self[t], other[t])) ^ TabError: inconsistent use of tabs and spaces in indentation test child finished with -1 ---- end ---- Setup struct perf_event_attr: FAILED! Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tests record: Allow for 'sleep' being 'coreutils'Adrian Hunter1-2/+5
If the 'sleep' command is provided by coreutils, then the "PERF_RECORD_* events & perf_sample fields" test will fail because the MMAP name is 'coreutils' not 'sleep', and there is an extra COMM event. Fix the test to detect that case. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.cAdrian Hunter1-5/+5
Fix following warnings: event-parse.c: In function ‘tep_find_event_by_name’: event-parse.c:3521:21: warning: ‘event’ may be used uninitialized in this function [-Wmaybe-uninitialized] pevent->last_event = event; ~~~~~~~~~~~~~~~~~~~^~~~~~~ CC ui/gtk/hists.o LINK plugin_mac80211.so CC nlattr.o event-parse.c: In function ‘tep_data_lat_fmt’: event-parse.c:5200:4: warning: ‘migrate_disable’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, "%d", migrate_disable); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:5207:4: warning: ‘lock_depth’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, "%d", lock_depth); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ LINK plugin_sched_switch.so LINK plugin_function.so LINK plugin_xen.so event-parse.c: In function ‘tep_event_info’: event-parse.c:5047:7: warning: ‘len_arg’ may be used uninitialized in this function [-Wmaybe-uninitialized] trace_seq_printf(s, format, len_arg, (char)val); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:4884:6: note: ‘len_arg’ was declared here int len_arg; ^~~~~~~ event-parse.c:4338:11: warning: ‘vsize’ may be used uninitialized in this function [-Wmaybe-uninitialized] val = tep_read_number(pevent, bptr, vsize); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ event-parse.c:4224:6: note: ‘vsize’ was declared here int vsize; ^~~~~ $ gcc --version gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502 Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Tzvetomir Stoyanov (VMware) <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf script: Use fallbacks for branch stacksAdrian Hunter2-14/+14
Branch stacks do not necessarily have the same cpumode as the 'ip'. Use the fallback functions in those cases. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf tools: Use fallback for sample_addr_correlates_sym() casesAdrian Hunter1-1/+1
thread__resolve() is used in the sample_addr_correlates_sym() cases where 'addr' is a destination of a branch which does not necessarily have the same cpumode as the 'ip'. Use the fallback function in that case. This patch depends on patch "perf tools: Add fallback functions for cases where cpumode is insufficient". Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf thread: Add fallback functions for cases where cpumode is insufficientAdrian Hunter4-0/+60
For branch stacks or branch samples, the sample cpumode might not be correct because it applies only to the sample 'ip' and not necessary to 'addr' or branch stack addresses. Add fallback functions that can be used to deal with those cases Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf machine: Record if a arch has a single user/kernel address spaceAdrian Hunter4-0/+16
Some architectures have a single address space for kernel and user addresses, which makes it possible to determine if an address is in kernel space or user space. Some don't, e.g.: sparc. Cache that info in perf_env so that, for instance, code needing to fallback failed symbol lookups at the kernel space in single address space arches can lookup at userspace. Signed-off-by: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: [email protected] # 4.19 Link: http://lkml.kernel.org/r/[email protected] [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf env: Also consider env->arch == NULL as local operationArnaldo Carvalho de Melo1-1/+1
We'll set a new machine field based on env->arch, which for live mode, like with 'perf top' means we need to use uname() to figure the name of the arch, fix perf_env__arch() to consider both (env == NULL) and (env->arch == NULL) as local operation. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: David S. Miller <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Cc: [email protected] # 4.19 Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf map: Remove extra indirection from map__find()Eric Saint-Etienne1-7/+6
A double pointer is used in map__find() where a single pointer is enough because the function doesn't affect the rbtree and the rbtree is locked. Signed-off-by: Eric Saint-Etienne <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Saint-Etienne <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf stat: Fix CSV mode column output for non-cgroup eventsStephane Eranian1-5/+11
When using the -x option, perf stat prints CSV-style output with one event per line. For each event, it prints the count, the unit, the event name, the cgroup, and a bunch of other event specific fields (such as insn per cycles). When you use CSV-style mode, you expect a normalized output where each event is printed with the same number of fields regardless of what it is so it can easily be imported into a spreadsheet or parsed. For instance, if an event does not have a unit, then print an empty field for it. Although this approach was implemented for the unit, it was not for the cgroup. When mixing cgroup and non-cgroup events, then non-cgroup events would not show an empty field, instead the next field was printed, make columns not line up correctly. This patch fixes the cgroup output issues by forcing an empty field for non-cgroup events as soon as one event has cgroup. Before: <not counted> @ @cycles @foo @ 0 @100.00@@ 2531614 @ @cycles @[email protected]@ @ foo cgroup lines up with time_running! After: <not counted> @ @cycles @foo @0 @100.00@@ 2594834 @ @cycles @ @5287372 @100.00@@ Fields line up. Signed-off-by: Stephane Eranian <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf stat: Fix shadow stats for clock eventsRavi Bangoria1-1/+2
Commit 0aa802a79469 ("perf stat: Get rid of extra clock display function") introduced scale and unit for clock events. Thus, perf_stat__update_shadow_stats() now saves scaled values of clock events in msecs, instead of original nsecs. But while calculating values of shadow stats we still consider clock event values in nsecs. This results in a wrong shadow stat values. Ex, # ./perf stat -e task-clock,cycles ls <SNIP> 2.60 msec task-clock:u # 0.877 CPUs utilized 2,430,564 cycles:u # 1215282.000 GHz Fix this by saving original nsec values for clock events in perf_stat__update_shadow_stats(). After patch: # ./perf stat -e task-clock,cycles ls <SNIP> 3.14 msec task-clock:u # 0.839 CPUs utilized 3,094,528 cycles:u # 0.985 GHz Suggested-by: Jiri Olsa <[email protected]> Reported-by: Anton Blanchard <[email protected]> Signed-off-by: Ravi Bangoria <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jin Yao <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Thomas Richter <[email protected]> Cc: [email protected] Fixes: 0aa802a79469 ("perf stat: Get rid of extra clock display function") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17perf build: Give better hint about devel package for libsslArnaldo Carvalho de Melo1-1/+1
In debian/ubuntu its libssl-dev, but for fedora/RHEL/Centos/etc its openssl-devel, fix it. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wang Nan <[email protected]> Fixes: 8ee4646038e4 ("perf build: Add libcrypto feature detection") Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2018-12-17Merge tag 'v4.20-rc7' into perf/core, to pick up fixesIngo Molnar12-47/+544
Signed-off-by: Ingo Molnar <[email protected]>
2018-12-17selftests: Fix test errors related to lib.mk khdr targetShuah Khan8-9/+13
Commit b2d35fa5fc80 ("selftests: add headers_install to lib.mk") added khdr target to run headers_install target from the main Makefile. The logic uses KSFT_KHDR_INSTALL and top_srcdir as controls to initialize variables and include files to run headers_install from the top level Makefile. There are a few problems with this logic. 1. Exposes top_srcdir to all tests 2. Common logic impacts all tests 3. Uses KSFT_KHDR_INSTALL, top_srcdir, and khdr in an adhoc way. Tests add "khdr" dependency in their Makefiles to TEST_PROGS_EXTENDED in some cases, and STATIC_LIBS in other cases. This makes this framework confusing to use. The common logic that runs for all tests even when KSFT_KHDR_INSTALL isn't defined by the test. top_srcdir is initialized to a default value when test doesn't initialize it. It works for all tests without a sub-dir structure and tests with sub-dir structure fail to build. e.g: make -C sparc64/drivers/ or make -C drivers/dma-buf ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory make: *** No rule to make target '../../../../scripts/subarch.include'. Stop. There is no reason to require all tests to define top_srcdir and there is no need to require tests to add khdr dependency using adhoc changes to TEST_* and other variables. Fix it with a consistent use of KSFT_KHDR_INSTALL and top_srcdir from tests that have the dependency on headers_install. Change common logic to include khdr target define and "all" target with dependency on khdr when KSFT_KHDR_INSTALL is defined. Only tests that have dependency on headers_install have to define just the KSFT_KHDR_INSTALL, and top_srcdir variables and there is no need to specify khdr dependency in the test Makefiles. Fixes: b2d35fa5fc80 ("selftests: add headers_install to lib.mk") Cc: [email protected] Signed-off-by: Shuah Khan <[email protected]> Reviewed-by: Khalid Aziz <[email protected]> Reviewed-by: Anders Roxell <[email protected]> Signed-off-by: Shuah Khan <[email protected]>
2018-12-16selftests: mlxsw: Add Bloom delta testNir Dotan1-1/+36
The eRP table is active when there is more than a single rule pattern. It may be that the patterns are close enough and use delta mechanism. Bloom filter index computation is based on the values of {rule & mask, mask ID, region ID} where the rule delta bits must be cleared. Add a test that exercises Bloom filter with delta mechanism. Configure rules within delta range and pass a packet which is supposed to hit the correct rule. Signed-off-by: Nir Dotan <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-16selftests: mlxsw: Add Bloom filter complex testNir Dotan1-1/+84
Bloom filter index computation is based on the values of {rule & mask, mask ID, region ID} and the computation also varies according to the region key size. Add a test that exercises the possible combinations by creating multiple chains using different key sizes and then pass a frame that is supposed to to produce a hit on all of the regions. Signed-off-by: Nir Dotan <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-16selftests: mlxsw: Add Bloom filter simple testNir Dotan1-1/+56
Add a test that exercises Bloom filter code. Activate eRP table in the region by adding multiple rule patterns which with very high probability use different entries in the Bloom filter. Then send packets in order to check lookup hits on all relevant rules. Signed-off-by: Nir Dotan <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-16selftests: net: rtnetlink.sh: add fdb get testRoopa Prabhu1-0/+53
tests the below three cases of bridge fdb get: [bridge, mac, vlan] [bridge_port, mac, vlan, flags=[NTF_MASTER]] [vxlandev, mac, flags=NTF_SELF] depends on iproute2 support for bridge fdb get. Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>