aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2016-06-08Merge tag 'perf-core-for-mingo-20160607' of ↵Ingo Molnar12-746/+966
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Support cross unwinding, i.e. collecting '--call-graph dwarf' perf.data files in one machine and then doing analysis in another machine of a different hardware architecture. This enables, for instance, to do: perf record -a --call-graph dwarf on a x86-32 or aarch64 system and then do 'perf report' on it on a x86_64 workstation. (He Kuang) - Fix crash in build_id_cache__kallsyms_path(), recent regression (Wang Nan) Infrastructure changes: - Make tools/lib/bpf use the IS_ERR return facility consistently and also stop using the _get_ term for non-reference count methods (Arnaldo Carvalho de Melo) - 'perf config' refactorings (Taeung Song) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-06-08Merge tag 'perf-core-for-mingo-20160606' of ↵Ingo Molnar9-21/+213
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - Tooling support for TopDown counters, recently added to the kernel (Andi Kleen) - Show call graphs in 'perf script' when 1st event doesn't have it but some other has (He Kuang) - Fix terminal cleanup when handling invalid .perfconfig files in 'perf top' (Taeung Song) Build fixes: - Respect CROSS_COMPILE for the linker in libapi (Lucas Stach) Infrastructure changes: - Fix perf_evlist__alloc_mmap() failure path (Wang Nan) - Provide way to extract integer value from format_field (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-06-08Merge branch 'linus' into perf/core, to refresh the branchIngo Molnar5-8/+8
Signed-off-by: Ingo Molnar <[email protected]>
2016-06-07perf callchain: Support aarch64 cross-platformHe Kuang3-0/+40
Support aarch64 cross platform callchain unwind. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf callchain: Support x86 target platformHe Kuang3-2/+51
Support x86(32-bit) cross platform callchain unwind. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Introduce flag to separate local/remote unwind compilationHe Kuang1-0/+4
This is a preparation for including unwind-libunwind-local.c in other files for remote libunwind. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Change fixed name of libunwind__arch_reg_id to macroHe Kuang2-2/+5
For local libunwind, it uses the fixed methods to convert register id according to the host platform, but in remote libunwind, this convert function should be the one for remote architecture. This patch changes the fixed name to macro and code for each remote platform can be compiled indivadually. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Check the target platform before assigning unwind methodsHe Kuang3-5/+30
Currently, 'perf script' uses host unwind methods to parse perf.data callchain info without taking the target architecture into account, i.e. assuming the perf.data file was generated on the same machine where the analysis is being performed. So we get wrong result without any warnings when unwinding callchains of x86(32-bit) on x86(64-bit) machine. This patch adds an extra step that checks the target platform before assigning unwind methods. In later patches in this series, we can use this info to assign the right unwind methods for supported platforms. Committer note: After fixing it to register the local unwinder for live mode tools ('perf trace', 'perf top'), i.e. tools that don't use a perf.data file, it works as intended and passes the 'perf test unwind' test: # perf trace -e nanosleep --call dwarf usleep 1 0.328 ( 0.058 ms): usleep/11115 nanosleep(rqtp: 0x7fff083fa480) = 0 __nanosleep_nocancel+0x7 (/usr/lib64/libc-2.22.so) usleep+0x34 (/usr/lib64/libc-2.22.so) main+0x1eb (/usr/bin/usleep) __libc_start_main+0xf0 (/usr/lib64/libc-2.22.so) _start+0x29 (/usr/bin/usleep) # perf test 48 48: Test dwarf unwind : Ok # Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Fixed exit path for 'live' mode tools, where we need to default to local unwinding ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf tools: Extract common API out of unwind-libunwind-local.cHe Kuang3-34/+39
This patch extracts common unwind-libunwind APIs out of unwind-libunwind-local.c, this part will be used by both local and remote libunwind. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Rename unwind-libunwind.c to unwind-libunwind-local.cHe Kuang2-1/+1
Since unwind-libunwind.c contains code for specific arithecture, we change it's name to unwind-libunwind-local.c, and let it only be built if local libunwind is supported. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Move unwind__prepare_access from thread_new into thread__insert_mapHe Kuang3-7/+22
To determine the libunwind methods to use, we should get the 32bit/64bit information from maps of a thread. When a thread is newly created, the information is not prepared. This patch moves unwind__prepare_access() into thread__insert_map() so we can get the information we need from maps. Meanwhile, let thread__insert_map() return value and show messages on error. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Introduce 'struct unwind_libunwind_ops' for local unwindHe Kuang3-5/+61
Currently, libunwind operations are fixed, and they are chosen according to the host architecture. This will lead to a problem that if a thread is run as x86_32 on a x86_64 machine, perf will use libunwind methods for x86_64 to parse the callchain and get wrong results. This patch changes the fixed methods of libunwind operations to be thread/map related, and each thread can have individual libunwind operations. Local libunwind methods are registered as default value. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf unwind: Decouple thread->address_space on libunwindHe Kuang1-4/+1
Currently, the type of thread->addr_space is unw_addr_space_t, which is a pointer defined in libunwind headers. For local libunwind, we can simple include "libunwind.h", but for remote libunwind, the header file is depends on the target libunwind platform. This patch uses 'void *' instead to decouple the dependence on libunwind. Signed-off-by: He Kuang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Ekaterina Tumanova <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Kan Liang <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sukadev Bhattiprolu <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf config: Use new perf_config_set__init() to initialize config setTaeung Song1-1/+46
Instead of perf_config(), this function initializes config set by reading various files: user config ~/.perfconfig and system config $(sysconfdir)/perfconfig). If there are the same config variable in both user and system config files, user config has higher priority than system config. Signed-off-by: Taeung Song <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf config: Constructor should free its allocated memory when failingTaeung Song1-2/+4
Because of die() at perf_parse_file() a config set was freed in collect_config(), if failed. But it is natural to free a config set after collect_config() is done when some problems happened. So, in case of failure, lastly free a config set at perf_config_set__new() instead of freeing the config set in collect_config(). Signed-off-by: Taeung Song <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-07perf tools: Fix crash in build_id_cache__kallsyms_path()Wang Nan1-7/+4
build_id_cache__kallsyms_path() accepts a string buffer but also allocs a buffer using asnprintf. Unfortunately, the its only user passes it a stack-allocated buffer. Freeing it causes crashes like this: $ perf script *** Error in `/home/wangnan/perf': free(): invalid pointer: 0x00007fffffff9630 *** ======= Backtrace: ========= lib64/libc.so.6(+0x6eeef)[0x7ffff5dbaeef] lib64/libc.so.6(+0x78cae)[0x7ffff5dc4cae] lib64/libc.so.6(+0x79987)[0x7ffff5dc5987] /home/w00229757/perf(build_id_cache__kallsyms_path+0x6b)[0x49681b] /home/w00229757/perf[0x4bdd40] /home/w00229757/perf(dso__load+0xa3a)[0x4c048a] /home/w00229757/perf(map__load+0x6f)[0x4d561f] /home/w00229757/perf(thread__find_addr_map+0x235)[0x49e935] /home/w00229757/perf(machine__resolve+0x7d)[0x49ec6d] /home/w00229757/perf[0x4555a8] /home/w00229757/perf[0x4d9507] /home/w00229757/perf[0x4d9e80] /home/w00229757/perf(ordered_events__flush+0x354)[0x4dd444] /home/w00229757/perf(perf_session__process_events+0x3d0)[0x4dc140] /home/w00229757/perf(cmd_script+0x12b0)[0x4592e0] /home/w00229757/perf[0x4911f1] /home/w00229757/perf(main+0x68f)[0x4352ef] /lib64/libc.so.6(__libc_start_main+0xf5)[0x7ffff5d6dbd5] /home/w00229757/perf[0x435415] ======= Memory map: ======== This patch simplifies build_id_cache__kallsyms_path(), not even considering allocating a string buffer, so never frees anything. Its caller should manage memory allocation. Signed-off-by: Wang Nan <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Fixes: 01412261d994 ("perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildid") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Rename set_private() to set_priv()Arnaldo Carvalho de Melo1-3/+3
For consistency with class__priv() elsewhere, and with the callback typedef for clearing those areas (e.g. bpf_map_clear_priv_t). Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Make bpf_program__get_private() use IS_ERR()Arnaldo Carvalho de Melo1-15/+12
For consistency with bpf_map__priv() and elsewhere. Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Remove _get_ from non-refcount method namesArnaldo Carvalho de Melo1-2/+2
The use of this term is not warranted here, we use it in the kernel sources and in tools/ for refcounting, so, for consistency, rename them. Acked-bu: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Rename bpf_map__get_fd() to bpf_map__fd()Arnaldo Carvalho de Melo1-1/+1
For consistency, leaving "get" for reference counting. Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Use IS_ERR() reporting macros with bpf_map__get_def()Arnaldo Carvalho de Melo1-28/+24
And for consistency, rename it to bpf_map__def(), leaving "get" for reference counting. Also make it return a const pointer, as suggested by Wang. Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Rename bpf_map__get_name() to bpf_map__name()Arnaldo Carvalho de Melo1-12/+6
For consistency, leaving "get" for reference counting. Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06tools lib bpf: Use IS_ERR() reporting macros with bpf_map__get_private()Arnaldo Carvalho de Melo1-14/+9
To try to, over time, consistently use the IS_ERR() interface instead of using two return values, i.e. the integer return value for an error and the pointer address to return the bpf_map->priv pointer. Also rename it to bpf__priv(), to leave the "get" term for reference counting. Noticed while working on using BPF for collecting non-integer syscall argument payloads (struct sockaddr in calls such as connect(), for instance), where we need to use BPF maps and thus generalise bpf__setup_stdout() to connect bpf_output events with maps in a bpf proggie. Acked-by: Wang Nan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06perf config: Handle the error when config set is NULL at collect_config()Taeung Song1-1/+5
collect_config() collect all config key-value pairs from config files and put each config info in config set. But if config set (i.e. 'set' variable at collect_config()) is NULL, this is wrong so handle it. Signed-off-by: Taeung Song <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06perf config: Fix abnormal termination at perf_parse_file()Taeung Song1-9/+7
If a config file has wrong key-value pairs, the perf process will be forcibly terminated by die() at perf_parse_file() called by perf_config() so terminal settings can be crushed because of unusual termination. For example: If user config file has a wrong value 'red;default' instead of a normal value like 'red, default' for a key 'colors.top', # cat ~/.perfconfig [colors] medium = red;default # wrong value and if running sub-command 'top', # perf top perf process is dead by force and terminal setting is broken with a messge like below. Fatal: bad config file line 2 in /root/.perfconfig So fix it. If perf_config() can return on failure without calling die() at perf_parse_file(), this problem can be solved. And if a config file has wrong values, show the error message and then use default config values instead of wrong config values. Signed-off-by: Taeung Song <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06perf stat: Add computation of TopDown formulasAndi Kleen3-0/+172
Implement the TopDown formulas in 'perf stat'. The topdown basic metrics reported by the kernel are collected, and the formulas are computed and output as normal metrics. See the kernel commit exporting the events for details on the used metrics. Committer note: Output example: # perf stat --topdown -a usleep 1 Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound S0-C0 2 23.8% 11.6% 28.3% 36.3% S0-C1 2 16.2% 15.7% 36.5% 31.6% 0.000579956 seconds time elapsed # v2: Always print all metrics, only use thresholds for coloring. v3: Mark retiring over threshold green, not red. v4: Only print one decimal digit Fix color printing of one metric v5: Avoid printing -0.0 v6: Remove extra frontend event lookup Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-06perf stat: Basic support for TopDown in perf statAndi Kleen2-0/+8
Add basic plumbing for TopDown in perf stat TopDown is intended to replace the frontend cycles idle/ backend cycles idle metrics in standard perf stat output. These metrics are not reliable in many workloads, due to out of order effects. This implements a new --topdown mode in perf stat (similar to --transaction) that measures the pipe line bottlenecks using standardized formulas. The measurement can be all done with 5 counters (one fixed counter) The result are four metrics: FrontendBound, BackendBound, BadSpeculation, Retiring that describe the CPU pipeline behavior on a high level. The full top down methology has many hierarchical metrics. This implementation only supports level 1 which can be collected without multiplexing. A full implementation of top down on top of perf is available in pmu-tools toplev. (http://github.com/andikleen/pmu-tools) The current version works on Intel Core CPUs starting with Sandy Bridge, and Atom CPUs starting with Silvermont. In principle the generic metrics should be also implementable on other out of order CPUs. TopDown level 1 uses a set of abstracted metrics which are generic to out of order CPU cores (although some CPUs may not implement all of them): topdown-total-slots Available slots in the pipeline topdown-slots-issued Slots issued into the pipeline topdown-slots-retired Slots successfully retired topdown-fetch-bubbles Pipeline gaps in the frontend topdown-recovery-bubbles Pipeline gaps during recovery from misspeculation These metrics then allow to compute four useful metrics: FrontendBound, BackendBound, Retiring, BadSpeculation. Add a new --topdown options to enable events. When --topdown is specified set up events for all topdown events supported by the kernel. Add topdown-* as a special case to the event parser, as is needed for all events containing -. The actual code to compute the metrics is in follow-on patches. v2: Use standard sysctl read function. v3: Move x86 specific code to arch/ v4: Enable --metric-only implicitly for topdown. v5: Add --single-thread option to not force per core mode v6: Fix output order of topdown metrics v7: Allow combining with -d v8: Remove --single-thread again v9: Rename functions, adding arch_ and topdown_. v10: Expand man page and describe TopDown better Paste intro into commit description. Print error when malloc fails. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-03perf evlist: Fix alloc_mmap() failure pathWang Nan1-1/+4
If zalloc fail, setting evlist->mmap[i].fd is unsafe and perf_evlist__alloc_mmap() should bail out right after that. Signed-off-by: Wang Nan <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Fixes: d4c6fb36ac2c ("perf evsel: Record fd into perf_mmap") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-03perf evsel: Provide way to extract integer value from format_fieldArnaldo Carvalho de Melo2-10/+17
Out of perf_evsel__intval(), that requires passing the variable name, that will then be searched in the list of tracepoint variables for the given evsel. In cases such as syscall file descriptor ("fd") tracking, this is wasteful, we need just to use perf_evsel__field() and cache the format_field. Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-06-03tools/perf: Handle -EOPNOTSUPP for sampling eventsVineet Gupta1-0/+3
This allows (with a previous change to the perf error return ABI) for calling out in userspace the exact reason for perf record failing when PMU doesn't support overflow interrupts. Note that this needs to be put ahead of existing precise_ip check as that gets hit otherwise for the sampling fail case as well. Signed-off-by: Vineet Gupta <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Vineet Gupta <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2016-05-30perf buildid-cache: Use path/to/bin/buildid/elf instead of path/to/bin/buildidMasami Hiramatsu4-31/+96
Use path/to/bin/buildid/elf instead of path/to/bin/buildid to store corresponding elf binary. This also stores vdso in buildid/vdso, kallsyms in buildid/kallsyms. Note that the existing caches are not updated until user adds or updates the cache. Anyway, if there is the old style build-id cache it falls back to use it. (IOW, it is backward compatible) Signed-off-by: Masami Hiramatsu <[email protected]> Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20160528151537.16098.85815.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf symbols: Cleanup the code flow of dso__find_kallsymsMasami Hiramatsu1-23/+13
Cleanup the code flow of dso__find_kallsyms() to remove redundant checking code and add some comment for readability. Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20160528151522.16098.43446.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf symbols: Introduce filename__readable to check readabilityMasami Hiramatsu1-10/+22
Introduce filename__readable to check readability by opening the file directly. Since the access(R_OK) just checks the readability based on real UID/GID, it is ignored that the effective UID/GID and capabilities for some special file (e.g. /proc/kcore). filename__readable() directly opens given file with O_RDONLY so that the kernel checks it by effective UID/GID and capabilities. Signed-off-by: Masami Hiramatsu <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ananth N Mavinakayanahalli <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: Hemant Kumar <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20160528151513.16098.97576.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30tools: Pass arg to fdarray__filter's call back functionWang Nan1-2/+3
Before this patch there's no way to pass arguments to fdarray__filter's call back function. This improvement will be used by 'perf record' to support unmapping ring buffer for both main evlist and overwrite evlist. Without this patch there's no way to track overwrite evlist from 'struct fdarray'. Signed-off-by: Wang Nan <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf evlist: Choose correct reading direction according to evlist->backwardWang Nan2-1/+10
Now we have evlist->backward to indicate the mmap direction. Make perf_evlist__mmap_read() choose right direction automatically. Signed-off-by: Wang Nan <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf evlist: Check 'base' pointer before checking refcnt when put a mmapWang Nan1-2/+4
evlist->mmap[i]->refcnt could be 0 if an evlist has no evsel or if all evsels don't match the evlist during mmap. For example, when all evsels are overwritable but the evlist itself is normal. To avoid crashing, perf should check 'base' pointer before checking refcnt, and raise bug only when base is not NULL. Signed-off-by: Wang Nan <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] [ Renamed 'mmap' variable, it is reserved in old distros such as Ubuntu 12.04, breaking the build ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf evlist: Don't poll and mmap overwritable eventsWang Nan1-4/+19
There's no need to receive events from overwritable ring buffer. Instead, perf should make them run in background until some external event of interest takes place. This patch makes ignores normal events from overwrite evlists. Overwritable events must be mapped readonly and backward, so if evlist and evsel doesn't match (evsel->overwrite is true but either evlist is read/write or evlist is not backward, and vice versa), skip mapping it. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf tools: Per event max-stack settingsArnaldo Carvalho de Melo7-2/+29
The tooling counterpart, now it is possible to do: # perf record -e sched:sched_switch/max-stack=10/ -e cycles/call-graph=dwarf,max-stack=4/ -e cpu-cycles/call-graph=dwarf,max-stack=1024/ usleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.052 MB perf.data (5 samples) ] # perf evlist -v sched:sched_switch: type: 2, size: 112, config: 0x110, { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|RAW|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, sample_max_stack: 10 cycles/call-graph=dwarf,max-stack=4/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 4 cpu-cycles/call-graph=dwarf,max-stack=1024/: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|REGS_USER|STACK_USER|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, sample_regs_user: 0xff0fff, sample_stack_user: 8192, sample_max_stack: 1024 # Tip: use 'perf evlist --trace-fields' to show fields for tracepoint events Using just /max-stack=N/ means /call-graph=fp,max-stack=N/, that should be further configurable by means of some .perfconfig knob. Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Brendan Gregg <[email protected]> Cc: David Ahern <[email protected]> Cc: Frederic Weisbecker <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Milian Wolff <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Wang Nan <[email protected]> Cc: Zefan Li <[email protected]> Link: http://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-30perf thread: Adopt get_main_thread from db-export.cAndi Kleen3-12/+14
Move the get_main_thread function from db-export.c to thread.c so that it can be used elsewhere. Signed-off-by: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Removed leftover bits from db-export.h ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-27perf ctf: Convert invalid chars in a string before set valueWang Nan1-2/+39
We observed some crazy apps on Android set their comm to unprintable string. For example: # cat /proc/10607/task/*/comm tencent.qqmusic ... Binder_2 日志输出线 <-- Chinese word 'log output thread' WifiManager ... 'perf data convert' fails to convert perf.data with such string to CTF format. For example: # cat << EOF > ./badguy.c #include <sys/prctl.h> int main(int argc, char *argv[]) { prctl(PR_SET_NAME, "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf"); while(1) sleep(1); return 0; } EOF # gcc ./badguy.c # perf record -e sched:* ./a.out # perf data convert --to-ctf ./bad.ctf CTF stream 4 flush failed [ perf data convert: Converted 'perf.data' into CTF data './bad.ctf' ] [ perf data convert: Converted and wrote 0.008 MB (78 samples) ] # babeltrace ./bad.ctf/ [error] Packet size (18446744073709551615 bits) is larger than remaining file size (262144 bits). [error] Stream index creation error. [error] Open file stream error. [warning] [Context] Cannot open_trace of format ctf at path ./bad.ctf. [warning] [Context] cannot open trace "./bad.ctf" from ./bad.ctf/ for reading. [error] Cannot open any trace for reading. [error] opening trace "./bad.ctf/" for reading. [error] none of the specified trace paths could be opened. This patch converts unprintable characters to hexadecimal word. After applying this patch the above test works correctly: # ~/perf data convert --to-ctf ./good.ctf [ perf data convert: Converted 'perf.data' into CTF data './good.ctf' ] [ perf data convert: Converted and wrote 0.008 MB (78 samples) ] # babeltrace ./good.ctf .. [23:14:35.491665268] (+0.000001100) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5123, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 } [23:14:35.491666230] (+0.000000962) sched:sched_wakeup: { cpu_id = 4 }, { perf_ip = 0xFFFFFFFF810AEF33, perf_tid = 0, perf_pid = 0, perf_id = 5122, perf_period = 1, common_type = 270, common_flags = 45, common_preempt_count = 4, common_pid = 0, comm = "\xe6\x97\xa5\xe5\xbf\x97\xe8\xbe\x93\xe5\x87\xba\xe7\xba\xbf", pid = 1057, prio = 120, success = 1, target_cpu = 4 } .. Committer note: To build perf with libabeltrace, use: $ mkdir -p /tmp/build/perf $ make LIBBABELTRACE=1 LIBBABELTRACE_DIR=/usr/local O=/tmp/build/perf -C tools/perf install-bin Or equivalent (no O=, fixup LIBBABELTRACE_DIR, etc). Signed-off-by: Wang Nan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-27perf record: Fix crash when kptr is restrictedWang Nan1-0/+2
Before this patch, a simple 'perf record' could fail if kptr_restrict is set to 1 (for normal user) or 2 (for root): # perf record ls WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted, check /proc/sys/kernel/kptr_restrict. Samples in kernel functions may not be resolved if a suitable vmlinux file is not found in the buildid cache or in the vmlinux path. Samples in kernel modules won't be resolved at all. If some relocation was applied (e.g. kexec) symbols may be misresolved even with a suitable vmlinux or kallsyms file. Segmentation fault (core dumped) This patch skips perf_event__synthesize_kernel_mmap() when kptr is not available. Signed-off-by: Wang Nan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Fixes: 45e90056904b ("perf machine: Do not bail out if not managing to read ref reloc symbol") Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-27perf symbols: Check kptr_restrict for rootWang Nan1-8/+8
If kptr_restrict is set to 2, even root is not allowed to see pointers. This patch checks kptr_restrict even if euid == 0. For root, report error if kptr_restrict is 2. Signed-off-by: Wang Nan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-25Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds21-47/+240
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "Mostly tooling and PMU driver fixes, but also a number of late updates such as the reworking of the call-chain size limiting logic to make call-graph recording more robust, plus tooling side changes for the new 'backwards ring-buffer' extension to the perf ring-buffer" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits) perf record: Read from backward ring buffer perf record: Rename variable to make code clear perf record: Prevent reading invalid data in record__mmap_read perf evlist: Add API to pause/resume perf trace: Use the ptr->name beautifier as default for "filename" args perf trace: Use the fd->name beautifier as default for "fd" args perf report: Add srcline_from/to branch sort keys perf evsel: Record fd into perf_mmap perf evsel: Add overwrite attribute and check write_backward perf tools: Set buildid dir under symfs when --symfs is provided perf trace: Only auto set call-graph to "dwarf" when syscalls are being traced perf annotate: Sort list of recognised instructions perf annotate: Fix identification of ARM blt and bls instructions perf tools: Fix usage of max_stack sysctl perf callchain: Stop validating callchains by the max_stack sysctl perf trace: Fix exit_group() formatting perf top: Use machine->kptr_restrict_warned perf trace: Warn when trying to resolve kernel addresses with kptr_restrict=1 perf machine: Do not bail out if not managing to read ref reloc symbol perf/x86/intel/p4: Trival indentation fix, remove space ...
2016-05-23perf record: Read from backward ring bufferWang Nan2-0/+2
Introduce rb_find_range() to find start and end position from a backward ring buffer. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-23perf evlist: Add API to pause/resumeWang Nan2-0/+29
perf_evlist__toggle_{pause,resume}() are introduced to pause/resume events in an evlist. Utilize PERF_EVENT_IOC_PAUSE_OUTPUT ioctl. Following commits use them to ensure overwrite ring buffer is paused before reading. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> [ Return -1, like all other ioctl() usage in evlist.c, rename 'pause' arg to avoid breaking the build on ubuntu 12.04 and other old systems ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-23perf report: Add srcline_from/to branch sort keysAndi Kleen5-0/+99
Add "srcline_from" and "srcline_to" branch sort keys that allow to show the source lines of a branch. That makes it much easier to track down where particular branches happen in the program, for example to examine branch mispredictions, or to associate it with cycle counts: % perf record -b -e cycles:p ./tcall % perf report --sort srcline_from,srcline_to,mispredict ... 15.10% tcall.c:18 tcall.c:10 N 14.83% tcall.c:11 tcall.c:5 N 14.12% tcall.c:7 tcall.c:12 N 14.04% tcall.c:12 tcall.c:5 N 12.42% tcall.c:17 tcall.c:18 N 12.39% tcall.c:7 tcall.c:13 N 12.27% tcall.c:13 tcall.c:17 N ... % perf report --sort srcline_from,srcline_to,cycles ... 17.12% tcall.c:18 tcall.c:11 1 17.01% tcall.c:12 tcall.c:6 1 16.98% tcall.c:11 tcall.c:6 1 15.91% tcall.c:17 tcall.c:18 1 6.38% tcall.c:7 tcall.c:17 7 4.80% tcall.c:7 tcall.c:12 8 4.21% tcall.c:7 tcall.c:17 8 2.67% tcall.c:7 tcall.c:12 7 2.62% tcall.c:7 tcall.c:12 10 2.10% tcall.c:7 tcall.c:17 9 1.58% tcall.c:7 tcall.c:12 6 1.44% tcall.c:7 tcall.c:12 5 1.38% tcall.c:7 tcall.c:12 9 1.06% tcall.c:7 tcall.c:17 13 1.05% tcall.c:7 tcall.c:12 4 1.01% tcall.c:7 tcall.c:17 6 Open issues: - Some kernel symbols get misresolved. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-20perf evsel: Record fd into perf_mmapWang Nan2-0/+7
Add a fd field into struct perf_mmap so that perf can track the mmap fd. This feature will be used for toggling overwrite ring buffers. Signed-off-by: Wang Nan <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: He Kuang <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-20perf evsel: Add overwrite attribute and check write_backwardWang Nan2-0/+14
Add 'overwrite' attribute to evsel to mark whether this event is overwritable. The following commits will support syntax like: # perf record -e cycles/overwrite/ ... An overwritable evsel requires kernel support for the perf_event_attr.write_backward ring buffer feature. Add it to perf_missing_feature. Signed-off-by: Wang Nan <[email protected]> Cc: He Kuang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Zefan Li <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2016-05-20Merge tag 'perf-core-for-mingo-20160520' of ↵Ingo Molnar9-38/+64
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: User visible changes: - We should not use the current value of the kernel.perf_event_max_stack as the default value for --max-stack in tools that can process perf.data files, they will only match if that sysctl wasn't changed from its default value at the time the perf.data file was recorded, fix it. This fixes a bug where a 'perf record -a --call-graph dwarf ; perf report' produces a glibc invalid free backtrace (Arnaldo Carvalho de Melo) - Provide a better warning when running 'perf trace' on a system where the kernel.kptr_restrict is set to 1, similar to the one produced by 'perf record', noticed on ubuntu 16.04 where this is the default kptr_restrict setting. (Arnaldo Carvalho de Melo) - Fix ordering of instructions in the annotation code, noticed when annotating ARM binaries, now that table is auto-ordered at first use, to avoid more such problems (Chris Ryder) - Set buildid dir under symfs when --symfs is provided (He Kuang) - Fix the 'exit_group()' syscall output in 'perf trace' (Arnaldo Carvalho de Melo) - Only auto set call-graph to "dwarf" in 'perf trace' when syscalls are being traced (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2016-05-20Merge tag 'powerpc-4.7-1' of ↵Linus Torvalds1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...