diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 |
commit | d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch) | |
tree | bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/builtin-report.c | |
parent | 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff) | |
parent | e4c1091cb495d9cbec8956d642644a71a1689958 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Clean up the x86 instruction decoder (Masami Hiramatsu)
- Add new uprobes optimization for PUSH instructions on x86 (Yonghong
Song)
- Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
- Fix misc bugs, update documentation, plus various cleanups (Jiri
Olsa)
There's a large number of tooling side improvements:
- Intel-PT/BTS improvements (Adrian Hunter)
- Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
- Introduce an errno code to string facility (Hendrik Brueckner)
- Various build system improvements (Jiri Olsa)
- Add support for CoreSight trace decoding by making the perf tools
use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
- Add ARM Statistical Profiling Extensions (SPE) support (Kim
Phillips)
- libtraceevent updates (Steven Rostedt)
- Intel vendor event JSON updates (Andi Kleen)
- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
- Add infrastructure to record first and last sample time to the
perf.data file header, so that when processing all samples in a
'perf record' session, such as when doing build-id processing, or
when specifically requesting that that info be recorded, use that
in 'perf report --time', that also got support for percent slices
in addition to absolute ones.
I.e. now it is possible to ask for the samples in the 10%-20% time
slice of a perf.data file (Jin Yao)
- Allow system wide 'perf stat --per-thread', sorting the result (Jin
Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- 'perf probe' improvements (Masami Hiramatsu)
- Improve PMU infrastructure to support amp64's ThunderX2
implementation defined core events (Ganapatrao Kulkarni)
- Various annotation related improvements and fixes (Thomas Richter)
- Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
code, removing the 'overwrite' parameter from several functions as
it was always used it as 'false' (Wang Nan)
- Fix/improve 'perf record' reverse recording support (Wang Nan)
- Improve command line options documentation (Sihyeon Jang)
- Optimize sample parsing for ordering events, where we don't need to
parse all the PERF_SAMPLE_ bits, just the ones leading to the
timestamp needed to reorder events (Jiri Olsa)
- Generalize the annotation code to support other source information
besides objdump/DWARF obtained ones, starting with python scripts,
that will is slated to be merged soon (Jiri Olsa)
- ... and a lot more that I failed to list, see the shortlog and
changelog for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf trace beauty flock: Move to separate object file
perf evlist: Remove fcntl.h from evlist.h
perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
perf trace: Do not print from time delta for interrupted syscall lines
perf trace: Add --print-sample
perf bpf: Remove misplaced __maybe_unused attribute
MAINTAINERS: Adding entry for CoreSight trace decoding
perf tools: Add mechanic to synthesise CoreSight trace packets
perf tools: Add full support for CoreSight trace decoding
pert tools: Add queue management functionality
perf tools: Add functionality to communicate with the openCSD decoder
perf tools: Add support for decoding CoreSight trace data
perf tools: Add decoder mechanic to support dumping trace data
perf tools: Add processing of coresight metadata
perf tools: Add initial entry point for decoder CoreSight traces
perf tools: Integrating the CoreSight decoding library
perf vendor events intel: Update IvyTown files to V20
perf vendor events intel: Update IvyBridge files to V20
perf vendor events intel: Update BroadwellDE events to V7
perf vendor events intel: Update SkylakeX events to V1.06
...
Diffstat (limited to 'tools/perf/builtin-report.c')
-rw-r--r-- | tools/perf/builtin-report.c | 283 |
1 files changed, 267 insertions, 16 deletions
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index af5dd038195e..42a52dcc41cd 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -15,6 +15,7 @@ #include "util/color.h" #include <linux/list.h> #include <linux/rbtree.h> +#include <linux/err.h> #include "util/symbol.h" #include "util/callchain.h" #include "util/values.h" @@ -51,6 +52,7 @@ #include <sys/types.h> #include <sys/stat.h> #include <unistd.h> +#include <linux/mman.h> struct report { struct perf_tool tool; @@ -60,6 +62,9 @@ struct report { bool show_threads; bool inverted_callchain; bool mem_mode; + bool stats_mode; + bool tasks_mode; + bool mmaps_mode; bool header; bool header_only; bool nonany_branch_mode; @@ -69,7 +74,9 @@ struct report { const char *cpu_list; const char *symbol_filter_str; const char *time_str; - struct perf_time_interval ptime; + struct perf_time_interval *ptime_range; + int range_size; + int range_num; float min_percent; u64 nr_entries; u64 queue_size; @@ -162,12 +169,28 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter, struct hist_entry *he = iter->he; struct report *rep = arg; struct branch_info *bi; + struct perf_sample *sample = iter->sample; + struct perf_evsel *evsel = iter->evsel; + int err; + + if (!ui__has_annotation()) + return 0; + + hist__account_cycles(sample->branch_stack, al, sample, + rep->nonany_branch_mode); bi = he->branch_info; + err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx); + if (err) + goto out; + + err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx); + branch_type_count(&rep->brtype_stat, &bi->flags, bi->from.addr, bi->to.addr); - return 0; +out: + return err; } static int process_sample_event(struct perf_tool *tool, @@ -186,8 +209,10 @@ static int process_sample_event(struct perf_tool *tool, }; int ret = 0; - if (perf_time__skip_sample(&rep->ptime, sample->time)) + if (perf_time__ranges_skip_sample(rep->ptime_range, rep->range_num, + sample->time)) { return 0; + } if (machine__resolve(machine, &al, sample) < 0) { pr_debug("problem processing %d event, skipping it.\n", @@ -312,9 +337,10 @@ static int report__setup_sample_type(struct report *rep) if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) { if ((sample_type & PERF_SAMPLE_REGS_USER) && - (sample_type & PERF_SAMPLE_STACK_USER)) + (sample_type & PERF_SAMPLE_STACK_USER)) { callchain_param.record_mode = CALLCHAIN_DWARF; - else if (sample_type & PERF_SAMPLE_BRANCH_STACK) + dwarf_callchain_users = true; + } else if (sample_type & PERF_SAMPLE_BRANCH_STACK) callchain_param.record_mode = CALLCHAIN_LBR; else callchain_param.record_mode = CALLCHAIN_FP; @@ -377,6 +403,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report if (evname != NULL) ret += fprintf(fp, " of event '%s'", evname); + if (rep->time_str) + ret += fprintf(fp, " (time slices: %s)", rep->time_str); + if (symbol_conf.show_ref_callgraph && strstr(evname, "call-graph=no")) { ret += fprintf(fp, ", show reference callgraph"); @@ -567,6 +596,174 @@ static void report__output_resort(struct report *rep) ui_progress__finish(); } +static void stats_setup(struct report *rep) +{ + memset(&rep->tool, 0, sizeof(rep->tool)); + rep->tool.no_warn = true; +} + +static int stats_print(struct report *rep) +{ + struct perf_session *session = rep->session; + + perf_session__fprintf_nr_events(session, stdout); + return 0; +} + +static void tasks_setup(struct report *rep) +{ + memset(&rep->tool, 0, sizeof(rep->tool)); + if (rep->mmaps_mode) { + rep->tool.mmap = perf_event__process_mmap; + rep->tool.mmap2 = perf_event__process_mmap2; + } + rep->tool.comm = perf_event__process_comm; + rep->tool.exit = perf_event__process_exit; + rep->tool.fork = perf_event__process_fork; + rep->tool.no_warn = true; +} + +struct task { + struct thread *thread; + struct list_head list; + struct list_head children; +}; + +static struct task *tasks_list(struct task *task, struct machine *machine) +{ + struct thread *parent_thread, *thread = task->thread; + struct task *parent_task; + + /* Already listed. */ + if (!list_empty(&task->list)) + return NULL; + + /* Last one in the chain. */ + if (thread->ppid == -1) + return task; + + parent_thread = machine__find_thread(machine, -1, thread->ppid); + if (!parent_thread) + return ERR_PTR(-ENOENT); + + parent_task = thread__priv(parent_thread); + list_add_tail(&task->list, &parent_task->children); + return tasks_list(parent_task, machine); +} + +static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp) +{ + size_t printed = 0; + struct rb_node *nd; + + for (nd = rb_first(&maps->entries); nd; nd = rb_next(nd)) { + struct map *map = rb_entry(nd, struct map, rb_node); + + printed += fprintf(fp, "%*s %" PRIx64 "-%" PRIx64 " %c%c%c%c %08" PRIx64 " %" PRIu64 " %s\n", + indent, "", map->start, map->end, + map->prot & PROT_READ ? 'r' : '-', + map->prot & PROT_WRITE ? 'w' : '-', + map->prot & PROT_EXEC ? 'x' : '-', + map->flags & MAP_SHARED ? 's' : 'p', + map->pgoff, + map->ino, map->dso->name); + } + + return printed; +} + +static int map_groups__fprintf_task(struct map_groups *mg, int indent, FILE *fp) +{ + int printed = 0, i; + for (i = 0; i < MAP__NR_TYPES; ++i) + printed += maps__fprintf_task(&mg->maps[i], indent, fp); + return printed; +} + +static void task__print_level(struct task *task, FILE *fp, int level) +{ + struct thread *thread = task->thread; + struct task *child; + int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", + thread->pid_, thread->tid, thread->ppid, + level, ""); + + fprintf(fp, "%s\n", thread__comm_str(thread)); + + map_groups__fprintf_task(thread->mg, comm_indent, fp); + + if (!list_empty(&task->children)) { + list_for_each_entry(child, &task->children, list) + task__print_level(child, fp, level + 1); + } +} + +static int tasks_print(struct report *rep, FILE *fp) +{ + struct perf_session *session = rep->session; + struct machine *machine = &session->machines.host; + struct task *tasks, *task; + unsigned int nr = 0, itask = 0, i; + struct rb_node *nd; + LIST_HEAD(list); + + /* + * No locking needed while accessing machine->threads, + * because --tasks is single threaded command. + */ + + /* Count all the threads. */ + for (i = 0; i < THREADS__TABLE_SIZE; i++) + nr += machine->threads[i].nr; + + tasks = malloc(sizeof(*tasks) * nr); + if (!tasks) + return -ENOMEM; + + for (i = 0; i < THREADS__TABLE_SIZE; i++) { + struct threads *threads = &machine->threads[i]; + + for (nd = rb_first(&threads->entries); nd; nd = rb_next(nd)) { + task = tasks + itask++; + + task->thread = rb_entry(nd, struct thread, rb_node); + INIT_LIST_HEAD(&task->children); + INIT_LIST_HEAD(&task->list); + thread__set_priv(task->thread, task); + } + } + + /* + * Iterate every task down to the unprocessed parent + * and link all in task children list. Task with no + * parent is added into 'list'. + */ + for (itask = 0; itask < nr; itask++) { + task = tasks + itask; + + if (!list_empty(&task->list)) + continue; + + task = tasks_list(task, machine); + if (IS_ERR(task)) { + pr_err("Error: failed to process tasks\n"); + free(tasks); + return PTR_ERR(task); + } + + if (task) + list_add_tail(&task->list, &list); + } + + fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); + + list_for_each_entry(task, &list, list) + task__print_level(task, fp, 0); + + free(tasks); + return 0; +} + static int __cmd_report(struct report *rep) { int ret; @@ -598,12 +795,24 @@ static int __cmd_report(struct report *rep) return ret; } + if (rep->stats_mode) + stats_setup(rep); + + if (rep->tasks_mode) + tasks_setup(rep); + ret = perf_session__process_events(session); if (ret) { ui__error("failed to process sample\n"); return ret; } + if (rep->stats_mode) + return stats_print(rep); + + if (rep->tasks_mode) + return tasks_print(rep, stdout); + report__warn_kptr_restrict(rep); evlist__for_each_entry(session->evlist, pos) @@ -760,6 +969,9 @@ int cmd_report(int argc, const char **argv) OPT_BOOLEAN('q', "quiet", &quiet, "Do not show any message"), OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace, "dump raw trace in ASCII"), + OPT_BOOLEAN(0, "stats", &report.stats_mode, "Display event stats"), + OPT_BOOLEAN(0, "tasks", &report.tasks_mode, "Display recorded tasks"), + OPT_BOOLEAN(0, "mmaps", &report.mmaps_mode, "Display recorded tasks memory maps"), OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name, "file", "vmlinux pathname"), OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name, @@ -907,6 +1119,9 @@ int cmd_report(int argc, const char **argv) report.symbol_filter_str = argv[0]; } + if (report.mmaps_mode) + report.tasks_mode = true; + if (quiet) perf_quiet_option(); @@ -921,13 +1136,6 @@ int cmd_report(int argc, const char **argv) return -EINVAL; } - if (report.use_stdio) - use_browser = 0; - else if (report.use_tui) - use_browser = 1; - else if (report.use_gtk) - use_browser = 2; - if (report.inverted_callchain) callchain_param.order = ORDER_CALLER; if (symbol_conf.cumulate_callchain && !callchain_param.order_set) @@ -1014,6 +1222,13 @@ repeat: perf_hpp_list.need_collapse = true; } + if (report.use_stdio) + use_browser = 0; + else if (report.use_tui) + use_browser = 1; + else if (report.use_gtk) + use_browser = 2; + /* Force tty output for header output and per-thread stat. */ if (report.header || report.header_only || report.show_threads) use_browser = 0; @@ -1021,6 +1236,12 @@ repeat: report.tool.show_feat_hdr = SHOW_FEAT_HEADER; if (report.show_full_info) report.tool.show_feat_hdr = SHOW_FEAT_HEADER_FULL_INFO; + if (report.stats_mode || report.tasks_mode) + use_browser = 0; + if (report.stats_mode && report.tasks_mode) { + pr_err("Error: --tasks and --mmaps can't be used together with --stats\n"); + goto error; + } if (strcmp(input_name, "-") != 0) setup_browser(true); @@ -1043,7 +1264,8 @@ repeat: ret = 0; goto error; } - } else if (use_browser == 0 && !quiet) { + } else if (use_browser == 0 && !quiet && + !report.stats_mode && !report.tasks_mode) { fputs("# To display the perf.data header info, please use --header/--header-only options.\n#\n", stdout); } @@ -1077,9 +1299,36 @@ repeat: if (symbol__init(&session->header.env) < 0) goto error; - if (perf_time__parse_str(&report.ptime, report.time_str) != 0) { - pr_err("Invalid time string\n"); - return -EINVAL; + report.ptime_range = perf_time__range_alloc(report.time_str, + &report.range_size); + if (!report.ptime_range) { + ret = -ENOMEM; + goto error; + } + + if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) { + if (session->evlist->first_sample_time == 0 && + session->evlist->last_sample_time == 0) { + pr_err("HINT: no first/last sample time found in perf data.\n" + "Please use latest perf binary to execute 'perf record'\n" + "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n"); + ret = -EINVAL; + goto error; + } + + report.range_num = perf_time__percent_parse_str( + report.ptime_range, report.range_size, + report.time_str, + session->evlist->first_sample_time, + session->evlist->last_sample_time); + + if (report.range_num < 0) { + pr_err("Invalid time string\n"); + ret = -EINVAL; + goto error; + } + } else { + report.range_num = 1; } sort__setup_elide(stdout); @@ -1092,6 +1341,8 @@ repeat: ret = 0; error: + zfree(&report.ptime_range); + perf_session__delete(session); return ret; } |