diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2018-01-30 11:15:14 -0800 |
commit | d8b91dde38f4c43bd0bbbf17a90f735b16aaff2c (patch) | |
tree | bd72dabf6e4b23e060fce429c87e60504f69de54 /tools/perf/builtin-record.c | |
parent | 5e7481a25e90b661d1dbbba18be3fd3dfe12ec6f (diff) | |
parent | e4c1091cb495d9cbec8956d642644a71a1689958 (diff) |
Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf updates from Ingo Molnar:
"Kernel side changes:
- Clean up the x86 instruction decoder (Masami Hiramatsu)
- Add new uprobes optimization for PUSH instructions on x86 (Yonghong
Song)
- Add MSR_IA32_THERM_STATUS to the MSR events (Stephane Eranian)
- Fix misc bugs, update documentation, plus various cleanups (Jiri
Olsa)
There's a large number of tooling side improvements:
- Intel-PT/BTS improvements (Adrian Hunter)
- Numerous 'perf trace' improvements (Arnaldo Carvalho de Melo)
- Introduce an errno code to string facility (Hendrik Brueckner)
- Various build system improvements (Jiri Olsa)
- Add support for CoreSight trace decoding by making the perf tools
use the external openCSD (Mathieu Poirier, Tor Jeremiassen)
- Add ARM Statistical Profiling Extensions (SPE) support (Kim
Phillips)
- libtraceevent updates (Steven Rostedt)
- Intel vendor event JSON updates (Andi Kleen)
- Introduce 'perf report --mmaps' and 'perf report --tasks' to show
info present in 'perf.data' (Jiri Olsa, Arnaldo Carvalho de Melo)
- Add infrastructure to record first and last sample time to the
perf.data file header, so that when processing all samples in a
'perf record' session, such as when doing build-id processing, or
when specifically requesting that that info be recorded, use that
in 'perf report --time', that also got support for percent slices
in addition to absolute ones.
I.e. now it is possible to ask for the samples in the 10%-20% time
slice of a perf.data file (Jin Yao)
- Allow system wide 'perf stat --per-thread', sorting the result (Jin
Yao)
E.g.:
[root@jouet ~]# perf stat --per-thread --metrics IPC
^C
Performance counter stats for 'system wide':
make-22229 23,012,094,032 inst_retired.any # 0.8 IPC
cc1-22419 692,027,497 inst_retired.any # 0.8 IPC
gcc-22418 328,231,855 inst_retired.any # 0.9 IPC
cc1-22509 220,853,647 inst_retired.any # 0.8 IPC
gcc-22486 199,874,810 inst_retired.any # 1.0 IPC
as-22466 177,896,365 inst_retired.any # 0.9 IPC
cc1-22465 150,732,374 inst_retired.any # 0.8 IPC
gcc-22508 112,555,593 inst_retired.any # 0.9 IPC
cc1-22487 108,964,079 inst_retired.any # 0.7 IPC
qemu-system-x86-2697 21,330,550 inst_retired.any # 0.3 IPC
systemd-journal-551 20,642,951 inst_retired.any # 0.4 IPC
docker-containe-17651 9,552,892 inst_retired.any # 0.5 IPC
dockerd-current-9809 7,528,586 inst_retired.any # 0.5 IPC
make-22153 12,504,194,380 inst_retired.any # 0.8 IPC
python2-22429 12,081,290,954 inst_retired.any # 0.8 IPC
<SNIP>
python2-22429 15,026,328,103 cpu_clk_unhalted.thread
cc1-22419 826,660,193 cpu_clk_unhalted.thread
gcc-22418 365,321,295 cpu_clk_unhalted.thread
cc1-22509 279,169,362 cpu_clk_unhalted.thread
gcc-22486 210,156,950 cpu_clk_unhalted.thread
<SNIP>
5.638075538 seconds time elapsed
[root@jouet ~]#
- Improve shell auto-completion of perf events (Jin Yao)
- 'perf probe' improvements (Masami Hiramatsu)
- Improve PMU infrastructure to support amp64's ThunderX2
implementation defined core events (Ganapatrao Kulkarni)
- Various annotation related improvements and fixes (Thomas Richter)
- Clarify usage of 'overwrite' and 'backward' in the evlist/mmap
code, removing the 'overwrite' parameter from several functions as
it was always used it as 'false' (Wang Nan)
- Fix/improve 'perf record' reverse recording support (Wang Nan)
- Improve command line options documentation (Sihyeon Jang)
- Optimize sample parsing for ordering events, where we don't need to
parse all the PERF_SAMPLE_ bits, just the ones leading to the
timestamp needed to reorder events (Jiri Olsa)
- Generalize the annotation code to support other source information
besides objdump/DWARF obtained ones, starting with python scripts,
that will is slated to be merged soon (Jiri Olsa)
- ... and a lot more that I failed to list, see the shortlog and
changelog for details"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (262 commits)
perf trace beauty flock: Move to separate object file
perf evlist: Remove fcntl.h from evlist.h
perf trace beauty futex: Beautify FUTEX_BITSET_MATCH_ANY
perf trace: Do not print from time delta for interrupted syscall lines
perf trace: Add --print-sample
perf bpf: Remove misplaced __maybe_unused attribute
MAINTAINERS: Adding entry for CoreSight trace decoding
perf tools: Add mechanic to synthesise CoreSight trace packets
perf tools: Add full support for CoreSight trace decoding
pert tools: Add queue management functionality
perf tools: Add functionality to communicate with the openCSD decoder
perf tools: Add support for decoding CoreSight trace data
perf tools: Add decoder mechanic to support dumping trace data
perf tools: Add processing of coresight metadata
perf tools: Add initial entry point for decoder CoreSight traces
perf tools: Integrating the CoreSight decoding library
perf vendor events intel: Update IvyTown files to V20
perf vendor events intel: Update IvyBridge files to V20
perf vendor events intel: Update BroadwellDE events to V7
perf vendor events intel: Update SkylakeX events to V1.06
...
Diffstat (limited to 'tools/perf/builtin-record.c')
-rw-r--r-- | tools/perf/builtin-record.c | 63 |
1 files changed, 49 insertions, 14 deletions
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c index 003255910c05..65681a1a292a 100644 --- a/tools/perf/builtin-record.c +++ b/tools/perf/builtin-record.c @@ -51,7 +51,6 @@ #include <signal.h> #include <sys/mman.h> #include <sys/wait.h> -#include <asm/bug.h> #include <linux/time64.h> struct switch_output { @@ -79,6 +78,7 @@ struct record { bool no_buildid_cache_set; bool buildid_all; bool timestamp_filename; + bool timestamp_boundary; struct switch_output switch_output; unsigned long long samples; }; @@ -301,7 +301,7 @@ static int record__mmap_evlist(struct record *rec, struct record_opts *opts = &rec->opts; char msg[512]; - if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false, + if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, opts->auxtrace_mmap_pages, opts->auxtrace_snapshot_mode) < 0) { if (errno == EPERM) { @@ -372,6 +372,8 @@ try_again: ui__error("%s\n", msg); goto out; } + + pos->supported = true; } if (perf_evlist__apply_filters(evlist, &pos)) { @@ -408,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool, { struct record *rec = container_of(tool, struct record, tool); - rec->samples++; + if (rec->evlist->first_sample_time == 0) + rec->evlist->first_sample_time = sample->time; + + rec->evlist->last_sample_time = sample->time; + if (rec->buildid_all) + return 0; + + rec->samples++; return build_id__mark_dso_hit(tool, event, sample, evsel, machine); } @@ -434,9 +443,11 @@ static int process_buildids(struct record *rec) /* * If --buildid-all is given, it marks all DSO regardless of hits, - * so no need to process samples. + * so no need to process samples. But if timestamp_boundary is enabled, + * it still needs to walk on all samples to get the timestamps of + * first/last samples. */ - if (rec->buildid_all) + if (rec->buildid_all && !rec->timestamp_boundary) rec->tool.sample = NULL; return perf_session__process_events(session); @@ -477,7 +488,7 @@ static struct perf_event_header finished_round_event = { }; static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist, - bool backward) + bool overwrite) { u64 bytes_written = rec->bytes_written; int i; @@ -487,18 +498,18 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli if (!evlist) return 0; - maps = backward ? evlist->backward_mmap : evlist->mmap; + maps = overwrite ? evlist->overwrite_mmap : evlist->mmap; if (!maps) return 0; - if (backward && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING) + if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING) return 0; for (i = 0; i < evlist->nr_mmaps; i++) { struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap; if (maps[i].base) { - if (perf_mmap__push(&maps[i], evlist->overwrite, backward, rec, record__pushfn) != 0) { + if (perf_mmap__push(&maps[i], overwrite, rec, record__pushfn) != 0) { rc = -1; goto out; } @@ -518,7 +529,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli if (bytes_written != rec->bytes_written) rc = record__write(rec, &finished_round_event, sizeof(finished_round_event)); - if (backward) + if (overwrite) perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY); out: return rc; @@ -690,8 +701,8 @@ perf_evlist__pick_pc(struct perf_evlist *evlist) if (evlist) { if (evlist->mmap && evlist->mmap[0].base) return evlist->mmap[0].base; - if (evlist->backward_mmap && evlist->backward_mmap[0].base) - return evlist->backward_mmap[0].base; + if (evlist->overwrite_mmap && evlist->overwrite_mmap[0].base) + return evlist->overwrite_mmap[0].base; } return NULL; } @@ -784,6 +795,28 @@ static int record__synthesize(struct record *rec, bool tail) perf_event__synthesize_guest_os, tool); } + err = perf_event__synthesize_extra_attr(&rec->tool, + rec->evlist, + process_synthesized_event, + data->is_pipe); + if (err) + goto out; + + err = perf_event__synthesize_thread_map2(&rec->tool, rec->evlist->threads, + process_synthesized_event, + NULL); + if (err < 0) { + pr_err("Couldn't synthesize thread map.\n"); + return err; + } + + err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->cpus, + process_synthesized_event, NULL); + if (err < 0) { + pr_err("Couldn't synthesize cpu map.\n"); + return err; + } + err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads, process_synthesized_event, opts->sample_address, opts->proc_map_timeout, 1); @@ -1598,6 +1631,8 @@ static struct option __record_options[] = { "Record build-id of all DSOs regardless of hits"), OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename, "append timestamp to output filename"), + OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary, + "Record timestamp boundary (time of first/last samples)"), OPT_STRING_OPTARG_SET(0, "switch-output", &record.switch_output.str, &record.switch_output.set, "signal,size,time", "Switch output when receive SIGUSR2 or cross size,time threshold", @@ -1781,8 +1816,8 @@ int cmd_record(int argc, const char **argv) goto out; } - /* Enable ignoring missing threads when -u option is defined. */ - rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX; + /* Enable ignoring missing threads when -u/-p option is defined. */ + rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid; err = -ENOMEM; if (perf_evlist__create_maps(rec->evlist, &rec->opts.target) < 0) |