aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2017-11-16perf tests: Set evlist of test__task_exit() to !overwriteWang Nan1-1/+1
Changing ringbuffer to !overwrite in this task is harmless because this test uses a very low frequency (1) and using a very simple program (true). There should have only 3 events in the whole test. Overwriting is impossible to happen. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tests: Set evlist of test__basic_mmap() to !overwriteWang Nan1-1/+1
In this test, a large ring buffer is required so all events can feed into, so overwrite or not is meaningless. Change to !overwrite so following commits can remove this argument. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tests: Set evlist of test__sw_clock_freq() to !overwriteWang Nan1-1/+1
Unsetting overwrite when calling perf_evlist__mmap is harmless. This commit passes false to it, makes following commits eliminate the overwrite argument easier. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tests: Set evlist of test__backward_ring_buffer() to !overwriteWang Nan1-1/+1
Setting overwrite in perf_evlist__mmap() is meaningless because the event in this evlist is already have 'overwrite' postfix and goes to backward ring buffer automatically. Pass 'false' to perf_evlist__mmap() to make it similar to others. Signed-off-by: Wang Nan <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf top: Remove a duplicate wordSihyeon Jang1-1/+1
Signed-off-by: Sihyeon Jang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf top: Document missing optionsSihyeon Jang1-0/+6
Signed-off-by: Sihyeon Jang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf script: Allow printing period for non freq mode groupsAndi Kleen1-5/+0
When using leader sampling the values of the not sampled but counted events are shown by perf script in "period". Currently printing period is only allowed when the main event has a period, that is it is in frequency mode. This implies that we cannot dump the values of counted events when the leader event is not in frequency mode. Just remove the check that the period must be set on all events. It will just be printed as 0 instead if it's not available. This fixes the following: $ perf record -c 100000 -e '{cycles,branches}:S' $ perf script -F event,period Further commentary by Jiri Olsa: The period will be the value of configured period, not 0: int perf_evsel__parse_sample(struct ... ... data->period = evsel->attr.sample_period; $ perf record -c 100000 $ perf script -F event,period | head -3 Failed to open /tmp/perf-2048.map, continuing without symbols 100000 cycles:ppp: 100000 cycles:ppp: other than that I think we can remove that check, because we will have always sane number in period Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tools: Document some missing perf.data headersAndi Kleen1-0/+23
Document STAT and CACHE header entries. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf buildid-cache: Update help text for purge commandThomas-Mich Richter1-2/+2
Clarify the perf buildid-cache help text for the purge operation. The purge subcommand takes a list of files (binaries) as option parameter. Make the wording the same as for the add and remove operation. Signed-off-by: Thomas-Mich Richter <[email protected]> Reviewed-by: Hendrik Brueckner <[email protected]> Acked-by: Masami Hiramatsu <[email protected]> Cc: Martin Schwidefsky <[email protected]> LPU-Reference: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf vendor events powerpc: Update POWER9 eventsSukadev Bhattiprolu7-248/+88
The POWER9 hardware has dropped support for several events, added a few new events and changed the category for a couple of events. Update the POWER9 events in Linux to reflect these changes. Signed-off-by: Sukadev Bhattiprolu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf script: Fix --per-event-dump for auxtrace synth evselsArnaldo Carvalho de Melo1-1/+30
When processing PERF_RECORD_AUXTRACE_INFO several perf_evsel entries will be synthesized and inserted into session->evlist, eventually ending in perf_script.tool.sample(), which ends up calling builtin-script.c's process_event(), that expects evsel->priv to be a perf_evsel_script object with a valid FILE pointer in fp. So we need to intercept the processing of PERF_RECORD_AUXTRACE_INFO and then setup evsel->priv for these newly created perf_evsel instances, do it to fix the segfault in process_event() trying to use a NULL for that FILE pointer. Reported-by: Alexander Shishkin <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: David Ahern <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Wang Nan <[email protected]> Cc: yuzhoujian <[email protected]> Fixes: a14390fde64e ("perf script: Allow creating per-event dump files") Link: http://lkml.kernel.org/n/[email protected] [ Merge fix by Ravi Bangoria before pushing upstream to preserv bisectability ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tests: Add missing WRITE_ASS for new fields of perf_event_attrSeonghyun Park1-0/+6
Include newly added fields 'mmap2', 'comm_exec', 'use_clockid', 'namespaces', 'write_backward' and 'context_switch' from perf_event_attr to store_event(). Signed-off-by: Seonghyun Park <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Seonghyun Park <[email protected]> Link: http://lkml.kernel.org/n/[email protected] [ Fix log message to add 'write_backward', fix the patch to add 'use_clock_id' ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf evsel: Fix up leftover perf_evsel_stat usage via evsel->privArnaldo Carvalho de Melo1-1/+1
I forgot one conversion, which got noticed by Thomas when running: $ perf stat -e '{cpu-clock,instructions}' kill kill: not enough arguments Segmentation fault (core dumped) $ Fix it, those stats are in evsel->stats, not anymore in evsel->priv. Reported-by: Thomas-Mich Richter <[email protected]> Tested-by: Thomas-Mich Richter <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Wang Nan <[email protected]> Fixes: e669e833da8d ("perf evsel: Restore evsel->priv as a tool private area") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf tools: Use shell function for perl cflags retrievalJiri Olsa1-1/+1
Using the shell function for perl CFLAGS retrieval instead of back quotes (``). Both execute shell with the command, but the latter is more explicit and seems to be the preferred way. Also we don't have any other use of the back quotes in perf Makefiles. Signed-off-by: Jiri Olsa <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf trace: Fix an exit code of trace__symbols_initAndrei Vagin1-2/+4
Currently if trace_event__register_resolver() fails, we return -errno, but we can't be sure that errno isn't zero in this case. Signed-off-by: Andrei Vagin <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Vasily Averin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf evsel: Enable type checking for perf_evsel_config_term typesAndi Kleen2-3/+4
Use a typed enum for the perf_evsel_config_term type enum. This allows gcc to do much stronger type checks, and also check for missing case statements. I removed the unused _MAX member from the number. It found one missing case. I'm not sure it's a real problem, so I just turned it into a BUG_ON for now. Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] [ Renamed the enum name to term_type as per jolsa's request ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf record: Fix -c/-F options for cpu event aliasesAndi Kleen5-4/+19
The Intel PMU event aliases have a implicit period= specifier to set the default period. Unfortunately this breaks overriding these periods with -c or -F, because the alias terms look like they are user specified to the internal parser, and user specified event qualifiers override the command line options. Track that they are coming from aliases by adding a "weak" state to the term. Any weak terms don't override command line options. I only did it for -c/-F for now, I think that's the only case that's broken currently. Before: $ perf record -c 1000 -vv -e uops_issued.any ... { sample_period, sample_freq } 2000003 After: $ perf record -c 1000 -vv -e uops_issued.any ... { sample_period, sample_freq } 1000 Signed-off-by: Andi Kleen <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Align source and offset linesJiri Olsa1-10/+24
Align source with offset lines, which are more advanced, because of the address column. Before: : static void *worker_thread(void *__tdata) : { 0.00 : 48a971: push %rbp 0.00 : 48a972: mov %rsp,%rbp 0.00 : 48a975: sub $0x30,%rsp 0.00 : 48a979: mov %rdi,-0x28(%rbp) 0.00 : 48a97d: mov %fs:0x28,%rax 0.00 : 48a986: mov %rax,-0x8(%rbp) 0.00 : 48a98a: xor %eax,%eax : struct thread_data *td = __tdata; 0.00 : 48a98c: mov -0x28(%rbp),%rax 0.00 : 48a990: mov %rax,-0x10(%rbp) : int m = 0, i; 0.00 : 48a994: movl $0x0,-0x1c(%rbp) : int ret; : : for (i = 0; i < loops; i++) { 0.00 : 48a99b: movl $0x0,-0x18(%rbp) After: : static void *worker_thread(void *__tdata) : { 0.00 : 48a971: push %rbp 0.00 : 48a972: mov %rsp,%rbp 0.00 : 48a975: sub $0x30,%rsp 0.00 : 48a979: mov %rdi,-0x28(%rbp) 0.00 : 48a97d: mov %fs:0x28,%rax 0.00 : 48a986: mov %rax,-0x8(%rbp) 0.00 : 48a98a: xor %eax,%eax : struct thread_data *td = __tdata; 0.00 : 48a98c: mov -0x28(%rbp),%rax 0.00 : 48a990: mov %rax,-0x10(%rbp) : int m = 0, i; 0.00 : 48a994: movl $0x0,-0x1c(%rbp) : int ret; : : for (i = 0; i < loops; i++) { 0.00 : 48a99b: movl $0x0,-0x18(%rbp) It makes bigger different when displaying script sources, where the comment lines looks oddly shifted from the lines which actually hold code. I'll send script support separately. Committer note: Do not use a fixed column width for the addresses, as kernel ones se more than 10 columns, look at the last offset and get the right width. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Add disasm_line__write functionJiri Olsa1-45/+53
Factor disasm_line__write function from annotate_browser__write, which now keeps only generic display code. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Use struct annotation_line in browser topJiri Olsa1-13/+13
Use struct annotation_line in browser::b::top. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Use struct annotation_line in find functionsJiri Olsa1-20/+20
Use struct annotation_line in find functions: annotate_browser__find_string annotate_browser__find_string_reverse Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Use struct annotation_line in browser_lineJiri Olsa1-32/+28
Using struct annotation_line arg in browser_line function to make it generic. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Change offsets to struct annotation_lineJiri Olsa1-20/+29
Use struct annotation_line as a browser::offsets array entry. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Change selection to struct annotation_lineJiri Olsa1-29/+34
Use struct annotation_line as a browser::selection. We want to be able to use the annotate_browser for all sorts of source data, so it needs to be able to work over the generic struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20171106105617.GC20858@krava Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Rename disasm_line__browser to browser_lineJiri Olsa1-8/+8
Rename disasm_line__browser function to browser_line, because the browser got generic and is no longer disasm specific. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20171106105552.GB20858@krava Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Rename struct browser_disasm_line to browser_lineJiri Olsa1-13/+13
Rename struct browser_disasm_line to browser_line, because the browser operates now on generic lines and no longer on disasm lines. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/20171106105536.GA20858@krava Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Do not pass nr_events in disasm_rb_tree__insertJiri Olsa1-9/+6
We now keep samples_nr in struct annotation_line, so there's no need to pass nr_events to disasm_rb_tree__insert function. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate browser: Use samples data from struct annotation_lineJiri Olsa1-37/+20
We now carry the data in 'struct annotation_line', so using it instead of samples from 'struct browser_disasm_line' and removing it and its setup. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Factor annotation_line__print from disasm_line__printJiri Olsa1-36/+33
Move generic annotation line display code into annotation_line__print function. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Add annotation_line__print functionJiri Olsa1-6/+22
Separating struct annotation_line display function, it will hold the generic line display code. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Remove struct source_lineJiri Olsa1-14/+0
Remove struct source_line*, no longer needed. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Remove disasm__calc_percent functionJiri Olsa2-46/+0
Remove disasm__calc_percent() function, because it's no longer needed. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Remove disasm__calc_percent() from ↵Jiri Olsa1-13/+6
annotate_browser__calc_percent() Remove disasm__calc_percent() from annotate_browser__calc_percent(), because we already have the data calculated in struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Remove disasm__calc_percent() from disasm_line__print()Jiri Olsa2-44/+18
Remove disasm__calc_percent() from disasm_line__print(), because we already have the data calculated in struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Add symbol__calc_lines functionJiri Olsa2-120/+68
Replace symbol__get_source_line() with symbol__calc_lines(), which calculates the source line tree over the struct annotation_line. This will allow us to remove redundant struct source_line in following patches. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16perf annotate: Add symbol__calc_percent functionJiri Olsa2-1/+62
Add symbol__calc_percent function, that calculates annotation data for symbol and put the data in the struct annotation_line::samples array. Committer notes: Made symbol__calc_percent non static to be used in the next two patches, which will get some fixups from jolsa, doing it this way to keep this bisectable. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-16s390/perf: add perf register support for floating-point registersHendrik Brueckner2-1/+51
For correct unwinding of user space processes, the floating-point register contents are required. For example, leaf functions might use fp registers to temporarily store the return address. Signed-off-by: Hendrik Brueckner <[email protected]> Reviewed-and-tested-by: Thomas Richter <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-11-16s390/perf: define common DWARF register string tableHendrik Brueckner3-21/+80
Instead of defining DWARF register to string table in dwarf-regs-table.h and dwarf-regs.c, use a common table in dwarf-regs-table.h. Ensure that the DWARF register table is up-to-date with http://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_s390/x1542.html. For unwinding with libdw, also ensure to correctly setup the DWARF register frame according to the register mappings. Currently, libdw supports up to 32 registers only. Suggested-by: Thomas Richter <[email protected]> Signed-off-by: Hendrik Brueckner <[email protected]> Reviewed-and-tested-by: Thomas Richter <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-11-16s390/perf: add support for perf_regs and libdwHeiko Carstens5-1/+113
With support for perf_regs and libdw, you can record and report call graphs for user space programs. Simply invoke perf with the --call-graph=dwarf command line option. Signed-off-by: Heiko Carstens <[email protected]> [brueckner: added dwfl_thread_state_register_pc() call] Signed-off-by: Hendrik Brueckner <[email protected]> Reviewed-and-tested-by: Thomas Richter <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-11-16s390/perf: add callback to perf to enable using AUX bufferPu Hou2-0/+120
Perf tool need implement a callback to enable using AUX buffer. Perf will do another mmap() to trigger the setup of AUX buffer in kernel if there is such callback. The default size of the AUX buffer is set properly according to the sampling frequency to avoid overflow. It could also be manually set by -m option of perf. The interface of perf is not changed. Diagnostic mode sampling could be started by `perf record -e rBD000` like before. Signed-off-by: Pu Hou <[email protected]> Reviewed-by: Hendrik Brueckner <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2017-11-15mm: remove __GFP_COLDMel Gorman1-1/+0
As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Juding from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no sigificant difference which is not surprising given that the __GFP_COLD branches are a miniscule percentage of the fault path. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Mel Gorman <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Jan Kara <[email protected]> Cc: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-15kmemcheck: remove whats left of NOTRACK flagsLevin, Alexander (Sasha Levin)1-1/+0
Now that kmemcheck is gone, we don't need the NOTRACK flags. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Eric W. Biederman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Tim Hansen <[email protected]> Cc: Vegard Nossum <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-11-13Merge branch 'perf-core-for-linus' of ↵Linus Torvalds159-1780/+7637
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: "The main changes in this cycle were: Kernel: - kprobes updates: use better W^X patterns for code modifications, improve optprobes, remove jprobes. (Masami Hiramatsu, Kees Cook) - core fixes: event timekeeping (enabled/running times statistics) fixes, perf_event_read() locking fixes and cleanups, etc. (Peter Zijlstra) - Extend x86 Intel free-running PEBS support and support x86 user-register sampling in perf record and perf script. (Andi Kleen) Tooling: - Completely rework the way inline frames are handled. Instead of querying for the inline nodes on-demand in the individual tools, we now create proper callchain nodes for inlined frames. (Milian Wolff) - 'perf trace' updates (Arnaldo Carvalho de Melo) - Implement a way to print formatted output to per-event files in 'perf script' to facilitate generate flamegraphs, elliminating the need to write scripts to do that separation (yuzhoujian, Arnaldo Carvalho de Melo) - Update vendor events JSON metrics for Intel's Broadwell, Broadwell Server, Haswell, Haswell Server, IvyBridge, IvyTown, JakeTown, Sandy Bridge, Skylake, SkyLake Server - and Goldmont Plus V1 (Andi Kleen, Kan Liang) - Multithread the synthesizing of PERF_RECORD_ events for pre-existing threads in 'perf top', speeding up that phase, greatly improving the user experience in systems such as Intel's Knights Mill (Kan Liang) - Introduce the concept of weak groups in 'perf stat': try to set up a group, but if it's not schedulable fallback to not using a group. That gives us the best of both worlds: groups if they work, but still a usable fallback if they don't. E.g: (Andi Kleen) - perf sched timehist enhancements (David Ahern) - ... various other enhancements, updates, cleanups and fixes" * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (139 commits) kprobes: Don't spam the build log with deprecation warnings arm/kprobes: Remove jprobe test case arm/kprobes: Fix kretprobe test to check correct counter perf srcline: Show correct function name for srcline of callchains perf srcline: Fix memory leak in addr2inlines() perf trace beauty kcmp: Beautify arguments perf trace beauty: Implement pid_fd beautifier tools include uapi: Grab a copy of linux/kcmp.h perf callchain: Fix double mapping al->addr for children without self period perf stat: Make --per-thread update shadow stats to show metrics perf stat: Move the shadow stats scale computation in perf_stat__update_shadow_stats perf tools: Add perf_data_file__write function perf tools: Add struct perf_data_file perf tools: Rename struct perf_data_file to perf_data perf script: Print information about per-event-dump files perf trace beauty prctl: Generate 'option' string table from kernel headers tools include uapi: Grab a copy of linux/prctl.h perf script: Allow creating per-event dump files perf evsel: Restore evsel->priv as a tool private area perf script: Use event_format__fprintf() ...
2017-11-13Merge branch 'locking-core-for-linus' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core locking updates from Ingo Molnar: "The main changes in this cycle are: - Another attempt at enabling cross-release lockdep dependency tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time with better performance and fewer false positives. (Byungchul Park) - Introduce lockdep_assert_irqs_enabled()/disabled() and convert open-coded equivalents to lockdep variants. (Frederic Weisbecker) - Add down_read_killable() and use it in the VFS's iterate_dir() method. (Kirill Tkhai) - Convert remaining uses of ACCESS_ONCE() to READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle driven. (Mark Rutland, Paul E. McKenney) - Get rid of lockless_dereference(), by strengthening Alpha atomics, strengthening READ_ONCE() with smp_read_barrier_depends() and thus being able to convert users of lockless_dereference() to READ_ONCE(). (Will Deacon) - Various micro-optimizations: - better PV qspinlocks (Waiman Long), - better x86 barriers (Michael S. Tsirkin) - better x86 refcounts (Kees Cook) - ... plus other fixes and enhancements. (Borislav Petkov, Juergen Gross, Miguel Bernal Marin)" * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits) locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE rcu: Use lockdep to assert IRQs are disabled/enabled netpoll: Use lockdep to assert IRQs are disabled/enabled timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled irq_work: Use lockdep to assert IRQs are disabled/enabled irq/timings: Use lockdep to assert IRQs are disabled/enabled perf/core: Use lockdep to assert IRQs are disabled/enabled x86: Use lockdep to assert IRQs are disabled/enabled smp/core: Use lockdep to assert IRQs are disabled/enabled timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled timers/nohz: Use lockdep to assert IRQs are disabled/enabled workqueue: Use lockdep to assert IRQs are disabled/enabled irq/softirqs: Use lockdep to assert IRQs are disabled/enabled locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled() locking/pvqspinlock: Implement hybrid PV queued/unfair locks locking/rwlocks: Fix comments x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion() workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes ...
2017-11-13perf annotate: Add samples into struct annotation_lineJiri Olsa2-5/+20
Add samples array into struct annotation_line to hold the annotation data. The data is populated in the following patches. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-13perf annotate: Add annotated_source__purge functionJiri Olsa3-12/+10
Mov disasm__purge() to annotated_source__purge() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-13perf annotate: Add annotation_line__(new|delete) functionsJiri Olsa3-9/+68
Changing the way the annotation lines are allocated and adding annotation_line__(new|delete) functions to deal with this. Before the allocation schema was as follows: ----------------------------------------------------------- struct disasm_line | struct annotation_line | private space ----------------------------------------------------------- Where the private space is used in TUI code to store computed annotation data for events. The stdio code computes the data on the fly. The goal is to compute and store annotation line's data directly in the struct annotation_line itself, so this patch changes the line allocation schema as follows: ------------------------------------------------------------ privsize space | struct disasm_line | struct annotation_line ------------------------------------------------------------ Moving struct annotation_line to the end, because in following changes we will move here the non-fixed length event's data. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-13perf annotate: Move rb_node to struct annotation_lineJiri Olsa2-14/+17
Move rb_node to struct annotation_line to make struct annotation_line the rb tree node for sorted lines used in both stdio and TUI code. This way we can unite the sorted lines lines codes for both TUI and stdio in the following patches. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-13perf annotate: Add annotation_line__add functionJiri Olsa1-3/+3
Rename disasm__add() into annotation_line__add() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-13perf annotate: Add annotation_line__next functionJiri Olsa3-10/+13
Rename disasm__get_next_ip_line() to annotation_line__next() to make it work over a generic struct annotation_line. Signed-off-by: Jiri Olsa <[email protected]> Cc: Andi Kleen <[email protected]> Cc: David Ahern <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>