Age | Commit message (Collapse) | Author | Files | Lines |
|
There are many instructions, esp on PowerPC, whose mnemonics are longer
than 6 characters. Using precision limit causes truncation of such
mnemonics.
Fix this by removing precision limit. Note that, 'width' is still 6, so
alignment won't get affected for length <= 6.
Before:
li r11,-1
xscvdp vs1,vs1
add. r10,r10,r11
After:
li r11,-1
xscvdpsxds vs1,vs1
add. r10,r10,r11
Reported-by: Donald Stence <[email protected]>
Signed-off-by: Ravi Bangoria <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Taeung Song <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The commit 8e99b6d4533c changed prefixcmp() to strstart() but missed to
change the return value in some place. It makes perf help print
annoying output even for sane config items like below:
$ perf help
'.root': unsupported man viewer sub key.
...
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Taeung Song <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Sihyeon Jang <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/20171114001542.GA16464@sejong
Fixes: 8e99b6d4533c ("tools include: Adopt strstarts() from the kernel")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
A recent fix for 'perf trace' introduced a bug where
machine__exit(trace->host) could be called while trace->host was still
NULL, so make this more robust by guarding against NULL, just like
free() does.
The problem happens, for instance, when !root users try to run 'perf
trace':
[acme@jouet linux]$ trace
Error: No permissions to read /sys/kernel/debug/tracing/events/raw_syscalls/sys_(enter|exit)
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing'
perf: Segmentation fault
Obtained 7 stack frames.
[0x4f1b2e]
/lib64/libc.so.6(+0x3671f) [0x7f43a1dd971f]
[0x4f3fec]
[0x47468b]
[0x42a2db]
/lib64/libc.so.6(__libc_start_main+0xe9) [0x7f43a1dc3509]
[0x42a6c9]
Segmentation fault (core dumped)
[acme@jouet linux]$
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vasily Averin <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: 33974a414ce2 ("perf trace: Call machine__exit() at exit")
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Changing ringbuffer to !overwrite in this task is harmless because
this test uses a very low frequency (1) and using a very simple program
(true). There should have only 3 events in the whole test. Overwriting
is impossible to happen.
Signed-off-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In this test, a large ring buffer is required so all events can feed
into, so overwrite or not is meaningless.
Change to !overwrite so following commits can remove this argument.
Signed-off-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Unsetting overwrite when calling perf_evlist__mmap is harmless. This
commit passes false to it, makes following commits eliminate the
overwrite argument easier.
Signed-off-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Setting overwrite in perf_evlist__mmap() is meaningless because the
event in this evlist is already have 'overwrite' postfix and goes to
backward ring buffer automatically. Pass 'false' to perf_evlist__mmap()
to make it similar to others.
Signed-off-by: Wang Nan <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Sihyeon Jang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signed-off-by: Sihyeon Jang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When using leader sampling the values of the not sampled but counted
events are shown by perf script in "period".
Currently printing period is only allowed when the main event has a
period, that is it is in frequency mode.
This implies that we cannot dump the values of counted events when the
leader event is not in frequency mode.
Just remove the check that the period must be set on all events. It will
just be printed as 0 instead if it's not available.
This fixes the following:
$ perf record -c 100000 -e '{cycles,branches}:S'
$ perf script -F event,period
Further commentary by Jiri Olsa:
The period will be the value of configured period, not 0:
int perf_evsel__parse_sample(struct ...
...
data->period = evsel->attr.sample_period;
$ perf record -c 100000
$ perf script -F event,period | head -3
Failed to open /tmp/perf-2048.map, continuing without symbols
100000 cycles:ppp:
100000 cycles:ppp:
other than that I think we can remove that check, because we will have
always sane number in period
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Document STAT and CACHE header entries.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Clarify the perf buildid-cache help text for the purge operation. The
purge subcommand takes a list of files (binaries) as option parameter.
Make the wording the same as for the add and remove operation.
Signed-off-by: Thomas-Mich Richter <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Martin Schwidefsky <[email protected]>
LPU-Reference: [email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The POWER9 hardware has dropped support for several events, added
a few new events and changed the category for a couple of events.
Update the POWER9 events in Linux to reflect these changes.
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When processing PERF_RECORD_AUXTRACE_INFO several perf_evsel entries
will be synthesized and inserted into session->evlist, eventually ending
in perf_script.tool.sample(), which ends up calling builtin-script.c's
process_event(), that expects evsel->priv to be a perf_evsel_script
object with a valid FILE pointer in fp.
So we need to intercept the processing of PERF_RECORD_AUXTRACE_INFO and
then setup evsel->priv for these newly created perf_evsel instances, do
it to fix the segfault in process_event() trying to use a NULL for that
FILE pointer.
Reported-by: Alexander Shishkin <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Wang Nan <[email protected]>
Cc: yuzhoujian <[email protected]>
Fixes: a14390fde64e ("perf script: Allow creating per-event dump files")
Link: http://lkml.kernel.org/n/[email protected]
[ Merge fix by Ravi Bangoria before pushing upstream to preserv bisectability ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Include newly added fields 'mmap2', 'comm_exec', 'use_clockid', 'namespaces',
'write_backward' and 'context_switch' from perf_event_attr to store_event().
Signed-off-by: Seonghyun Park <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Seonghyun Park <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
[ Fix log message to add 'write_backward', fix the patch to add 'use_clock_id' ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I forgot one conversion, which got noticed by Thomas when running:
$ perf stat -e '{cpu-clock,instructions}' kill
kill: not enough arguments
Segmentation fault (core dumped)
$
Fix it, those stats are in evsel->stats, not anymore in evsel->priv.
Reported-by: Thomas-Mich Richter <[email protected]>
Tested-by: Thomas-Mich Richter <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Fixes: e669e833da8d ("perf evsel: Restore evsel->priv as a tool private area")
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using the shell function for perl CFLAGS retrieval instead of back
quotes (``). Both execute shell with the command, but the latter is more
explicit and seems to be the preferred way.
Also we don't have any other use of the back quotes in perf Makefiles.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently if trace_event__register_resolver() fails, we return -errno,
but we can't be sure that errno isn't zero in this case.
Signed-off-by: Andrei Vagin <[email protected]>
Reviewed-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Vasily Averin <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use a typed enum for the perf_evsel_config_term type enum. This allows
gcc to do much stronger type checks, and also check for missing case
statements.
I removed the unused _MAX member from the number.
It found one missing case. I'm not sure it's a real problem, so I just
turned it into a BUG_ON for now.
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Renamed the enum name to term_type as per jolsa's request ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The Intel PMU event aliases have a implicit period= specifier to set the
default period.
Unfortunately this breaks overriding these periods with -c or -F,
because the alias terms look like they are user specified to the
internal parser, and user specified event qualifiers override the
command line options.
Track that they are coming from aliases by adding a "weak" state to the
term. Any weak terms don't override command line options.
I only did it for -c/-F for now, I think that's the only case that's
broken currently.
Before:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 2000003
After:
$ perf record -c 1000 -vv -e uops_issued.any
...
{ sample_period, sample_freq } 1000
Signed-off-by: Andi Kleen <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Two more, that were just in perf/core and thus weren't covered by Ingo's
latest headers synch, kcmp.h and prctl.h, silencing this:
Warning: Kernel ABI header at 'tools/include/uapi/linux/kcmp.h' differs from latest version at 'include/uapi/linux/kcmp.h'
Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
Cc: Adrian Hunter <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Wang Nan <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Align source with offset lines, which are more advanced, because of the
address column.
Before:
: static void *worker_thread(void *__tdata)
: {
0.00 : 48a971: push %rbp
0.00 : 48a972: mov %rsp,%rbp
0.00 : 48a975: sub $0x30,%rsp
0.00 : 48a979: mov %rdi,-0x28(%rbp)
0.00 : 48a97d: mov %fs:0x28,%rax
0.00 : 48a986: mov %rax,-0x8(%rbp)
0.00 : 48a98a: xor %eax,%eax
: struct thread_data *td = __tdata;
0.00 : 48a98c: mov -0x28(%rbp),%rax
0.00 : 48a990: mov %rax,-0x10(%rbp)
: int m = 0, i;
0.00 : 48a994: movl $0x0,-0x1c(%rbp)
: int ret;
:
: for (i = 0; i < loops; i++) {
0.00 : 48a99b: movl $0x0,-0x18(%rbp)
After:
: static void *worker_thread(void *__tdata)
: {
0.00 : 48a971: push %rbp
0.00 : 48a972: mov %rsp,%rbp
0.00 : 48a975: sub $0x30,%rsp
0.00 : 48a979: mov %rdi,-0x28(%rbp)
0.00 : 48a97d: mov %fs:0x28,%rax
0.00 : 48a986: mov %rax,-0x8(%rbp)
0.00 : 48a98a: xor %eax,%eax
: struct thread_data *td = __tdata;
0.00 : 48a98c: mov -0x28(%rbp),%rax
0.00 : 48a990: mov %rax,-0x10(%rbp)
: int m = 0, i;
0.00 : 48a994: movl $0x0,-0x1c(%rbp)
: int ret;
:
: for (i = 0; i < loops; i++) {
0.00 : 48a99b: movl $0x0,-0x18(%rbp)
It makes bigger different when displaying script sources, where the
comment lines looks oddly shifted from the lines which actually hold
code. I'll send script support separately.
Committer note:
Do not use a fixed column width for the addresses, as kernel ones se
more than 10 columns, look at the last offset and get the right width.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor disasm_line__write function from annotate_browser__write, which
now keeps only generic display code.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use struct annotation_line in browser::b::top.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use struct annotation_line in find functions:
annotate_browser__find_string
annotate_browser__find_string_reverse
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Using struct annotation_line arg in browser_line
function to make it generic.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use struct annotation_line as a browser::offsets array entry.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use struct annotation_line as a browser::selection.
We want to be able to use the annotate_browser for all sorts of source
data, so it needs to be able to work over the generic struct
annotation_line.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20171106105617.GC20858@krava
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rename disasm_line__browser function to browser_line, because the browser got
generic and is no longer disasm specific.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20171106105552.GB20858@krava
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rename struct browser_disasm_line to browser_line, because the browser
operates now on generic lines and no longer on disasm lines.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/20171106105536.GA20858@krava
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We now keep samples_nr in struct annotation_line, so there's no need to
pass nr_events to disasm_rb_tree__insert function.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We now carry the data in 'struct annotation_line', so using it instead
of samples from 'struct browser_disasm_line' and removing it and its
setup.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move generic annotation line display code into annotation_line__print
function.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Separating struct annotation_line display function, it will hold the
generic line display code.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove struct source_line*, no longer needed.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove disasm__calc_percent() function, because it's no longer needed.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
annotate_browser__calc_percent()
Remove disasm__calc_percent() from annotate_browser__calc_percent(),
because we already have the data calculated in struct annotation_line.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Remove disasm__calc_percent() from disasm_line__print(), because we
already have the data calculated in struct annotation_line.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Replace symbol__get_source_line() with symbol__calc_lines(), which
calculates the source line tree over the struct annotation_line.
This will allow us to remove redundant struct source_line in following
patches.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add symbol__calc_percent function, that calculates annotation data for
symbol and put the data in the struct annotation_line::samples array.
Committer notes:
Made symbol__calc_percent non static to be used in the next two patches,
which will get some fixups from jolsa, doing it this way to keep this
bisectable.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For correct unwinding of user space processes, the floating-point
register contents are required. For example, leaf functions might
use fp registers to temporarily store the return address.
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-and-tested-by: Thomas Richter <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
Instead of defining DWARF register to string table in dwarf-regs-table.h
and dwarf-regs.c, use a common table in dwarf-regs-table.h.
Ensure that the DWARF register table is up-to-date with
http://refspecs.linuxfoundation.org/ELF/zSeries/lzsabi0_s390/x1542.html.
For unwinding with libdw, also ensure to correctly setup the DWARF
register frame according to the register mappings. Currently, libdw
supports up to 32 registers only.
Suggested-by: Thomas Richter <[email protected]>
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-and-tested-by: Thomas Richter <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
With support for perf_regs and libdw, you can record and report
call graphs for user space programs. Simply invoke perf with
the --call-graph=dwarf command line option.
Signed-off-by: Heiko Carstens <[email protected]>
[brueckner: added dwfl_thread_state_register_pc() call]
Signed-off-by: Hendrik Brueckner <[email protected]>
Reviewed-and-tested-by: Thomas Richter <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
Perf tool need implement a callback to enable using AUX buffer. Perf
will do another mmap() to trigger the setup of AUX buffer in kernel
if there is such callback. The default size of the AUX buffer is set
properly according to the sampling frequency to avoid overflow. It
could also be manually set by -m option of perf.
The interface of perf is not changed. Diagnostic mode sampling
could be started by `perf record -e rBD000` like before.
Signed-off-by: Pu Hou <[email protected]>
Reviewed-by: Hendrik Brueckner <[email protected]>
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
5-level paging provides a 56-bit virtual address space for user space
application. But the kernel defaults to mappings below the 47-bit address
space boundary, which is the upper bound for 4-level paging, unless an
application explicitely request it by using a mmap(2) address hint above
the 47-bit boundary. The kernel prevents mappings which spawn across the
47-bit boundary unless mmap(2) was invoked with MAP_FIXED.
Add a self-test that covers the corner cases of the interface and validates
the correctness of the implementation.
[ tglx: Massaged changelog once more ]
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Pull drm updates from Dave Airlie:
"This is the main drm pull request for v4.15.
Core:
- Atomic object lifetime fixes
- Atomic iterator improvements
- Sparse/smatch fixes
- Legacy kms ioctls to be interruptible
- EDID override improvements
- fb/gem helper cleanups
- Simple outreachy patches
- Documentation improvements
- Fix dma-buf rcu races
- DRM mode object leasing for improving VR use cases.
- vgaarb improvements for non-x86 platforms.
New driver:
- tve200: Faraday Technology TVE200 block.
This "TV Encoder" encodes a ITU-T BT.656 stream and can be found in
the StorLink SL3516 (later Cortina Systems CS3516) as well as the
Grain Media GM8180.
New bridges:
- SiI9234 support
New panels:
- S6E63J0X03, OTM8009A, Seiko 43WVF1G, 7" rpi touch panel, Toshiba
LT089AC19000, Innolux AT043TN24
i915:
- Remove Coffeelake from alpha support
- Cannonlake workarounds
- Infoframe refactoring for DisplayPort
- VBT updates
- DisplayPort vswing/emph/buffer translation refactoring
- CCS fixes
- Restore GPU clock boost on missed vblanks
- Scatter list updates for userptr allocations
- Gen9+ transition watermarks
- Display IPC (Isochronous Priority Control)
- Private PAT management
- GVT: improved error handling and pci config sanitizing
- Execlist refactoring
- Transparent Huge Page support
- User defined priorities support
- HuC/GuC firmware refactoring
- DP MST fixes
- eDP power sequencing fixes
- Use RCU instead of stop_machine
- PSR state tracking support
- Eviction fixes
- BDW DP aux channel timeout fixes
- LSPCON fixes
- Cannonlake PLL fixes
amdgpu:
- Per VM BO support
- Powerplay cleanups
- CI powerplay support
- PASID mgr for kfd
- SR-IOV fixes
- initial GPU reset for vega10
- Prime mmap support
- TTM updates
- Clock query interface for Raven
- Fence to handle ioctl
- UVD encode ring support on Polaris
- Transparent huge page DMA support
- Compute LRU pipe tweaks
- BO flag to allow buffers to opt out of implicit sync
- CTX priority setting API
- VRAM lost infrastructure plumbing
qxl:
- fix flicker since atomic rework
amdkfd:
- Further improvements from internal AMD tree
- Usermode events
- Drop radeon support
nouveau:
- Pascal temperature sensor support
- Improved BAR2 handling
- MMU rework to support Pascal MMU
exynos:
- Improved HDMI/mixer support
- HDMI audio interface support
tegra:
- Prep work for tegra186
- Cleanup/fixes
msm:
- Preemption support for a5xx
- Display fixes for 8x96 (snapdragon 820)
- Async cursor plane fixes
- FW loading rework
- GPU debugging improvements
vc4:
- Prep for DSI panels
- fix T-format tiling scanout
- New madvise ioctl
Rockchip:
- LVDS support
omapdrm:
- omap4 HDMI CEC support
etnaviv:
- GPU performance counters groundwork
sun4i:
- refactor driver load + TCON backend
- HDMI improvements
- A31 support
- Misc fixes
udl:
- Probe/EDID read fixes.
tilcdc:
- Misc fixes.
pl111:
- Support more variants
adv7511:
- Improve EDID handling.
- HDMI CEC support
sii8620:
- Add remote control support"
* tag 'drm-for-v4.15' of git://people.freedesktop.org/~airlied/linux: (1480 commits)
drm/rockchip: analogix_dp: Use mutex rather than spinlock
drm/mode_object: fix documentation for object lookups.
drm/i915: Reorder context-close to avoid calling i915_vma_close() under RCU
drm/i915: Move init_clock_gating() back to where it was
drm/i915: Prune the reservation shared fence array
drm/i915: Idle the GPU before shinking everything
drm/i915: Lock llist_del_first() vs llist_del_all()
drm/i915: Calculate ironlake intermediate watermarks correctly, v2.
drm/i915: Disable lazy PPGTT page table optimization for vGPU
drm/i915/execlists: Remove the priority "optimisation"
drm/i915: Filter out spurious execlists context-switch interrupts
drm/amdgpu: use irq-safe lock for kiq->ring_lock
drm/amdgpu: bypass lru touch for KIQ ring submission
drm/amdgpu: Potential uninitialized variable in amdgpu_vm_update_directories()
drm/amdgpu: potential uninitialized variable in amdgpu_vce_ring_parse_cs()
drm/amd/powerplay: initialize a variable before using it
drm/amd/powerplay: suppress KASAN out of bounds warning in vega10_populate_all_memory_levels
drm/amd/amdgpu: fix evicted VRAM bo adjudgement condition
drm/vblank: Tune drm_crtc_accurate_vblank_count() WARN down to a debug
drm/rockchip: add CONFIG_OF dependency for lvds
...
|
|
Merge updates from Andrew Morton:
- a few misc bits
- ocfs2 updates
- almost all of MM
* emailed patches from Andrew Morton <[email protected]>: (131 commits)
memory hotplug: fix comments when adding section
mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP
mm: simplify nodemask printing
mm,oom_reaper: remove pointless kthread_run() error check
mm/page_ext.c: check if page_ext is not prepared
writeback: remove unused function parameter
mm: do not rely on preempt_count in print_vma_addr
mm, sparse: do not swamp log with huge vmemmap allocation failures
mm/hmm: remove redundant variable align_end
mm/list_lru.c: mark expected switch fall-through
mm/shmem.c: mark expected switch fall-through
mm/page_alloc.c: broken deferred calculation
mm: don't warn about allocations which stall for too long
fs: fuse: account fuse_inode slab memory as reclaimable
mm, page_alloc: fix potential false positive in __zone_watermark_ok
mm: mlock: remove lru_add_drain_all()
mm, sysctl: make NUMA stats configurable
shmem: convert shmem_init_inodecache() to void
Unify migrate_pages and move_pages access checks
mm, pagevec: rename pagevec drained field
...
|
|
As the page free path makes no distinction between cache hot and cold
pages, there is no real useful ordering of pages in the free list that
allocation requests can take advantage of. Juding from the users of
__GFP_COLD, it is likely that a number of them are the result of copying
other sites instead of actually measuring the impact. Remove the
__GFP_COLD parameter which simplifies a number of paths in the page
allocator.
This is potentially controversial but bear in mind that the size of the
per-cpu pagelists versus modern cache sizes means that the whole per-cpu
list can often fit in the L3 cache. Hence, there is only a potential
benefit for microbenchmarks that alloc/free pages in a tight loop. It's
even worse when THP is taken into account which has little or no chance
of getting a cache-hot page as the per-cpu list is bypassed and the
zeroing of multiple pages will thrash the cache anyway.
The truncate microbenchmarks are not shown as this patch affects the
allocation path and not the free path. A page fault microbenchmark was
tested but it showed no sigificant difference which is not surprising
given that the __GFP_COLD branches are a miniscule percentage of the
fault path.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mel Gorman <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Jan Kara <[email protected]>
Cc: Johannes Weiner <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
During truncation, the mapping has already been checked for shmem and
dax so it's known that workingset_update_node is required.
This patch avoids the checks on mapping for each page being truncated.
In all other cases, a lookup helper is used to determine if
workingset_update_node() needs to be called. The one danger is that the
API is slightly harder to use as calling workingset_update_node directly
without checking for dax or shmem mappings could lead to surprises.
However, the API rarely needs to be used and hopefully the comment is
enough to give people the hint.
sparsetruncate (tiny)
4.14.0-rc4 4.14.0-rc4
oneirq-v1r1 pickhelper-v1r1
Min Time 141.00 ( 0.00%) 140.00 ( 0.71%)
1st-qrtle Time 142.00 ( 0.00%) 141.00 ( 0.70%)
2nd-qrtle Time 142.00 ( 0.00%) 142.00 ( 0.00%)
3rd-qrtle Time 143.00 ( 0.00%) 143.00 ( 0.00%)
Max-90% Time 144.00 ( 0.00%) 144.00 ( 0.00%)
Max-95% Time 147.00 ( 0.00%) 145.00 ( 1.36%)
Max-99% Time 195.00 ( 0.00%) 191.00 ( 2.05%)
Max Time 230.00 ( 0.00%) 205.00 ( 10.87%)
Amean Time 144.37 ( 0.00%) 143.82 ( 0.38%)
Stddev Time 10.44 ( 0.00%) 9.00 ( 13.74%)
Coeff Time 7.23 ( 0.00%) 6.26 ( 13.41%)
Best99%Amean Time 143.72 ( 0.00%) 143.34 ( 0.26%)
Best95%Amean Time 142.37 ( 0.00%) 142.00 ( 0.26%)
Best90%Amean Time 142.19 ( 0.00%) 141.85 ( 0.24%)
Best75%Amean Time 141.92 ( 0.00%) 141.58 ( 0.24%)
Best50%Amean Time 141.69 ( 0.00%) 141.31 ( 0.27%)
Best25%Amean Time 141.38 ( 0.00%) 140.97 ( 0.29%)
As you'd expect, the gain is marginal but it can be detected. The
differences in bonnie are all within the noise which is not surprising
given the impact on the microbenchmark.
radix_tree_update_node_t is a callback for some radix operations that
optionally passes in a private field. The only user of the callback is
workingset_update_node and as it no longer requires a mapping, the
private field is removed.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Mel Gorman <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix up makefiles, remove references, and git rm kmemcheck.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Vegard Nossum <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Eric W. Biederman <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Tim Hansen <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|