Age | Commit message (Collapse) | Author | Files | Lines |
|
Introduce callgraph_set to indicate whether the callgraph option was set
by user.
Signed-off-by: Kan Liang <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently the command line option settings beats the per event period
settings:
With no global settings, we get per-event configuration:
$ perf record -e 'cpu/instructions,period=20000/' sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 20000 ...
With 'c' option period setup, we get 'c' option value:
$ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 1000 ...
This patch makes the per-event settings overload the global 'c' option
setup:
$ perf record -e 'cpu/instructions,period=20000/' -c 1000 sleep 1
$ perf evlist -v
... { sample_period, sample_freq }: 20000 ...
I think the making the per-event settings to overload any other config
makes more sense than current state. However it breaks the current
'period' term handling, which might cause some noise.. so let's see ;-).
Also fixing parse event tests with the new behaviour.
Signed-off-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add support to overload any global settings for event and force user
specified term value. It will be useful for new time and backtrace
terms.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Kan Liang <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The semantic associated in tools/perf/ with foo__delete(instance) is to
release all resources referenced by 'instance' members and then release
the memory for 'instance' itself.
The perf_session_env__delete() function isn't doing this, it just does
the first part, but the space used by 'instance' itself isn't freed, as
it is embedded in a larger structure, that will be freed at other stage.
For these cases we se foo__exit(), i.e. the usage is:
void foo__delete(foo)
{
if (foo) {
foo__exit(foo);
free(foo);
}
}
But when we have something like:
struct bar {
struct foo foo;
. . .
}
Then we can't really call foo__delete(&bar.foo), we must have this
instead:
void bar__exit(bar)
{
foo__exit(&bar.foo);
/* free other bar-> resources */
}
void bar__delete(bar)
{
if (bar) {
bar__exit(bar);
free(bar);
}
}
So just rename perf_session_env__delete() to perf_session_env__exit().
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When HAVE_ELF_GETPHDRNUM_SUPPORT is false we trip on this problem:
CC /tmp/build/perf/util/symbol-elf.o
util/symbol-elf.c:41:12: error: static declaration of ‘elf_getphdrnum’ follows non-static declaration
static int elf_getphdrnum(Elf *elf, size_t *dst)
^
In file included from util/symbol.h:19:0,
from util/symbol-elf.c:8:
/usr/include/libelf.h:206:12: note: previous declaration of ‘elf_getphdrnum’ was here
extern int elf_getphdrnum (Elf *__elf, size_t *__dst);
^
MKDIR /tmp/build/perf/bench/
/home/git/linux/tools/build/Makefile.build:68: recipe for target '/tmp/build/perf/util/symbol-elf.o' failed
make[3]: *** [/tmp/build/perf/util/symbol-elf.o] Error 1
Fix it.
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To not sample, what we want are just the PERF_RECORD_ lifetime events
for threads, using the default, PERF_TYPE_HARDWARE +
PERF_COUNT_HW_CYCLES and freq=1 (the default), makes perf reenable
irq_vectors:local_timer_entry, disabling nohz, not good for some use
cases where all we want is to get notifications when threads comes and
goes...
Fix it by using PERF_TYPE_SOFTWARE (no counter rotation) and
PERF_COUNT_SW_DUMMY (created by Adrian so that we could have access to
those PERF_RECORD_ goodies).
Reported-by: Luiz Fernando Capitulino <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jaroslav Skarvada <[email protected]>
Cc: Jeremy Eder <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Those were added to the kernel and tooling but we forgot to
expose them via the python binding, fix it.
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Cc: David Ahern <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The python binding still doesn't provide symbol resolving facilities,
but the recent addition of the trace_event__register_resolver() function
made it add as a dependency the machine__resolve_kernel_addr() method,
that in turn drags all the symbol resolving code.
The problem:
[root@zoo ~]# perf test -v python
17: Try 'import perf' in python, checking link problems :
--- start ---
test child forked, pid 6853
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: /tmp/build/perf/python/perf.so: undefined symbol: machine__resolve_kernel_addr
test child finished with -1
---- end ----
Try 'import perf' in python, checking link problems: FAILED!
[root@zoo ~]#
Fix it by requiring this function to receive the resolver as a
parameter, just like pevent_register_function_resolver(), i.e. do
not explicitely refer to an object file not included in
tools/perf/util/python-ext-sources.
[root@zoo ~]# perf test python
17: Try 'import perf' in python, checking link problems : Ok
[root@zoo ~]#
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Fixes: c3168b0db93a ("perf symbols: Provide libtraceevent callback to resolve kernel symbols")
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Introduce PERF_RECORD_SWITCH(_CPU_WIDE) and use it in 'record' to
ask for context switches, allowing non priviledged tasks to know
when they are switched in and out, which wasn't possible with
the other context switch tracepoint and software events, see the
patch description for a comprehensive justification (Adrian Hunter)
- Stop collecting /proc/kallsyms in perf.data files, saving about
4.5MB on a typical x86-64 system, use the the symbol resolution
routines used in all the other tools (report, top, etc) now that
we can ask libtraceevent to use perf's symbol resolution code.
(Arnaldo Carvalho de Melo)
User visible fixes:
- Expose perf's symbol resolver to libtraceevent, so that its plugins can
resolve tracepoint fields to kernel functions, like the 'function' field
in the "timer:hrtimer_start tracepoint" (Arnaldo Carvalho de Melo)
Infrastructure changes:
- Map propagation of thread and cpu maps improvements, prep work for
'perf stat' new features (Jiri Olsa)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Add option --show-switch-events to show switch events in a similar
fashion to --show-task-events and --show-mmap-events.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The tracking event does not have to be the first event so replace
perf_evlist__first() with perf_evlist__id2evsel() which uses the event
ID to find the correct evsel.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add an option to select PERF_RECORD_SWITCH events.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support processing of PERF_RECORD_SWITCH events and
PERF_RECORD_SWITCH_CPU_WIDE events. There is a single
tools callback for them both so that the tool must
check the event type before using the extra members
in PERF_RECORD_SWITCH_CPU_WIDE.
There is still no way to select the events, though.
That is added in a subsequest patch.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are already two events for context switches, namely the tracepoint
sched:sched_switch and the software event context_switches.
Unfortunately neither are suitable for use by non-privileged users for
the purpose of synchronizing hardware trace data (e.g. Intel PT) to the
context switch.
Tracepoints are no good at all for non-privileged users because they
need either CAP_SYS_ADMIN or /proc/sys/kernel/perf_event_paranoid <= -1.
On the other hand, kernel software events need either CAP_SYS_ADMIN or
/proc/sys/kernel/perf_event_paranoid <= 1.
Now many distributions do default perf_event_paranoid to 1 making
context_switches a contender, except it has another problem (which is
also shared with sched:sched_switch) which is that it happens before
perf schedules events out instead of after perf schedules events in.
Whereas a privileged user can see all the events anyway, a
non-privileged user only sees events for their own processes, in other
words they see when their process was scheduled out not when it was
scheduled in. That presents two problems to use the event:
1. the information comes too late, so tools have to look ahead in the
event stream to find out what the current state is
2. if they are unlucky tracing might have stopped before the
context-switches event is recorded.
This new PERF_RECORD_SWITCH event does not have those problems
and it also has a couple of other small advantages.
It is easier to use because it is an auxiliary event (like mmap, comm
and task events) which can be enabled by setting a single bit. It is
smaller than sched:sched_switch and easier to parse.
To make the event useful for privileged users also, if the
context is cpu-wide then the event record will be
PERF_RECORD_SWITCH_CPU_WIDE which is the same as
PERF_RECORD_SWITCH except it also provides the next or
previous pid/tid.
Signed-off-by: Adrian Hunter <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Mathieu Poirier <[email protected]>
Cc: Pawel Moll <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Since we now ask libtraceevent, the only user of this payload, to use
perf's symbol resolution routines, there is no need to carry about
~4.5MB per perf.data when we can get it from one of the places the perf
symbol resolution looks for that symtab (debuginfo, ~/.debug/,
/proc/kallsyms, --symfs, etc), using the kernel and modules build-ids to
make sure the right table is used.
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As it is not used anymore, since 'perf script' switched to asking
libtraceevent to use tools/perf's symbol resolution routines.
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We were storing a copy of kallsyms inside perf.data file so that we
could resolve kernel addresses to function (start, name, mod) tuples,
but that can be achieved using the symbol resolving routines we have
in symbols.c, and that are used elsewhere in tools/perf.
So, do just like 'perf trace' did and ask libtraceevent to use perf's
symbol resolution routines.
The next step is to just skip whatever kallsyms data is embedded in
older perf.data files and finally to stop storing kallsyms in the perf
data file, as the 20-bytes build-id stored in perf.data's header is
enough to find out the right symtab (be it ELF, kcore, kallsyms, etc) to
use.
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that beautifiers wanting to resolve kernel function addresses to
names can do its work, now, for instance, the 'timer' tracepoints
beautifiers works with 'perf trace', see the "function=tick..." part:
# perf trace --event timer:hrtimer_start
<SNIP>
0.000 timer:hrtimer_start:hrtimer=0xffff88026f3101c0 function=tick_sched_timer/0x0 expires=52098339000000 softexpires=52098339000000)
0.003 timer:hrtimer_start:hrtimer=0xffff88026f3101c0 function=tick_sched_timer/0x0 expires=52098339000000 softexpires=52098339000000)
<SNIP>
Reported-by: Thomas Gleixner <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
That provides the function signature expected by libtraceevent's
pevent_set_function_resolver().
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The perf tools have a symbol resolver that includes solving kernel
symbols using either kallsyms or ELF symtabs, and it also is using
libtraceevent to format the trace events fields, including via
subsystem specific plugins, like the "timer" one.
To solve fields like "timer:hrtimer_start"'s "function", libtraceevent
needs a way to map from its value to a function name and addr.
This patch provides a way for tools that already have symbol resolving
facilities to ask libtraceevent to use it when needing to resolve
kernel symbols.
Reviewed-by: Steven Rostedt <[email protected]>
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To, with members we already have, check if a kernel level map is for the
kernel proper or for a module.
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Acked-by: David Ahern <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We will reuse argv style data in following change to display counters
header showing monitored command line.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Tolerating NULL maps in perf_evlist__propagate_maps, so we dont need to
pass evlist with both cpus and threads maps defined.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
We need only bool info wether user defined her own set of cpus.
Switching target argument to bool so it could be used from places
without target object defined in following patches.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Forcing perf_evlist__set_maps to propagate maps through events, so
cpu/thread maps get set within evlist.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Checking also for refcnt in thread_map test.
Signed-off-by: Jiri Olsa <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
New features:
- Allow filtering out of perf's PID via 'perf record --exclude-perf'. (Wang Nan)
- 'perf trace' now supports syscall groups, like strace, i.e:
$ trace -e file touch file
Will expand 'file' into multiple, file related, syscalls. More work needed to
add extra groups for other syscall groups, and also to complement what was
added for the 'file' group, included as a proof of concept. (Arnaldo Carvalho de Melo)
- Add lock_pi stresser to 'perf bench futex', to test the kernel code
related to FUTEX_(UN)LOCK_PI. (Davidlohr Bueso)
User visible fixes:
- Apply --filter to all events in a glob matching, not just the last one. (Wang Nan)
Documentation changes:
- Document setting '-e pmu/period=N/' in the 'perf record' man page. (Kan Liang)
Infrastructure changes:
- 'perf probe' code simplifications and movements to separate files. (Masami Hiramatsu)
- Fix makefile generation under 'dash'. (Sergei Trofimovich)
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Signed-off-by: Ingo Molnar <[email protected]>
|
|
Allows a way of measuring low level kernel implementation of FUTEX_LOCK_PI and
FUTEX_UNLOCK_PI.
The program comes in two flavors:
(i) single futex (default), all threads contend on the same uaddr. For the
sake of the benchmark, we call into kernel space even when the lock is
uncontended. The kernel will set it to TID, any waters that come in and
contend for the pi futex will be handled respectively by the kernel.
(ii) -M option for multiple futexes, each thread deals with its own futex. This
is a trivial scenario and only measures kernel handling of 0->TID transition.
Signed-off-by: Davidlohr Bueso <[email protected]>
Cc: Mel Gorman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Under dash 'echo -n' yields '-n' to stdout. Use printf "" instead.
Signed-off-by: Sergei Trofimovich <[email protected]>
Acked-by: Ingo Molnar <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce SBUILD_ID_SIZE macro and use it instead of using BUILD_ID_SIZE
* 2 + 1.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move ftrace probe-event operations to probe-file.c from probe-event.c.
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
[ Fixed up strlist__new() calls wrt 4a77e2183fc0 ("perf strlist: Make dupstr be the...") ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Simplify the __add_probe_trace_events() code by taking out the
probe_trace_event__set_name() and updating show_perf_probe_event()
Signed-off-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Hemant Kumar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This patch allows 'perf record' to exclude events issued by perf itself
by '--exclude-perf' option.
Before this patch, when doing something like:
# perf record -a -e syscalls:sys_enter_write <cmd>
One could easily get result like this:
# /tmp/perf report --stdio
...
# Overhead Command Shared Object Symbol
# ........ ....... .................. ....................
#
99.99% perf libpthread-2.18.so [.] __write_nocancel
0.01% ls libc-2.18.so [.] write
0.01% sshd libc-2.18.so [.] write
...
Where most events are generated by perf itself.
A shell trick can be done to filter perf itself out:
# cat << EOF > ./tmp
> #!/bin/sh
> exec perf record -e ... --filter="common_pid != \$\$" -a sleep 10
> EOF
# chmod a+x ./tmp
# ./tmp
However, doing so is user unfriendly.
This patch extracts evsel iteration framework introduced by patch 'perf
record: Apply filter to all events in a glob matching' into
foreach_evsel_in_last_glob(), and makes exclude_perf() function append
new filter expression to each evsel selected by a '-e' selector.
To avoid losing filters if user pass '--filter' after '--exclude-perf',
this patch uses perf_evsel__append_filter() in both case, instead of
perf_evsel__set_filter() which removes old filter. As a side effect, now
it is possible to use multiple '--filter' option for one selector. They
are combinded with '&&'.
Signed-off-by: Wang Nan <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Brendan Gregg <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fix from Martin Schwidefsky:
"Fast path fix for the thread_struct breakage"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: adapt entry.S to the move of thread_struct
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32
Pull AVR32 update from Hans-Christian Egtvedt.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/egtvedt/linux-avr32:
AVR32/time: Migrate to new 'set-state' interface
|
|
There is an old problem in perf's filter applying which first posted at
Sep. 2014 at https://lkml.org/lkml/2014/9/9/944 that, if passing
multiple events in a glob matching expression in cmdline then add
'--filter' after them, the filter will be applied on only the last one.
For example:
# dd if=/dev/zero of=/dev/null &
[1] 464
# perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.239 MB perf.data (2094 samples) ]
# perf report --stdio | tee
...
# Samples: 2K of event 'syscalls:sys_enter_read'
# Event count (approx.): 2092
...
# Samples: 2 of event 'syscalls:sys_exit_read'
# Event count (approx.): 2
...
In this example, filter only applied on 'syscalls:sys_exit_read', and
there's no way to set filter for ''syscalls:sys_enter_read'.
This patch adds a 'cmdline_group_boundary' for 'struct evsel', and
apply filter on all events between two boundary marks.
After applying this patch:
# perf record -a -e 'syscalls:sys_*_read' --filter 'common_pid != 464' sleep 0.1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.031 MB perf.data (3 samples) ]
# perf report --stdio | tee
...
# Samples: 1 of event 'syscalls:sys_enter_read'
# Event count (approx.): 1
...
# Samples: 2 of event 'syscalls:sys_exit_read'
# Event count (approx.): 2
...
Signed-off-by: Wang Nan <[email protected]>
Reported-by: Brendan Gregg <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Zefan Li <[email protected]>
Cc: [email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I.e.:
$ cat ~/share/perf-core/strace/groups/file
access
chmod
creat
execve
faccessat
getcwd
lstat
mkdir
open
openat
quotactl
readlink
rename
rmdir
stat
statfs
symlink
unlink
$
Then, on a quiet desktop, try running this and then moving your mouse to
see the deluge of mouse related activity:
# perf probe 'vfs_getname=getname_flags:72 pathname=filename:string'
Added new event:
probe:vfs_getname (on getname_flags:72 with pathname=filename:string)
You can now use it in all perf tools, such as:
perf record -e probe:vfs_getname -aR sleep 1
#
# trace --ev probe:vfs_getname --filter-pids 2232 -e file
0.042 (0.042 ms): mousetweaks/2235 open(filename: 0x14e3910, mode: 438 ) ...
0.042 ( ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/cursors/xterm")
0.100 (0.100 ms): mousetweaks/2235 ... [continued]: open()) = -1 ENOENT No such file or directory
0.142 (0.018 ms): mousetweaks/2235 open(filename: 0x14c3c10, mode: 438 ) ...
0.142 ( ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/index.theme")
0.192 (0.069 ms): mousetweaks/2235 ... [continued]: open()) = -1 ENOENT No such file or directory
0.230 (0.017 ms): mousetweaks/2235 open(filename: 0x14c3c10, mode: 438 ) ...
0.230 ( ): probe:vfs_getname:(ffffffff812230bc) pathname="/usr/share/icons/Adwaita/cursors/xterm")
0.253 (0.041 ms): mousetweaks/2235 ... [continued]: open()) = 14
0.459 (0.008 ms): mousetweaks/2235 open(filename: 0x14e3910, mode: 438 ) ...
0.459 ( ): probe:vfs_getname:(ffffffff812230bc) pathname="/home/acme/.icons/Adwaita/cursors/left_side")
0.468 (0.017 ms): mousetweaks/2235 ... [continued]: open()) = -1 ENOENT No such file or directory
Need to combine that raw_syscalls:sys_enter(open) + probe:vfs_getname +
raw_syscalls:sys_exit(open) sequence...
Now, if you're bored, please write some more syscall groups, like the ones
in 'strace' and send it our way :-)
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It is not used anywhere, expose it when/if needed.
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So, if we have an strlist equal to:
"file,close"
And we call it as:
struct strlist_config *config = { .dirname = "~/strace/groups", };
struct strlist *slist = strlist__new("file, close", &config);
And we have:
$ cat ~/strace/groups/file
access
open
openat
statfs
Then the resulting strlist will have these contents:
[ "access", "open", "openat", "statfs", "close" ]
This will be used to implement strace syscall groups in 'perf trace',
but can be used in some other tool, thus being implemented in 'strlist'.
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we can pass more info to strlist__new() without having to change
its function signature, just adding entries to the strlist_config struct
with sensible defaults for when those fields are not specified.
Cc: Adrian Hunter <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: David Ahern <[email protected]>
Cc: Frederic Weisbecker <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: http://lkml.kernel.org/n/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
git commit 0c8c0f03e3a292e031596484275c14cf39c0ab7a
"x86/fpu, sched: Dynamically allocate 'struct fpu'"
moved the thread_struct to the end of the task_struct.
This causes some of the offsets used in entry.S to overflow their
instruction operand field. To fix this use aghi to create a
dedicated pointer for the thread_struct.
Signed-off-by: Martin Schwidefsky <[email protected]>
|
|
Migrate avr32 driver to the new 'set-state' interface provided by
clockevents core, the earlier 'set-mode' interface is marked obsolete
now.
This also enables us to implement callbacks for new states of clockevent
devices, for example: ONESHOT_STOPPED.
We want to call cpu_idle_poll_ctrl() in shutdown only if we were in
oneshot or resume state earlier. Create another variable to save this
information and check that in shutdown callback.
Cc: Haavard Skinnemoen <[email protected]>
Cc: Hans-Christian Egtvedt <[email protected]>
Signed-off-by: Viresh Kumar <[email protected]>
Acked-by: Hans-Christian Egtvedt <[email protected]>
|
|
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"Two fairly simple fixes: one is a change that causes us to have a very
low queue depth leading to performance issues and the other is a null
deref occasionally in tapes thanks to use after put"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: fix host max depth checking for the 'queue_depth' sysfs interface
st: null pointer dereference panic caused by use after kref_put by st_open
|
|
Pull MIPS fixes from Ralf Baechle:
"Another round of MIPS fixes for 4.2.
Things are looking quite decent at this stage but the recent work on
the FPU support took its toll:
- fix an incorrect overly restrictive ifdef
- select O32 64-bit FP support for O32 binary compatibility
- remove workarounds for Sibyte SB1250 Pass1 parts. There are rare
fixing the workarounds is not worth the effort.
- patch up an outdated and now incorrect comment"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: fpu.h: Allow 64-bit FPU on a 64-bit MIPS R6 CPU
MIPS: SB1: Remove support for Pass 1 parts.
MIPS: Require O32 FP64 support for MIPS64 with O32 compat
MIPS: asm-offset.c: Patch up various comments refering to the old filename.
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fix from Helge Deller:
"A memory leak fix from Christophe Jaillet which was introduced with
kernel 4.0 and which leads to kernel crashes on parisc after 1-3 days"
* 'parisc-4.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: mm: Fix a memory leak related to pmd not attached to the pgd
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Pull ARM SoC fixes from Olof Johansson:
"By far most of the fixes here are updates to DTS files to deal with
some mostly minor bugs.
There's also a fix to deal with non-PM kernel configs on i.MX, a
regression fix for ethernet on PXA platforms and a dependency fix for
OMAP"
* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
ARM: keystone: dts: rename pcie nodes to help override status
ARM: keystone: dts: fix dt bindings for PCIe
ARM: pxa: fix dm9000 platform data regression
ARM: dts: Correct audio input route & set mic bias for am335x-pepper
ARM: OMAP2+: Add HAVE_ARM_SCU for AM43XX
MAINTAINERS: digicolor: add dts files
ARM: ux500: fix MMC/SD card regression
ARM: ux500: define serial port aliases
ARM: dts: OMAP5: Add #iommu-cells property to IOMMUs
ARM: dts: OMAP4: Add #iommu-cells property to IOMMUs
ARM: dts: Fix frequency scaling on Gumstix Pepper
ARM: dts: configure regulators for Gumstix Pepper
ARM: dts: omap3: overo: Update LCD panel names
ARM: dts: cros-ec-keyboard: Add support for some Japanese keys
ARM: imx6: gpc: always enable PU domain if CONFIG_PM is not set
ARM: dts: imx53-qsb: fix TVE entry
ARM: dts: mx23: fix iio-hwmon support
ARM: dts: imx27: Adjust the GPT compatible string
ARM: socfpga: dts: Fix entries order
ARM: socfpga: dts: Fix adxl34x formating and compatible string
|