Age | Commit message (Collapse) | Author | Files | Lines |
|
On large systems, cgroups can be created and deleted often. That means
there's a race between perf tools and cgroups when it gets the cgroup
name and opens the cgroup.
I got a report that 'perf stat' with many cgroups failed quite often due
to the missing cgroups on such a large machine.
I think we can ignore such cgroups when expanding events and use id 0 if
it fails to read the cgroup id. IIUC 0 is not a vaild cgroup id so it
won't update event counts for the failed cgroups.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Tracepoints can start with digits, although we don't have many of these:
$ rg -g '*.h' '\bTRACE_EVENT\([0-9]'
net/mac802154/trace.h
53:TRACE_EVENT(802154_drv_return_int,
...
net/ieee802154/trace.h
66:TRACE_EVENT(802154_rdev_add_virtual_intf,
...
include/trace/events/9p.h
124:TRACE_EVENT(9p_client_req,
...
Just allow names to start with digits too so e.g. "perf trace -e '9p:*'"
works
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Dominique Martinet <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The next commit will allow tracepoints starting with digits, but most
systems do not have any available by default so tests should skip the
actual "check if it exists in /sys/kernel/debug/tracing" step.
In order to do that, add a new boolean flag specifying if we should
actually "format" the probe or not.
Originally-by: Jiri Olsa <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Dominique Martinet <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The next patch will add another flag to parse_state that we will want to
pass to evsel__newtp_idx(), so pass the whole parse_state all the way
down instead of giving only the index
Originally-by: Jiri Olsa <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Dominique Martinet <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The linked commit updated dso__load_vmlinux() to call
dso__set_long_name() before loading the symbols. Loading the symbols may
not succeed but dso__set_long_name() takes ownership of the string. The
two callers of this function free the string themselves on failure
cases, resulting in the following error:
$ perf record -- ls
$ perf report
free(): double free detected in tcache 2
Fix it by always taking ownership of the string, even on failure. This
means the string is either freed at the very first early exit condition,
or later when the dso is deleted or the long name is replaced. Now no
special return value is needed to signify that the caller needs to
free the string.
Fixes: e59fea47f83e8a9a ("perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When loading kcore, the main vmlinux map is updated in the same loop
that merges the remaining maps. If a map that overlaps is merged in
before kcore, the list can become unsortable when the main map addresses
are updated. This will later trigger the check_invariants() assert:
$ perf record
$ perf report
util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
map__start(map) || map__start(prev) == map__start(map)' failed.
Aborted
Fix it by moving the main map update prior to the loop so that
maps__merge_in() can split it if necessary.
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
maps__merge_in() hard codes the steps to free the maps_by_name list. It
seems to not map__put() each element before freeing, and it sets
maps_by_name_sorted to true after freeing, which may be harmless but
is inconsistent with maps__init() and other functions.
maps__maps_by_name_addr() is also quite hard to read because we already
have maps__maps_by_name() and maps__maps_by_address(), but the function
is only used in that place so delete it.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make the order of operations remove, update, add. Updating addresses
before the map is removed causes the ordering check to fail when the map
is removed. This can be reproduced when running Perf on an Arm system
with a static kernel and Perf uses kcore rather than other sources:
$ perf record -- ls
$ perf report
util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
map__start(map) || map__start(prev) == map__start(map)' failed
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In is_valid_tracepoint, rather than scanning
"/sys/kernel/tracing/events/*/*" skipping any path where
"/sys/kernel/tracing/events/*/*/id" doesn't exist, and then testing if
"*:*" matches the tracepoint name, just use the given tracepoint name
replace the ':' with '/' and see if the id file exists.
This turns a nested directory search into a single file available test.
Rather than return 1 for valid and 0 for invalid, return true and false.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
check_allowed_ops() is used from both HAVE_DWARF_GETLOCATIONS_SUPPORT
and HAVE_DWARF_CFI_SUPPORT sections, so move it into the right place so
that it's available when either are defined. This shows up when doing
a static cross compile for arm64:
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- LDFLAGS="-static" \
EXTRA_PERFLIBS="-lexpat"
util/dwarf-aux.c:1723:6: error: implicit declaration of function 'check_allowed_ops'
Fixes: 55442cc2f22d0727 ("perf dwarf-aux: Check allowed DWARF Ops")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Freeing the thread on failure won't work with reference count checking,
use thread__delete().
Don't allocate the comm_str, use a stack allocation instead.
Fixes: f6005cafebab72f8 ("perf thread: Add reference count checking")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Searching for the entry in the array needs to avoid the intermediate
pointer with reference count checking.
Refactor the array removal to binary search for the entry.
Change the array to hold an entry with a reference count (so the
intermediate pointer can work) and remove from the array when the
reference count on a comm_str falls to 1.
Fixes: 13ca628716c6f2c3 ("perf comm: Add reference count checking to 'struct comm_str'")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's confusing both pointers and arrays are printed as *. Let's print
array types with [] so that we can identify them easily. Although it's
interchangable, sometimes it can cause confusion with size like in the
below example.
Note that it is not the same with C syntax where it goes to the variable
names, but we want to have it in the type names (like in Go language).
Before:
mov [20] 0x68(reg5) -> reg0 type='struct page**' size=0x80 (die:0x4e61d32)
After:
mov [20] 0x68(reg5) -> reg0 type='struct page*[]' size=0x80 (die:0x4e61d32)
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'struct mem_info' is reference counted while 'struct branch_info' and
he_cache (struct hist_entry **) are not.
Break apart the priv field in 'struct hist_entry_iter' so that we can
know which values are owned by the iter and do the appropriate free or
put.
Move hide_unresolved to marginally shrink the size of the now grown
struct.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add reference count checking and switch 'struct mem_info' usage to use
accessor functions.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move mem-info to its own header rather than having it split between
mem-events and symbol.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Reference count checking of an rbtree is troublesome as each pointer
should have a reference, switch to using a sorted array.
Remove an indirection by embedding the reference count with the string.
Use pthread_once to safely initialize the comm_strs and reader writer
mutex.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It is assigned a value of 1 and never incremented. Remove and replace
puts with delete.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
block_info__get() has no callers so the refcount is only ever one. As
such remove the reference counting logic and turn puts to deletes.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Freeing hash map doesn't free the entries added to the hashmap, add
the missing free().
Fixes: d3e7cad6f36d9e80 ("perf annotate: Add a hashmap for symbol histogram")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Tim Chen <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently it's only possible to initialize with the default number of
queues and then use auxtrace_queues__add_event() to grow the array.
But that's problematic if you don't have a real event to pass into that
function yet.
The queues hold a void *priv member to store custom state, and for
Coresight we want to create decoders upfront before receiving data, so
add a new function that allows pre-allocating queues.
One reason to do this is because we might need to store metadata (HW_ID
events) that effects other queues, but never actually receive auxtrace
data on that queue.
Reviewed-by: Anshuman Khandual <[email protected]>
Signed-off-by: James Clark <[email protected]>
Tested-by: Ganapatrao Kulkarni <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steve Clevenger <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The likely fix for this is to update perf so print a helpful message.
Signed-off-by: James Clark <[email protected]>
Tested-by: Ganapatrao Kulkarni <[email protected]>
Acked-by: Anshuman Khandual <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Steve Clevenger <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Fix a comment in function which explains how multi_regs field gets set
for an instruction. In the example, "mov %rsi, 8(%rbx,%rcx,4)", the
comment mistakenly referred to "dst_multi_regs = 0". Correct it to use
"src_multi_regs = 0"
Signed-off-by: Athira Rajeev <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Akanksha J N <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Segher Boessenkool <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When freeing a->b it is good practice to set a->b to NULL using
zfree(&a->b) so that when we have a bug where a reference to a freed 'a'
pointer is kept somewhere, we can more quickly cause a segfault if some
code tries to use a->b.
Convert one such case in the callchain code.
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/ZjmcGobQ8E52EyjJ@x1
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When freeing a->b it is good practice to set a->b to NULL using
zfree(&a->b) so that when we have a bug where a reference to a freed 'a'
pointer is kept somewhere, we can more quickly cause a segfault if some
code tries to use a->b.
This is mostly done but some new cases were introduced recently, convert
them to zfree().
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/ZjmbHHrjIm5YRIBv@x1
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The dso pointer in 'struct dso_data' is necessary for reference count
checking to account for the dso_data forming a global list of open dso's
with references to the dso.
The dso pointer also allows for the indirection that reference count
checking needs. Outside of reference count checking the indirection
isn't needed and container_of() is more efficient and saves space.
The reference count won't be increased by placing items onto the global
list, matching how things were before the reference count checking
change, but we assert the dso is in dsos holding it live (and that the
set of open dsos is a subset of all dsos for the machine).
Update the DSO data tests so that they use a dsos struct to make the
invariant true.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
dso__load_sym_internal() passed curr_mapp as an out argument to
dso__process_kernel_symbol(). The out argument was never used so remove
it to simplify the reference counting logic.
Simplify reference counting issues with curr_dso by ensuring the value
it points to has a +1 reference count, and then putting as
necessary.
This avoids some reference counting games when the dso is created making
the code more obviously correct with some possible introduced overhead
due to the reference counting get/puts.
This, however, silences reference count checking and we can always
optimize from a seemingly correct point.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The dso__put() after the map creation causes a use after put in
dso__set_loaded().
To ensure there is a +1 reference count on both sides of the if-else, do
a dso__get() on the found map's dso.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
A dso__put() is needed for the dsos__find() when the map is created and
a buildid is sought.
Fixes: f649ed80f3cabbf1 ("perf dsos: Tidy reference counting and locking")
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.
The majority of the change is mechanical in nature and not easy to
split up.
Committer testing:
'perf test' up to this patch shows no regressions.
But:
util/symbol.c: In function ‘dso__load_bfd_symbols’:
util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
1683 | dso__set_adjust_symbols(dso);
| ^~~~~~~~~~~~~~~~~~~~~~~
In file included from util/symbol.c:21:
util/dso.h:268:20: note: declared here
268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
| ^~~~~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/
make[6]: *** Waiting for unfinished jobs....
This was updated:
- symbols__fixup_end(&dso->symbols, false);
- symbols__fixup_duplicate(&dso->symbols);
- dso->adjust_symbols = 1;
+ symbols__fixup_end(dso__symbols(dso), false);
+ symbols__fixup_duplicate(dso__symbols(dso));
+ dso__set_adjust_symbols(dso);
But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).
Add the missing argument:
symbols__fixup_end(dso__symbols(dso), false);
symbols__fixup_duplicate(dso__symbols(dso));
- dso__set_adjust_symbols(dso);
+ dso__set_adjust_symbols(dso, true);
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahelenia Ziemiańska <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dima Kogan <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Switch to using the bsearch library function rather than having a hand
written binary search. Const-ify some static functions to avoid compiler
warnings.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahelenia Ziemiańska <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dima Kogan <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Function was only called in dsos.c with the dso parameter as
NULL. Remove the function and specialize for the dso being NULL case
removing other unused functions along the way.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahelenia Ziemiańska <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dima Kogan <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Function no longer used so remove.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahelenia Ziemiańska <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dima Kogan <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
DSOs were held on a list for fast iteration and in an rbtree for fast
finds.
Switch to using a lazily sorted array where iteration is just iterating
through the array and binary searches are the same complexity as
searching the rbtree.
The find may need to sort the array first which does increase the
complexity, but add operations have lower complexity and overall the
complexity should remain about the same.
The set name operations on the dso just records that the array is no
longer sorted, avoiding complexity in rebalancing the rbtree.
Tighter locking discipline is enforced to avoid the array being resorted
while long and short names or ids are changed.
The array is smaller in size, replacing 6 pointers with 2, and so even
with extra allocated space in the array, the array may be 50%
unoccupied, the memory saving should be at least 2x.
Committer testing:
On a previous version of this patchset we were getting a lot of warnings
about deleting a DSO still on a list, now it is ok:
root@x1:~# perf probe -l
root@x1:~# perf probe finish_task_switch
Added new event:
probe:finish_task_switch (on finish_task_switch)
You can now use it in all perf tools, such as:
perf record -e probe:finish_task_switch -aR sleep 1
root@x1:~# perf probe -l
probe:finish_task_switch (on finish_task_switch@kernel/sched/core.c)
root@x1:~# perf trace -e probe:finish_task_switch/max-stack=8/ --max-events=1
0.000 migration/0/19 probe:finish_task_switch(__probe_ip: -1894408688)
finish_task_switch.isra.0 ([kernel.kallsyms])
__schedule ([kernel.kallsyms])
schedule ([kernel.kallsyms])
smpboot_thread_fn ([kernel.kallsyms])
kthread ([kernel.kallsyms])
ret_from_fork ([kernel.kallsyms])
ret_from_fork_asm ([kernel.kallsyms])
root@x1:~#
root@x1:~# perf probe -d probe:*
Removed event: probe:finish_task_switch
root@x1:~# perf probe -l
root@x1:~#
I also ran the full 'perf test' suite after applying this one, no
regressions.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ahelenia Ziemiańska <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dima Kogan <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Tiezhu Yang <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Perf event names aren't case sensitive. For sysfs events the entire
directory of events is read then iterated comparing names in a case
insensitive way, most often to see if an event is present.
Consider:
$ perf stat -e inst_retired.any true
The event inst_retired.any may be present in any PMU, so every PMU's
sysfs events are loaded and then searched with strcasecmp to see if
any match. This event is only present on the cpu PMU as a JSON event
so a lot of events were loaded from sysfs unnecessarily just to prove
an event didn't exist there.
This change avoids loading all the events by assuming sysfs event
names are always either lower or uppercase. It uses file exists and
only loads the events when the desired event is present.
For the example above, the number of openat calls measured by 'perf
trace' on a tigerlake laptop goes from 325 down to 255. The reduction
will be larger for machines with many PMUs, particularly replicated
uncore PMUs.
Ensure pmu_aliases_parse() is called before all uses of the aliases
list, but remove some "pmu->sysfs_aliases_loaded" tests as they are now
part of the function.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Allow events/aliases to be eagerly loaded for a PMU. Factor out the
pmu_aliases_parse to allow this.
Parse a test event and check it configures the attribute as expected.
There is overlap with the parse-events tests, but this test is done with
a PMU created in a temp directory and doesn't rely on PMUs in sysfs.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In tests/pmu.c, make a common utility that creates a PMU in a mkdtemp
directory and uses regular PMU parsing logic to load that PMU. Formats
must still be eagerly loaded as by default the PMU code assumes devices
are going to be in sysfs.
In util/pmu.[ch], hide perf_pmu__format_parse but add the eager argument
to perf_pmu__lookup called by perf_pmus__add_test_pmu. Later patches
will eagerly load other non-sysfs files when eager loading is enabled.
In tests/pmu.c, rather than manually constructing a list of term
arguments, just use the term parsing code from a string.
Add more comments and debug logging.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Bjorn Helgaas <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jing Zhang <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Randy Dunlap <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Thomas Richter <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I found that the debug build was a slowed down a lot by the maps lock
code since it checks the invariants whenever it gets the pointer to the
lock. This means it checks twice the invariants before and after the
access.
Instead, let's move the checking code within the lock area but after any
modification and remove it from the read paths. This would remove (more
than) half of the maps lock overhead.
The time for perf report with a huge data file (200k+ of MMAP2 events).
Non-debug Before After
--------- -------- --------
2m 43s 6m 45s 4m 21s
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I sometimes see ("unknown type") in the result and it was because it
didn't check the type of stack variables properly during the instruction
tracking. The stack can carry constant values (without type info) and
if the target instruction is accessing the stack location, it resulted
in the "unknown type".
Maybe we could pick one of integer types for the constant, but it
doesn't really mean anything useful. Let's just drop the stack slot if
it doesn't have a valid type info.
Here's an example how it got the unknown type.
Note that 0xffffff48 = -0xb8.
-----------------------------------------------------------
find data type for 0xffffff48(reg6) at ...
CU for ...
frame base: cfa=0 fbreg=6
scope: [2/2] (die:11cb97f)
bb: [37 - 3a]
var [37] reg15 type='int' size=0x4 (die:0x1180633)
bb: [40 - 4b]
mov [40] imm=0x1 -> reg13
var [45] reg8 type='sigset_t*' size=0x8 (die:0x11a39ee)
mov [45] imm=0x1 -> reg2 <--- here reg2 has a constant
bb: [215 - 237]
mov [218] reg2 -> -0xb8(stack) constant <--- and save it to the stack
mov [225] reg13 -> -0xc4(stack) constant
call [22f] find_task_by_vgpid
call [22f] return -> reg0 type='struct task_struct*' size=0x8 (die:0x11881e8)
bb: [5c8 - 5cf]
bb: [2fb - 302]
mov [2fb] -0xc4(stack) -> reg13 constant
bb: [13b - 14d]
mov [143] 0xd50(reg3) -> reg5 type='struct task_struct*' size=0x8 (die:0xa31f3c)
bb: [153 - 153]
chk [153] reg6 offset=0xffffff48 ok=0 kind=0 fbreg <--- access here
found by insn track: 0xffffff48(reg6) type-offset=0
type='G<EF>^K<F6><AF>U' size=0 (die:0xffffffffffffffff)
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The instruction tracking should be the same for the both registers.
Just do it once and compare the result with multi regs as with the
previous patches.
Then we don't need to call find_data_type_block() separately for each
reg.
Let's remove the 'reg' argument from the relevant functions.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The following instruction pattern is used to access a global variable.
mov $0x231c0, %rax
movsql %edi, %rcx
mov -0x7dc94ae0(,%rcx,8), %rcx
cmpl $0x0, 0xa60(%rcx,%rax,1) <<<--- here
The first instruction set the address of the per-cpu variable (here, it
is 'runqueues' of type 'struct rq'). The second instruction seems like
a cpu number of the per-cpu base. The third instruction get the base
offset of per-cpu area for that cpu. The last instruction compares the
value of the per-cpu variable at the offset of 0xa60.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Like per-cpu base offset array, sometimes it accesses the global
variable directly using the offset. Allow this type of instructions as
long as it finds a global variable for the address.
movslq %edi, %rcx
mov -0x7dc94ae0(,%rcx,8), %rcx <<<--- here
As %rcx has a valid type (i.e. array index) from the first instruction,
it will be checked by the first case in check_matching_type(). But as
it's not a pointer type, the match will fail. But in this case, it
should check if it accesses the kernel global array variable.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently it looks up global variables from the current CU using address
and name. But it sometimes fails to find a variable as the variable can
come from a different CU - but it's still strange it failed to find a
declaration for some reason.
Anyway, it can collect all global variables from all CU once and then
lookup them later on. This slightly improves the success rate of my
test data set.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This function is to search all global variables in the CU. We want to
have the list of global variables at once and match them later.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
MemorySanitizer discovered instances where the instruction op value was
not assigned.:
WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x5581c00a76b3 in intel_pt_sample_flags tools/perf/util/intel-pt.c:1527:17
Uninitialized value was stored to memory at
#0 0x5581c005ddf8 in intel_pt_walk_insn tools/perf/util/intel-pt-decoder/intel-pt-decoder.c:1256:25
The op value is used to set branch flags for branch instructions
encountered when walking the code, so fix by setting op to
INTEL_PT_OP_OTHER in other cases.
Fixes: 4c761d805bb2d2ea ("perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type")
Reported-by: Ian Rogers <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Closes: https://lore.kernel.org/linux-perf-users/[email protected]/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
dso__disassemble_filename() tries to get the filename for objdump (or
capstone) using build-id. But I found sometimes it didn't disassemble
some functions.
It turned out that those functions belong to a DSO which has no binary
type set. It seems it sets the binary type for some special files only
- like kernel (kallsyms or kcore) or BPF images. And there's a logic to
skip dso with DSO_BINARY_TYPE__NOT_FOUND.
As it's checked the build-id cache link, it should set the binary type
as DSO_BINARY_TYPE__BUILD_ID_CACHE.
Fixes: 873a83731f1cc85c ("perf annotate: Skip DSOs not found")
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I found some cases that capstone failed to disassemble. Probably my
capstone is an old version but anyway there's a chance it can fail. And
then it silently stopped in the middle. In my case, it didn't
understand "RDPKRU" instruction.
Let's check if the capstone disassemble reached the end of the function
and fallback to objdump if not.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
'perf report' TUI
As it removed the sample accounting for code when no symbol sort key is
given for 'perf report' TUI, it might not have allocated the
'struct annotated_source' yet. Let's check if it's NULL first.
Fixes: 6cdd977ec24e1538 ("perf report: Do not collect sample histogram unnecessarily")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add comments. Pass ownership of the event name to save on a strdup.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add comments. Ensure leader->group_name is freed before overwriting
it.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|