Age | Commit message (Collapse) | Author | Files | Lines |
|
Moves 352 bytes from .data to .data.rel.ro.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use a struct/bitmap rather than a copied string from lexer.
In lexer give improved error message when too many precise flags are
given or repeated modifiers.
Before:
$ perf stat -e 'cycles:kuk' true
event syntax error: 'cycles:kuk'
\___ Bad modifier
...
$ perf stat -e 'cycles:pppp' true
event syntax error: 'cycles:pppp'
\___ Bad modifier
...
$ perf stat -e '{instructions:p,cycles:pp}:pp' -a true
event syntax error: '..cycles:pp}:pp'
\___ Bad modifier
...
After:
$ perf stat -e 'cycles:kuk' true
event syntax error: 'cycles:kuk'
\___ Duplicate modifier 'k' (kernel)
...
$ perf stat -e 'cycles:pppp' true
event syntax error: 'cycles:pppp'
\___ Maximum precise value is 3
...
$ perf stat -e '{instructions:p,cycles:pp}:pp' true
event syntax error: '..cycles:pp}:pp'
\___ Maximum combined precise value is 3, adding precision to "cycles:pp"
...
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Inline parse_events_evlist_error that is only used in
parse_events_error. Modify parse_events_error to not report a parser
error unless errors haven't already been reported. Make it clearer
that the latter case only happens for unrecognized input.
Before:
$ perf stat -e 'cycles/period=99999999999999999999/' true
event syntax error: 'cycles/period=99999999999999999999/'
\___ parser error
event syntax error: '..les/period=99999999999999999999/'
\___ Bad base 10 number "99999999999999999999"
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
$ perf stat -e 'cycles:xyz' true
event syntax error: 'cycles:xyz'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
After:
$ perf stat -e 'cycles/period=99999999999999999999/xyz' true
event syntax error: '..les/period=99999999999999999999/xyz'
\___ Bad base 10 number "99999999999999999999"
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
$ perf stat -e 'cycles:xyz' true
event syntax error: 'cycles:xyz'
\___ Unrecognized input
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use the error handler from the parse_state to give a more informative
error message.
Before:
$ perf stat -e 'cycles/period=99999999999999999999/' true
event syntax error: 'cycles/period=99999999999999999999/'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
After:
$ perf stat -e 'cycles/period=99999999999999999999/' true
event syntax error: 'cycles/period=99999999999999999999/'
\___ parser error
event syntax error: '..les/period=99999999999999999999/'
\___ Bad base 10 number "99999999999999999999"
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event> event selector. use 'perf list' to list available events
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The helper function just wraps a splice and free. Making the free
inline removes a comment, so then it just wraps a splice which we can
make inline too.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It was requested that RISC-V be able to add events to the perf tool so
the PMU driver didn't need to map legacy events to config encodings:
https://lore.kernel.org/lkml/[email protected]/
This change makes the priority of events specified without a PMU the
same as those specified with a PMU, namely sysfs and JSON events are
checked first before using the legacy encoding.
The hw_term is made more generic as a hardware_event that encodes a
pair of string and int value, allowing parse_events_multi_pmu_add to
fall back on a known encoding when the sysfs/JSON adding fails for
core events. As this covers PE_VALUE_SYM_HW, that token is removed and
related code simplified.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Allow the term list to be const so that other functions can pass const
term lists. Add const as necessary to called functions.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Avoid duplicate logic for name_or_raw and PE_TERM_HW by having a rule
to turn PE_TERM_HW into a name_or_raw.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Prior behavior is to not look for legacy cache names in sysfs/JSON and
to create events on all core PMUs. New behavior is to look for
sysfs/JSON events first on all PMUs, for core PMUs add a legacy event
if the sysfs/JSON event isn't present.
This is done so that there is consistency with how event names in
terms are handled and their prioritization of sysfs/JSON over
legacy. It may make sense to use a legacy cache event name as an event
name on a non-core PMU so we should allow it.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move all implementation to pmu code. Don't allocate a fnmatch wildcard
pattern, matching ignoring the suffix already handles this, and only
use fnmatch if the given PMU name has a '*' in it.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In parse_events_add_pmu, delay copying the list of terms until it is
known the list contains terms.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Avoid passing the name of a PMU then finding it again, just directly
pass the PMU. parse_events_multi_pmu_add_or_add_pmu() is the only version
that needs to find a PMU, so move the find there. Remove the error
message as parse_events_multi_pmu_add_or_add_pmu will given an error at
the end when a name isn't either a PMU name or event name. Without the
error message being created the location in the input parameter (loc)
can be removed.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Factor out the case of an event or PMU name followed by a slash based
term list. This is with a view to sharing the code with new legacy
hardware parsing. Use early return to reduce indentation in the code.
Make parse_events_add_pmu static now it doesn't need sharing with
parse-events.y.
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Tested-by: Atish Patra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Beeman Strong <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/ti/icssg/icssg_prueth.c
net/mac80211/chan.c
89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link")
87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()")
https://lore.kernel.org/all/[email protected]/
net/unix/garbage.c
1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().")
4090fa373f0e ("af_unix: Replace garbage collection algorithm.")
drivers/net/ethernet/ti/icssg/icssg_prueth.c
drivers/net/ethernet/ti/icssg/icssg_common.c
4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()")
e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file")
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
To pick up fixes sent via perf-tools, by Namhyung Kim.
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
This is a common failure mode when probing userspace C++ code (where the
mangling adds significant length to the symbol names).
Prior to this patch, only a very generic error message is produced,
making the user guess at what the issue is.
Signed-off-by: Dima Kogan <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In several places we had
char buf[64];
...
snprintf(buf, 64, ...);
This patch changes it to
char buf[64];
...
snprintf(buf, sizeof(buf), ...);
so the "64" is only stated once.
Signed-off-by: Dima Kogan <[email protected]>
Acked-by: Masami Hiramatsu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Hardware counter and event information could be used to help creating event
groups that better utilize hardware counters and improve multiplexing.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Weilin Wang <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Caleb Biggers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Perry Taylor <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Samantha Alt <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When cross-compiling perf with libelf, the following error occurred:
In file included from tests/genelf.c:14:
tests/../util/genelf.h:50:2: error: #error "unsupported architecture"
50 | #error "unsupported architecture"
| ^~~~~
tests/../util/genelf.h:59:5: warning: "GEN_ELF_CLASS" is not defined, evaluates to 0 [-Wundef]
59 | #if GEN_ELF_CLASS == ELFCLASS64
Fix this by adding GEN-ELF-ARCH and GEN-ELF-CLASS definitions for rv32.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Chen Pei <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Paul Walmsley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add weight1, weight2 and weight3 fields to -F/--fields and their aliases
like 'ins_lat', 'p_stage_cyc' and 'retire_lat'. Note that they are in
the sort keys too but the difference is that output fields will sum up
the weight values and display the average.
In the sort key, users can see the distribution of weight value and I
think it's confusing we have local vs. global weight for the same weight.
For example, I experiment with mem-loads events to get the weights. On
my laptop, it seems only weight1 field is supported.
$ perf mem record -- perf test -w noploop
Let's look at the noploop function only. It has 7 samples.
$ perf script -F event,ip,sym,weight | grep noploop
# event weight ip sym
cpu/mem-loads,ldlat=30/P: 43 55b3c122bffc noploop
cpu/mem-loads,ldlat=30/P: 48 55b3c122bffc noploop
cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight
cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight
cpu/mem-loads,ldlat=30/P: 59 55b3c122bffc noploop
cpu/mem-loads,ldlat=30/P: 33 55b3c122bffc noploop
cpu/mem-loads,ldlat=30/P: 38 55b3c122bffc noploop <--- same weight
When you use the 'weight' sort key, it'd show entries with a separate
weight value separately. Also note that the first entry has 3 samples
with weight value 38, so they are displayed together and the weight
value is the sum of 3 samples (114 = 38 * 3).
$ perf report -n -s +weight | grep -e Weight -e noploop
# Overhead Samples Command Shared Object Symbol Weight
0.53% 3 perf perf [.] noploop 114
0.18% 1 perf perf [.] noploop 59
0.18% 1 perf perf [.] noploop 48
0.18% 1 perf perf [.] noploop 43
0.18% 1 perf perf [.] noploop 33
If you use 'local_weight' sort key, you can see the actual weight.
$ perf report -n -s +local_weight | grep -e Weight -e noploop
# Overhead Samples Command Shared Object Symbol Local Weight
0.53% 3 perf perf [.] noploop 38
0.18% 1 perf perf [.] noploop 59
0.18% 1 perf perf [.] noploop 48
0.18% 1 perf perf [.] noploop 43
0.18% 1 perf perf [.] noploop 33
But when you use the -F/--field option instead, you can see the average
weight for the while noploop function (as it won't group samples by
weight value and use the default 'comm,dso,sym' sort keys).
$ perf report -n -F +weight | grep -e Weight -e noploop
Warning:
--fields weight shows the average value unlike in the --sort key.
# Overhead Samples Weight1 Command Shared Object Symbol
1.23% 7 42.4 perf perf [.] noploop
The weight1 field shows the average value:
(38 * 3 + 59 + 48 + 43 + 33) / 7 = 42.4
Also it'd show the warning that 'weight' field has the average value.
Using 'weight1' can remove the warning.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Like period and sample numbers, it'd be better to track weight values
and display them in the output rather than having them as sort keys.
This patch just adds a few more fields to save the weights in a hist
entry. It'll be displayed as new output fields in the later patch.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's strange that sort.h has the definition of struct hist_entry. As
sort.h already includes hist.h, let's move the data structure to hist.h.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In some cases, the stack pointer on x86 (rsp = reg7) is used to point
variables on stack but it's not the frame base register. Then it
should handle the register like normal registers (IOW not to access
the other stack variables using offset calculation) but it should not
assume it would have a pointer.
Before:
-----------------------------------------------------------
find data type for 0x7c(reg7) at tcp_getsockopt+0xb62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
no pointer or no type
check variable "zc" failed (die: 0x7b9580a)
variable location: base=reg7, offset=0x40
type='struct tcp_zerocopy_receive' size=0x40 (die:0x7b947f4)
After:
-----------------------------------------------------------
find data type for 0x7c(reg7) at tcp_getsockopt+0xb62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
found "zc" in scope=3/3 (die: 0x7b957fc) type_offset=0x3c
variable location: base=reg7, offset=0x40
type='struct tcp_zerocopy_receive' size=0x40 (die:0x7b947f4)
Note that the type-offset was properly calculated to 0x3c as the
variable starts at 0x40.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In match_var_offset(), it just checked the end address of the variable
with the given offset because it assumed the register holds a pointer
to the data type and the offset starts from the base.
But I found some cases that the stack pointer (rsp = reg7) register is
used to pointer a stack variable while the frame base is maintained by a
different register (rbp = reg6). In that case, it cannot simply use the
stack pointer as it cannot guarantee that it points to the frame base.
So it needs to check both boundaries of the variable location.
Before:
-----------------------------------------------------------
find data type for 0x7c(reg7) at tcp_getsockopt+0xb62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
no pointer or no type
check variable "tss" failed (die: 0x7b95801)
variable location: base reg7, offset=0x110
type='struct scm_timestamping_internal' size=0x30 (die:0x7b8c126)
So the current code just checks register number for the non-PC and
non-FB registers and assuming it has offset 0. But this variable has
offset 0x110 so it should not match to this.
After:
-----------------------------------------------------------
find data type for 0x7c(reg7) at tcp_getsockopt+0xb62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
no pointer or no type
check variable "zc" failed (die: 0x7b9580a)
variable location: base=reg7, offset=0x40
type='struct tcp_zerocopy_receive' size=0x40 (die:7b947f4)
Now it find the correct variable "zc". It was located at reg7 + 0x40
and the size if 0x40 which means it should cover [0x40, 0x80). And the
access was for reg7 + 0x7c so it found the right one. But it still
failed to use the variable and it would be handled in the next patch.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In match_var_offset(), it checks the offset range with the target type
only for non-pointer types. But it also needs to check the pointer
types with the target type.
This is because there can be more than one pointer variable located in
the same register. Let's look at the following example. It's looking
up a variable for reg3 at tcp_get_info+0x62. It found "sk" variable but
it wasn't the right one since it accesses beyond the target type (struct
'sock' in this case) size.
-----------------------------------------------------------
find data type for 0x7bc(reg3) at tcp_get_info+0x62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
offset: 1980 is bigger than size: 760
check variable "sk" failed (die: 0x7b92b2c)
variable location: reg3
type='struct sock' size=0x2f8 (die:0x7b63c3a)
Actually there was another variable "tp" in the function and it's
located at the same (reg3) because it's just type-casted like below.
void tcp_get_info(struct sock *sk, struct tcp_info *info)
{
const struct tcp_sock *tp = tcp_sk(sk);
...
The 'struct tcp_sock' contains the 'struct sock' at offset 0 so it can
just use the same address as a pointer to tcp_sock. That means it
should match variables correctly by checking the offset and size.
Actually it cannot distinguish if the offset was smaller than the size
of the original struct sock. But I think it's fine as they are the same
at that part.
So let's check the target type size and retry if it doesn't match.
Now it succeeded to find the correct variable.
-----------------------------------------------------------
find data type for 0x7bc(reg3) at tcp_get_info+0x62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
found "tp" in scope=1/1 (die: 0x7b92b16) type_offset=0x7bc
variable location: reg3
type='struct tcp_sock' size=0xa68 (die:0x7b81380)
Fixes: bc10db8eb8955fbc ("perf annotate-data: Support stack variables")
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To verify it found the correct variable, let's add the location
expression to the debug message.
$ perf --debug type-profile annotate --data-type
...
-----------------------------------------------------------
find data type for 0xaf0(reg15) at schedule+0xeb
CU for kernel/sched/core.c (die:0x1180523)
frame base: cfa=0 fbreg=6
found "rq" in scope=3/4 (die: 0x11b6a00) type_offset=0xaf0
variable location: reg15
type='struct rq' size=0xfc0 (die:0x11892e2)
-----------------------------------------------------------
find data type for 0x7bc(reg3) at tcp_get_info+0x62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
offset: 1980 is bigger than size: 760
check variable "sk" failed (die: 0x7b92b2c)
variable location: reg3
type='struct sock' size=0x2f8 (die:0x7b63c3a)
-----------------------------------------------------------
...
The first case is fine. It looked up a data type in r15 with offset of
0xaf0 at schedule+0xeb. It found the CU die and the frame base info and
the variable "rq" was found in the scope 3/4. Its location is the r15
register and the type size is 0xfc0 which includes 0xaf0.
But the second case is not good. It looked up a data type in rbx (reg3)
with offset 0x7bc. It found a CU and the frame base which is good so
far. And it also found a variable "sk" but the access offset is bigger
than the type size (1980 vs. 760 or 0x7bc vs. 0x2f8). The variable has
the right location (reg3) but I need to figure out why it accesses
beyond what it's supposed to.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ Fix the build on 32-bit by casting Dwarf_Word to (long) in pr_debug_location() ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Name benchmarks with _ret at the end to avoid creating a new set of
benchmarks.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andrei Vagin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Kees Kook <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add shellcheck to generate-cmdlist.sh to avoid basic shell script
mistakes.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Oliver Upton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Switch loops within dsos.c, add a version that isn't locked. Switch
some unlocked loops to hold the read lock.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anne Macedo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move dso and dso_id functions to dso.c to match the struct declarations.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anne Macedo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
To better abstract the dsos internals, introduce dsos__for_each_dso that
does a callback on each dso.
This also means the read lock can be correctly held.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anne Macedo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move more functionality into dsos.c generally from machine.c, renaming
functions to match their new usage.
The find function is made to always "get" before returning a dso.
Reduce the scope of locks in vdso to match the locking paradigm.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anne Macedo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move functions from machine and build-id to dsos. Pass 'struct dsos'
rather than internal state.
Rename some functions to better represent which data structure they
operate on.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Anne Macedo <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ben Gainey <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Chengen Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Markus Elfring <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Sun Haiyong <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: Yicong Yang <[email protected]>
Cc: zhaimingbing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In some data file, I see the following messages repeated. It seems it
doesn't have DSOs in the system and the dso->binary_type is set to
DSO_BINARY_TYPE__NOT_FOUND. Let's skip them to avoid the followings.
No output from objdump --start-address=0x0000000000000000 --stop-address=0x00000000000000d4 -d --no-show-raw-insn -C "$1"
Error running objdump --start-address=0x0000000000000000 --stop-address=0x0000000000000631 -d --no-show-raw-insn -C "$1"
...
Closes: https://lore.kernel.org/linux-perf-users/[email protected]/
Reported-by: Konstantin Kharlamov <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Konstantin Kharlamov <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The data type profiling alone doesn't need the sample histogram for
functions. It only needs the histogram for the types.
Let's remove the condition in the report_callback to check if data type
profiling is selected and make sure the annotation has the 'struct
annotated_source' instantiated before calling symbol__disassemble().
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Support data type profiling output on TUI.
Testing from Arnaldo:
First make sure that the debug information for your workload binaries
in embedded in them by building it with '-g' or install the debuginfo
packages, since our workload is 'find':
root@number:~# type find
find is hashed (/usr/bin/find)
root@number:~# rpm -qf /usr/bin/find
findutils-4.9.0-5.fc39.x86_64
root@number:~# dnf debuginfo-install findutils
<SNIP>
root@number:~#
Then collect some data:
root@number:~# echo 1 > /proc/sys/vm/drop_caches
root@number:~# perf mem record find / > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.331 MB perf.data (3982 samples) ]
root@number:~#
Finally do data-type annotation with the following command, that will
default, as 'perf report' to the --tui mode, with lines colored to
highlight the hotspots, etc.
root@number:~# perf annotate --data-type
Annotate type: 'struct predicate' (58 samples)
Percent Offset Size Field
100.00 0 312 struct predicate {
0.00 0 8 PRED_FUNC pred_func;
0.00 8 8 char* p_name;
0.00 16 4 enum predicate_type p_type;
0.00 20 4 enum predicate_precedence p_prec;
0.00 24 1 _Bool side_effects;
0.00 25 1 _Bool no_default_print;
0.00 26 1 _Bool need_stat;
0.00 27 1 _Bool need_type;
0.00 28 1 _Bool need_inum;
0.00 32 4 enum EvaluationCost p_cost;
0.00 36 4 float est_success_rate;
0.00 40 1 _Bool literal_control_chars;
0.00 41 1 _Bool artificial;
0.00 48 8 char* arg_text;
<SNIP>
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
And move the related code into util/annotate-data.c file.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's a pseudo data type and has no field.
Fixes: b3c95109c131fcc9 ("perf annotate-data: Add stack canary type")
Closes: https://lore.kernel.org/lkml/Zhb6jJneP36Z-or0@x1
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In a debug build there is validation that mmap lists are sorted when
taking a lock. In machine__update_kernel_mmap() the start and end
addresses are updated resulting in an unsorted list before the map is
removed from the list. When the map is removed, the lock is taken which
triggers the validation and the failure:
$ perf test "object code reading"
--- start ---
perf: util/maps.c:88: check_invariants: Assertion `map__start(prev) <= map__start(map)' failed.
Aborted
Fix it by updating the addresses after removal, but before insertion.
The bug depends on the ordering and type of debug info on the system and
doesn't reproduce everywhere.
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: James Clark <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Spoorthy S <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rather than place metrics without a metric group in "No_group" place
them in a a metric group that is their name. Still allow such metrics
to be selected if "No_group" is passed, this change just impacts perf
list.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now symbol__annotate() is reentrant and it doesn't need to remove
non-instruction lines. Let's get rid of symbol__ensure_annotate() and
call symbol__annotate() directly. Also we can use it to get the arch
pointer instead of calling evsel__get_arch() directly.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
For data type profiling, it removed non-instruction lines from the list
of annotation lines. It was to simplify the implementation dealing with
instructions like to calculate the PC-relative address and to search the
shortest path to the target instruction or basic block.
But it means that it removes all the comments and debug information in
the annotate output like source file name and line numbers. To support
both code annotation and data type annotation, it'd be better to keep
the non-instruction lines as well.
So this change is to skip those lines during the data type profiling
and to display them in the normal perf annotate output.
No function changes intended (other than having more lines).
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The recent change in the global variable handling added a bug to miss
setting the return value even if it found a data type. Also add the
type name in the debug message.
Fixes: 1ebb5e17ef21b492 ("perf annotate-data: Add get_global_var_type()")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
I got a report for a failure in BPF verifier on a recent kernel with
perf lock contention command. It checks task->sighand->siglock without
checking if sighand is NULL or not. Let's add one.
; if (&curr->sighand->siglock == (void *)lock)
265: (79) r1 = *(u64 *)(r0 +2624) ; frame1: R0_w=trusted_ptr_task_struct(off=0,imm=0)
; R1_w=rcu_ptr_or_null_sighand_struct(off=0,imm=0)
266: (b7) r2 = 0 ; frame1: R2_w=0
267: (0f) r1 += r2
R1 pointer arithmetic on rcu_ptr_or_null_ prohibited, null-check it first
processed 164 insns (limit 1000000) max_states_per_insn 1 total_states 15 peak_states 15 mark_read 5
-- END PROG LOAD LOG --
libbpf: prog 'contention_end': failed to load: -13
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -13
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
lock contention did not detect any lock contention
Fixes: 1811e82767dcc ("perf lock contention: Track and show siglock with address")
Reviewed-by: Ian Rogers <[email protected]>
Acked-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Song Liu <[email protected]>
Cc: [email protected]
Signed-off-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
The symbol__annotate2() initializes some data structures needed by TUI.
It has a logic to prevent calling it multiple times by checking if it
has the annotated source. But data type profiling uses a different
code (symbol__annotate) to allocate the annotated lines in advance.
So TUI missed to call symbol__annotate2() when it shows the annotation
browser.
Make symbol__annotate() reentrant and handle that situation properly.
This fixes a crash in the annotation browser started by perf report in
TUI like below.
$ perf report -s type,sym --tui
# and press 'a' key and then move down
Fixes: 81e57deec325 ("perf report: Support data type profiling")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
It's only used in 'perf annotate' output which means functions with actual
samples. No need to consume memory for every symbol ('struct annotation').
Signed-off-by: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Arnaldo Carvalho de Melo <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: LKML <[email protected]>
Cc: <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's only used in 'perf annotate' output which means functions with actual
samples. No need to consume memory for every symbol ('struct annotation').
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's only used in 'perf annotate' output which means functions with actual
samples. No need to consume memory for every symbol ('struct annotation').
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's only used in 'perf annotate' output which means functions with
actual samples. No need to consume memory for every symbol
('struct annotation').
Also move the 'max_line_len' field into it as it's related.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The struct annotated_source.offsets[] is to save pointers to
annotation_line at each offset. We can use annotated_source__get_line()
helper instead so let's get rid of the array.
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|