Age | Commit message (Collapse) | Author | Files | Lines |
|
The macro maps__for_each_entry is error prone as it doesn't require
holding the maps lock.
Add a new function that iterates the maps holding the read lock.
Convert maps__find_symbol_by_name() and maps__fprintf() to use callbacks,
the latter being an example of where the read lock wasn't being held.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The u64 values are either absolute or relative, try to hint better in
the parameter names.
Suggested-by: Namhyung Kim <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add a local variable to avoid repeated calls to perf_cpu_map__nr().
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Adrian Hunter <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Make the DSO data tests a suite rather than individual so their output
is grouped.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
PMU name
When turning an event with attributes to the format including the PMU we
need to move the "event:attributes" format to "event/attributes/" so
that we can copy the event displayed and use it in the command line,
i.e. in 'perf top' we had:
1K cpu_atom/cycles:P/
11K cpu_core/cycles:P/
If I try to use that on the command line:
# perf top -e cpu_atom/cycles:P/
event syntax error: 'cpu_atom/cycles:P/'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'cpu_atom'
Initial error:
event syntax error: 'cpu_atom/cycles:P/'
\___ unknown term 'cycles:P' for pmu
'cpu_atom'
valid terms:
event,pc,edge,offcore_rsp,ldlat,inv,umask,cmask,config,config1,config2,config3,name,period,freq,branch_type,time,call-graph,stack-size,no-inherit,inherit,max-stack,nr,no-overwrite,overwrite ,driver-config,percore,aux-output,aux-sample-size,metric-id,raw,legacy-cache,hardware
Run
'perf list' for a list of valid events
Usage: perf top [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
#
Tested-by: Kan Liang <[email protected]>
Cc: Hector Martin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It's hard to distinguish the default cycles events among hybrid PMUs.
For example,
$ perf top
Available samples
385 cycles:P
903 cycles:P
The other tool, e.g., perf record, uniforms the event name and adds the
hybrid PMU name before opening the event. So the events can be easily
distinguished. Apply the same methodology for the perf top as well.
The evlist__uniquify_name() will be invoked by both record and top.
Move it to util/evlist.c
With the patch:
$ perf top
Available samples
148 cpu_atom/cycles:P/
1K cpu_core/cycles:P/
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Hector Martin <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf top errors out on a hybrid machine
$perf top
Error:
The cycles:P event is not supported.
The perf top expects that the "cycles" is collected on all CPUs in the
system. But for hybrid there is no single "cycles" event which can cover
all CPUs. Perf has to split it into two cycles events, e.g.,
cpu_core/cycles/ and cpu_atom/cycles/. Each event has its own CPU mask.
If a event is opened on the unsupported CPU. The open fails. That's the
reason of the above error out.
Perf should only open the cycles event on the corresponding CPU. The
commit ef91871c960e ("perf evlist: Propagate user CPU maps intersecting
core PMU maps") intersect the requested CPU map with the CPU map of the
PMU. Use the evsel's cpus to replace user_requested_cpus.
The evlist's threads are also propagated to the evsel's threads in
__perf_evlist__propagate_maps(). For a system-wide event, perf appends
a dummy event and assign it to the evsel's threads. For a per-thread
event, the evlist's thread_map is assigned to the evsel's threads. The
same as the other tools, e.g., perf record, using the evsel's threads
when opening an event.
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Hector Martin <[email protected]>
Cc: Marc Zyngier <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Closes: https://lore.kernel.org/linux-perf-users/[email protected]/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The base address of a DSO mapping should start at the start of the file.
Usually DSOs are mapped from the pgoff 0 so it doesn't matter when it
uses the start of the map address.
But generated DSOs for JIT codes doesn't start from the 0 so it should
subtract the offset to calculate the .eh_frame table offsets correctly.
Fixes: dc2cf4ca866f5715 ("perf unwind: Fix segbase for ld.lld linked objects")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Pablo Galindo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Usually DSOs are mapped from the beginning of the file, so the base
address of the DSO can be calculated by map->start - map->pgoff.
However, JIT DSOs which are generated by `perf inject -j`, are mapped
only the code segment. This makes unwind-libdw code confusing and
rejects processing unwinds in the JIT DSOs. It should use the map
start address as base for them to fix the confusion.
Fixes: 1fe627da30331024 ("perf unwind: Take pgoff into account when reporting elf to libdwfl")
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Pablo Galindo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The text section starts after the ELF headers so PHDR.p_vaddr and
others should have the correct addresses.
Fixes: babd04386b1df8c3 ("perf jit: Include program header in ELF files")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Fangrui Song <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Lieven Hey <[email protected]>
Cc: Milian Wolff <[email protected]>
Cc: Pablo Galindo <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The -A or --no-aggr option disables aggregation of core events:
$ perf stat -A -e cycles,data_total -a true
Performance counter stats for 'system wide':
CPU0 1,287,665 cycles
CPU1 1,831,681 cycles
CPU2 27,345,998 cycles
CPU3 1,964,799 cycles
CPU4 236,174 cycles
CPU5 3,302,825 cycles
CPU6 9,201,446 cycles
CPU7 1,403,043 cycles
CPU0 110.90 MiB data_total
0.008961761 seconds time elapsed
The --no-merge option disables the aggregation of uncore events:
$ perf stat --no-merge -e cycles,data_total -a true
Performance counter stats for 'system wide':
38,482,778 cycles
15.04 MiB data_total [uncore_imc_free_running_1]
15.00 MiB data_total [uncore_imc_free_running_0]
0.005915155 seconds time elapsed
Having two options confuses users who generally don't appreciate the
difference in PMUs. Keep all the options but make it so they all
disable aggregation both of core and uncore events:
$ perf stat -A -e cycles,data_total -a true
Performance counter stats for 'system wide':
CPU0 85,878 cycles
CPU1 88,179 cycles
CPU2 60,872 cycles
CPU3 3,265,567 cycles
CPU4 82,357 cycles
CPU5 83,383 cycles
CPU6 84,156 cycles
CPU7 220,803 cycles
CPU0 2.38 MiB data_total [uncore_imc_free_running_0]
CPU0 2.38 MiB data_total [uncore_imc_free_running_1]
0.001397205 seconds time elapsed
Update the relevant 'perf stat' man page information.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kaige Ye <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
ASan complains a memory leakage in hisi_ptt_process_auxtrace_event()
that the data buffer is not freed. Since currently we only support the
raw dump trace mode, the data buffer is used only within this function.
So fix this by freeing the data buffer before going out.
Fixes: 5e91e57e68090c0e ("perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Junhao He <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Qi Liu <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When dump the raw trace by `perf report -D` ASan reports a memory
leakage in perf_event__fprintf_event_update().
It shows that we allocated a temporary cpumap for dumping the CPUs but
doesn't release it and it's not used elsewhere. Fix this by free the
cpumap after the dumping.
Fixes: c853f9394b7bc189 ("perf tools: Add perf_event__fprintf_event_update function")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Yicong Yang <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: Junhao He <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When iterating CPUs in a CPU map it is often desirable to skip the "any
CPU" (aka dummy) case. Add a helper for this and use in builtin-record.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_cpu_map__new_online_cpus()
Passing NULL to perf_cpu_map__new() performs
perf_cpu_map__new_online_cpus(), just directly call
perf_cpu_map__new_online_cpus() to be more intention revealing.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
perf_cpu_map__has_any_cpu_or_is_empty()
The name perf_cpu_map_empty is misleading as true is also returned
when the map contains an "any" CPU (aka dummy) map.
Rename to perf_cpu_map__has_any_cpu_or_is_empty(), later changes will
(re)introduce perf_cpu_map__empty() and perf_cpu_map__has_any_cpu().
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Rename perf_cpu_map__dummy_new() to perf_cpu_map__new_any_cpu() to
better indicate this is creating a CPU map for the perf_event_open "any"
CPU case.
Reviewed-by: James Clark <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Ghiti <[email protected]>
Cc: Andrew Jones <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Nick Desaulniers <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Paran Lee <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Suzuki Poulouse <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Yang Li <[email protected]>
Cc: Yanteng Si <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Copy-paste error led to help message for metric-no-threshold repeating
that of metric-no-merge.
Fixes: 1fd09e299bdd434b ("perf metric: Add --metric-no-threshold option")
Reported-by: Stephane Eranian <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
It doesn't need the option in the struct annotation which is allocated
for each symbol. It can directly use the global options and save 8
bytes per symbol.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that it can get rid of the unused data.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now it only cares about the global options so it can just handle it
without the argument.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now it can use the global options and no need save local browser
options separately.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now it can directly use the global options and no need to pass it as an
argument.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ Fixup build with GTK2=1 ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use the global option and drop the local copy.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Use the global option and drop the local copy.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The annotation options are to control the behavior of objdump and the
output. It's basically used by 'perf annotate' but 'perf report' and
'perf top' can call it on TUI dynamically.
But it doesn't need to have a copy of annotation options in many places.
As most of the work is done in the util/annotate.c file, add a global
variable and set/use it instead of having their own copies.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Metrics were added by a callback but commit a4b8cfcabb1d90ec ("perf
stat: Delay metric parsing") postponed this to allow optimizations based
on the CPU configuration.
In doing so it stopped errors in metric parsing from causing 'perf stat'
termination.
This change adds the termination for bad metric names back in.
Fixes: a4b8cfcabb1d90ec ("perf stat: Delay metric parsing")
Reported-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Arnaldo Carvalho de Melo <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]/
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Comparing pointers without RC_CHK_ACCESS means the indirect object
will be compared rather than the underlying maps when REFCNT_CHECKING
is enabled. Fix by adding missing RC_CHK_EQUAL.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Move the find and certain other symbol maps__* functions to maps.c for
better abstraction.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When mapping an IP it is either an identity mapping or a DSO relative
mapping, so a single bit is required in the struct to identify
this.
The current code uses function pointers, adding 2 pointers per map and
also pushing the size of a map beyond 1 cache line.
Switch to using a byte to identify the mapping type (as well as priv and
erange_warned), to avoid any masking.
Change struct maps's layout to avoid holes.
Before:
```
struct map {
u64 start; /* 0 8 */
u64 end; /* 8 8 */
_Bool erange_warned:1; /* 16: 0 1 */
_Bool priv:1; /* 16: 1 1 */
/* XXX 6 bits hole, try to pack */
/* XXX 3 bytes hole, try to pack */
u32 prot; /* 20 4 */
u64 pgoff; /* 24 8 */
u64 reloc; /* 32 8 */
u64 (*map_ip)(const struct map *, u64); /* 40 8 */
u64 (*unmap_ip)(const struct map *, u64); /* 48 8 */
struct dso * dso; /* 56 8 */
/* --- cacheline 1 boundary (64 bytes) --- */
refcount_t refcnt; /* 64 4 */
u32 flags; /* 68 4 */
/* size: 72, cachelines: 2, members: 12 */
/* sum members: 68, holes: 1, sum holes: 3 */
/* sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits */
/* last cacheline: 8 bytes */
};
```
After:
```
struct map {
u64 start; /* 0 8 */
u64 end; /* 8 8 */
u64 pgoff; /* 16 8 */
u64 reloc; /* 24 8 */
struct dso * dso; /* 32 8 */
refcount_t refcnt; /* 40 4 */
u32 prot; /* 44 4 */
u32 flags; /* 48 4 */
enum mapping_type mapping_type:8; /* 52: 0 4 */
/* Bitfield combined with next fields */
_Bool erange_warned; /* 53 1 */
_Bool priv; /* 54 1 */
/* size: 56, cachelines: 1, members: 11 */
/* padding: 1 */
/* last cacheline: 56 bytes */
};
```
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
binary
The diff test depends on finding the symbol test_loop in perf and will
fail if perf has been stripped and no debug object is available. In that
case, skip the test instead.
Suggested-by: Adrian Hunter <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
In the ELF file, multiple NOTE segments may exist.
To locate the build id, the process shall persist
in parsing NOTE segments until the build id is found.
Signed-off-by: Chengen Du <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Wait until a lost sample occurs to allocate the lost samples buffer,
often the buffer isn't necessary. This saves a 64kb allocation and
5.3kb of peak memory consumption.
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Changbin Du <[email protected]>
Cc: Colin Ian King <[email protected]>
Cc: Dmitrii Dolgov <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Guilherme Amadio <[email protected]>
Cc: Huacai Chen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: K Prateek Nayak <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Li Dong <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu (Google) <[email protected]>
Cc: Miguel Ojeda <[email protected]>
Cc: Ming Wang <[email protected]>
Cc: Nick Terrell <[email protected]>
Cc: Paolo Bonzini <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Sean Christopherson <[email protected]>
Cc: Steinar H. Gunderson <[email protected]>
Cc: Vincent Whitchurch <[email protected]>
Cc: Wenyu Liu <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When the "cycles" event isn't available evsel will fallback to the
"cpu-clock" software event.
"task-clock" is similar to "cpu-clock" but only runs when the process is
running.
Falling back to "cpu-clock" when not system wide leads to confusion, by
falling back to "task-clock" it is hoped the confusion is less.
Pass the target to determine if "task-clock" is more appropriate.
Update a nearby comment and debug string for the change.
Signed-off-by: Ian Rogers <[email protected]>
Tested-by: Athira Rajeev <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ajay Kaher <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Makhalov <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Cc: Sandipan Das <[email protected]>
Cc: Yang Jihong <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Json output didn't set the skip_duplicate_pmus callback yielding a
segfault.
Fixes: cd4e1efbbc40 ("perf pmus: Skip duplicate PMUs and don't print list suffix by default")
Signed-off-by: Ian Rogers <[email protected]>
Cc: James Clark <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Athira Rajeev <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
[namhyung: updated subject line according to Arnaldo]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
"synchronous"
There is a spelling mistake in an option description. Fix it.
Signed-off-by: Colin Ian King <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
There are some old bug reports on perf diff crashing:
https://rhaas.blogspot.com/2012/06/perf-good-bad-ugly.html
Happening across them I was prompted to add two very basic tests that
will give some 'perf diff' coverage.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The below error can be triggered on a hybrid machine.
$ perf mem record -t load sleep 1
event syntax error: 'breakpoint/mem-loads,ldlat=30/P'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'breakpoint'
In the perf_mem_events__record_args(), the current perf never checks the
availability of a mem event on a given PMU. All the PMUs will be added
to the perf mem event list. Perf errors out for the unsupported PMU.
Extend perf_mem_event__supported() and take a PMU into account. Check
the mem event for each PMU before adding it to the perf mem event list.
Optimize the perf_mem_events__init() a little bit. The function is to
check whether the mem events are supported in the system. It doesn't
need to scan all PMUs. Just return with the first supported PMU is good
enough.
Fixes: 5752c20f3787c9bc ("perf mem: Scan all PMUs instead of just core ones")
Reported-by: Ammy Yi <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Tested-by: Ammy Yi <[email protected]>
Acked-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ravi Bangoria <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Running "perf list" on powerpc fails with segfault as below:
$ ./perf list
Segmentation fault (core dumped)
$
This happens because of duplicate events in the JSON list. The powerpc
JSON event list contains some event with same event name, but different
event code. They are:
- PM_INST_FROM_L3MISS (Present in datasource and frontend)
- PM_MRK_DATA_FROM_L2MISS (Present in datasource and marked)
- PM_MRK_INST_FROM_L3MISS (Present in datasource and marked)
- PM_MRK_DATA_FROM_L3MISS (Present in datasource and marked)
pmu_events_table__num_events() uses the value from table_pmu->num_entries
which includes duplicate events as well. This causes issue during "perf
list" and results in a segmentation fault.
Since both event codes are valid, append _DSRC to the Data Source events
(datasource.json), so that they would have a unique name.
Also add PM_DATA_FROM_L2MISS_DSRC and PM_DATA_FROM_L3MISS_DSRC events.
With the fix, 'perf list' works as expected.
Fixes: fc143580753348c6 ("perf vendor events power10: Update JSON/events")
Signed-off-by: Athira Jajeev <[email protected]>
Tested-by: Disha Goel <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Disha Goel <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: [email protected]
Cc: Madhavan Srinivasan <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Test that JSON output produces valid JSON.
Reviewed-by: Athira Jajeev <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Athira Jajeev <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Avoid replicated logic by having a common library to set the PYTHON
environment variable.
Reviewed-by: Athira Jajeev <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Migrate Makefile.tests to Build so that variables like rule_mkdir are
defined via Makefile.build (needed so the output directory can be
created). This requires SHELLCHECK being exported and the clean rule
tweaking to remove the files in find.
Change find "-perm -o=x" as it was failing on my Debian based Linux
kernel tree, switch to using "-executable".
Adding a filename prefix of "." to the shellcheck log files is a pain
and error prone in make, remove this prefix and just add the
shellcheck log files to .gitignore.
Fix the command echo so that running the test is displayed.
Fixes: 1638b11ef8156c85 ("perf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf")
Reviewed-by: Athira Jajeev <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Add JSON files for AmpereOneX core PMU events and metrics.
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Ilkka Koskinen <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
GPC_FLUSH_MEM_FAULT
The documentation wrongly called the event as BPU_FLUSH_MEM_FAULT and now
has been fixed. Correct the name in the perf tool as well.
Fixes: a9650b7f6fc09d16 ("perf vendor events arm64: Add AmpereOne core PMU events")
Reviewed-by: Ian Rogers <[email protected]>
Signed-off-by: Ilkka Koskinen <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Ilkka Koskinen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: [email protected]
Cc: Mark Rutland <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Will Deacon <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
AmpereOne metrics were missing DefaultMetricgroupName from metrics with
"Default" in group name resulting perf to segfault. Add the missing
field to address the issue.
Fixes: 59faeaf80d02 ("perf vendor events arm64: Fix for AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: James Clark <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: John Garry <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
A metric is default by having "Default" within its groups. The default
metricgroup name needn't be set and this can result in segv in
default_metricgroup_cmp and perf_stat__print_shadow_stats_metricgroup
that assume it has a value when there is a Default metric group. To
avoid the segv initialize the value to "".
Fixes: 1c0e47956a8e ("perf metrics: Sort the Default metricgroup")
Signed-off-by: Ian Rogers <[email protected]>
Reviewed-and-tested-by: Ilkka Koskinen <[email protected]>
Cc: James Clark <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: John Garry <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
The 'vg' register for arm64 shows up in --user_regs as available when
masking the variable AT_HWCAP with 1 << 22 returns '1' as done in
perf_regs.c.
However, in subtests for support of SVE, the check for the 'vg' register
is done by masking the variable AT_HWCAP with the value 0x200000 which
is equals to 1 << 21 instead of 1 << 22.
This results in inconsistencies on certain systems where the test
expects that the 'vg' register is not operational when it is, and
vice-versa.
During the testing on a machine that the test expected not to have the
'vg' register available, 'perf record' with the option --user-regs
showed records for the 'vg' register together with all of the others,
which means that the mask for the subtest of perf_event_attr is off by
one.
Change the value of the mask from 0x200000 to 0x400000 to correct it.
Fixes: 9440ebdc333dd12e ("perf test arm64: Add attr tests for new VG register")
Reviewed-by: Leo Yan <[email protected]>
Signed-off-by: Veronika Molnarova <[email protected]>
Cc: James Clark <[email protected]>
Cc: Michael Petlan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
So that we don't have to go thru the series of strcmp(arch) calls for
each id -> string translation.
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Laight <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
That will cache the arch specific function translating error numbers to
strings.
Reviewed-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: David Laight <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Namhyung reported:
I'm seeing a build error on my Alpine linux image which uses busybox +
musl libc:
In file included from trace/beauty/arch_errno_names.c:1,
from builtin-trace.c:899:
/build/trace/beauty/generated/arch_errno_name_array.c: In function 'arch_syscalls__strerrno':
/build/trace/beauty/generated/arch_errno_name_array.c:142:49: error: unused parameter 'arch' [-Werror=unused-parameter]
142 | const char *arch_syscalls__strerrno(const char *arch, int err)
It looks like busybox find command doesn't have -printf option
find: unrecognized: -printf
, Yesterday 9:16 PM
,
BusyBox v1.36.1 (2023-07-27 17:12:24 UTC) multi-call binary.
Usage: find [-HL] [PATH]... [OPTIONS] [ACTIONS]
Search for files and perform actions on them.
First failed action stops processing of current file.
Defaults: PATH is current directory, action is '-print'
So just remove it and pipe find's entry to a basename loop to produce
the same result.
Then use an alternative loop that relies on the shell to avoid needless
forks and execs.
The discussion about it generated the impetus to stop doing strcmps to
find the right table at each errno to string translation but instead do
this just once and then use a function pointer to the right arch
specific table.
Suggested-by: David Laight <[email protected]>
Reported-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Hendrik Brueckner <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Michael Petlan <[email protected]>
Cc: Thomas Richter <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|