aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/arch/arm
AgeCommit message (Collapse)AuthorFilesLines
2024-08-29perf: cs-etm: Only save valid trace IDs into filesJames Clark1-1/+2
This isn't a bug because Perf always masks with CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it looking like it could be, make an effort to not save bad values. Reviewed-by: Mike Leach <[email protected]> Signed-off-by: James Clark <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-08-29perf: cs-etm: Create decoders based on the trace ID mappingsJames Clark1-5/+3
Now that each queue has a unique set of trace ID mappings, use this list to create the decoders. In unformatted mode just add a single mapping so only one decoder is made. Previously each queue would have a decoder created for each traced CPU on the system but this won't work anymore because CPUs can have overlapping trace IDs. This also means that the CORESIGHT_TRACE_ID_UNUSED_FLAG isn't needed any more. If mappings aren't added then decoders aren't created, rather than needing a flag to suppress creation. Reviewed-by: Mike Leach <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Torgue <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Maxime Coquelin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-08-28perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_recordLeo Yan1-1/+0
The 'pmu' pointer in the auxtrace_record structure is not used after support multiple AUX events, remove it. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-07-31perf tools: Enable evsel__is_aux_event() to work for ARM/ARM64Adrian Hunter1-0/+3
Set pmu->auxtrace on ARM/ARM64 AUX area PMUs. evsel__is_aux_event() needs the setting to identify AUX area tracing selected events. Currently, the features that use evsel__is_aux_event() are used only by Intel PT, but that may change in the future. Reviewed-by: Andi Kleen <[email protected]> Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Hendrik Brueckner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yicong Yang <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-07-12perf arm-spe: Support multiple Arm SPE PMUsLeo Yan1-1/+1
A platform can have more than one Arm SPE PMU. For example, a system with multiple clusters may have each cluster enabled with its own Arm SPE instance. In such case, the PMU devices will be named 'arm_spe_0', 'arm_spe_1', and so on. Currently, the tool only supports 'arm_spe_0'. This commit extends support to multiple Arm SPE PMUs by detecting the substring 'arm_spe_'. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kajol Jain <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-06-26perf util: Make util its own libraryIan Rogers2-6/+6
Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Kees Cook <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Gary Guo <[email protected]> Cc: Alex Gaynor <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Cc: Ze Gao <[email protected]> Cc: Alice Ryhl <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Yicong Yang <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Guo Ren <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: Oliver Upton <[email protected]> Cc: John Garry <[email protected]> Cc: Benno Lossin <[email protected]> Cc: Björn Roy Baron <[email protected]> Cc: Andreas Hindborg <[email protected]> Cc: Paul Walmsley <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-26perf test: Make tests its own libraryIan Rogers2-5/+5
Make the tests code its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Kees Cook <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Nick Terrell <[email protected]> Cc: Gary Guo <[email protected]> Cc: Alex Gaynor <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Wedson Almeida Filho <[email protected]> Cc: Ze Gao <[email protected]> Cc: Alice Ryhl <[email protected]> Cc: Andrei Vagin <[email protected]> Cc: Yicong Yang <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Guo Ren <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: Oliver Upton <[email protected]> Cc: John Garry <[email protected]> Cc: Benno Lossin <[email protected]> Cc: Björn Roy Baron <[email protected]> Cc: Andreas Hindborg <[email protected]> Cc: Paul Walmsley <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-06-21perf arm: Workaround ARM PMUs cpu maps having offline cpusIan Rogers1-2/+8
When PMUs have a cpu map in the 'cpus' or 'cpumask' file, perf will try to open events on those CPUs. ARM doesn't remove offline CPUs meaning taking a CPU offline will cause perf commands to fail unless a CPU map is passed on the command line. More context in: https://lore.kernel.org/lkml/[email protected]/ Reported-by: Yicong Yang <[email protected]> Closes: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Ian Rogers <[email protected]> Tested-by: Yicong Yang <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Leo Yan <[email protected]> Cc: [email protected] Cc: [email protected] Cc: John Garry <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-02perf cs-etm: Improve version detection and error reportingJames Clark1-18/+43
When the config validation functions are warning about ETMv3, they do it based on "not ETMv4". If the drivers aren't all loaded or the hardware doesn't support Coresight it will appear as "not ETMv4" and then Perf will print the error message "... not supported in ETMv3 ..." which is wrong and confusing. cs_etm_is_etmv4() is also misnamed because it also returns true for ETE because ETE has a superset of the ETMv4 metadata files. Although this was always done in the correct order so it wasn't a bug. Improve all this by making a single get version function which also handles not present as a separate case. Change the ETMv3 error message to only print when ETMv3 is detected, and add a new error message for the not present case. Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-02perf cs-etm: Remove repeated fetches of the ETM PMUJames Clark1-33/+27
Most functions already have cs_etm_pmu, so it's a bit neater to pass it through rather than itr only to convert it again. Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-02perf cs-etm: Use struct perf_cpu as much as possibleJames Clark1-116/+88
The perf_cpu struct makes some iterators simpler and avoids some mistakes with interchanging CPU IDs with indexes etc. At the moment in this file the conversion to an integer is done somewhere in the middle of the call tree. Change it to delay the conversion to an int until the leaf functions. Some of the usage patterns are duplicated, so instead of changing them all, make cs_etm_get_ro() more reusable and use that everywhere. cs_etm_get_ro() didn't return an error before, but return one now so that it can also be used where an error is needed. Continue to ignore the error where it was already ignored. Use cs_etm_pmu_path_exists() instead of cs_etm_get_ro() in cs_etm_is_etmv4() because cs_etm_get_ro() prints a warning, but path exists is sufficient for this use case. Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-03-21perf arm-spe/cs-etm: Directly iterate CPU mapsIan Rogers1-65/+49
Rather than iterate all CPUs and see if they are in CPU maps, directly iterate the CPU map. Similarly make use of the intersect function taking care for when "any" CPU is specified. Switch perf_cpu_map__has_any_cpu_or_is_empty() to more appropriate alternatives. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Andrew Jones <[email protected]> Cc: André Almeida <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Atish Patra <[email protected]> Cc: Changbin Du <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yang Li <[email protected]> Cc: Yanteng Si <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-02-15perf parse-regs: Introduce a weak function arch__sample_reg_masks()Leo Yan1-1/+6
Every architecture can provide a register list for sampling. If an architecture doesn't support register sampling, it won't define the data structure 'sample_reg_masks'. Consequently, any code using this structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'. This patch defines a weak function, arch__sample_reg_masks(), which will be replaced by an architecture-defined function for returning the architecture's register list. With this refactoring, the function always exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not needed anymore, so remove it. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: James Clark <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Albert Ou <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Guo Ren <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kan Liang <[email protected]> Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-01-24perf mem: Add mem_events into the supported perf_pmuKan Liang1-0/+3
With the mem_events, perf doesn't need to read sysfs for each PMU to find the mem-events-supported PMU. The patch also makes it possible to clean up the related __weak functions later. The patch is only to add the mem_events into the perf_pmu for all ARCHs. It will be used in the later cleanup patches. Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Leo Yan <[email protected]> Tested-by: Kajol Jain <[email protected]> Suggested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-12-12libperf cpumap: Replace usage of perf_cpu_map__new(NULL) with ↵Ian Rogers1-3/+3
perf_cpu_map__new_online_cpus() Passing NULL to perf_cpu_map__new() performs perf_cpu_map__new_online_cpus(), just directly call perf_cpu_map__new_online_cpus() to be more intention revealing. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Andrew Jones <[email protected]> Cc: André Almeida <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Atish Patra <[email protected]> Cc: Changbin Du <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yang Li <[email protected]> Cc: Yanteng Si <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-12-12libperf cpumap: Rename perf_cpu_map__empty() to ↵Ian Rogers1-5/+5
perf_cpu_map__has_any_cpu_or_is_empty() The name perf_cpu_map_empty is misleading as true is also returned when the map contains an "any" CPU (aka dummy) map. Rename to perf_cpu_map__has_any_cpu_or_is_empty(), later changes will (re)introduce perf_cpu_map__empty() and perf_cpu_map__has_any_cpu(). Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexandre Ghiti <[email protected]> Cc: Andrew Jones <[email protected]> Cc: André Almeida <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Atish Patra <[email protected]> Cc: Changbin Du <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Yang Li <[email protected]> Cc: Yanteng Si <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-10-17perf cs-etm: Respect timestamp optionLeo Yan1-0/+9
When users pass the option '--timestamp' or '-T' in the record command, all events will set the PERF_SAMPLE_TIME bit in the attribution. In this case, the AUX event will record the kernel timestamp, but it doesn't mean Arm CoreSight enables timestamp packets in its hardware tracing. If the option '--timestamp' or '-T' is set, this patch always enables Arm CoreSight timestamp, as a result, the bit 28 in event's config is to be set. Before: # perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 69 }, type = 12, size = 136, config = 0, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ... After: # perf record -e cs_etm// --per-thread --timestamp -- ls # perf script --header-only ... # event : name = cs_etm//, , id = { 49 }, type = 12, size = 136, config = 0x10000000, { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|IDENTIFIER, read_format = ID|LOST, disabled = 1, enable_on_exec = 1, sample_id_all = 1, exclude_guest = 1 ... Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Acked-by: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-10-17perf cs-etm: Validate timestamp tracing in per-thread modeLeo Yan1-2/+11
So far, it's impossible to validate timestamp trace in Arm CoreSight when the perf is in the per-thread mode. E.g. for the command: perf record -e cs_etm/timestamp/ --per-thread -- ls The command enables config 'timestamp' for 'cs_etm' event in the per-thread mode. In this case, the function cs_etm_validate_config() directly bails out and skips validation. Given profiled process can be scheduled on any CPUs in the per-thread mode, this patch validates timestamp tracing for all CPUs when detect the CPU map is empty. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Acked-by: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-10-17perf pmu: Lazily compute default configIan Rogers2-12/+5
The default config is computed during creation of the PMU and may do things like scanning sysfs, when the PMU may just be used as part of scanning. Change default_config to perf_event_attr_init_default, a callback that is used when a default config needs initializing. This avoids holding onto the memory for a perf_event_attr and copying. On a tigerlake laptop running the pmu-scan benchmark: Before: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 28.780 usec (+- 0.503 usec) Average PMU scanning took: 283.480 usec (+- 18.471 usec) Number of openat syscalls: 30,227 After: Running 'internals/pmu-scan' benchmark: Computing performance of sysfs PMU event scan for 100 times Average core PMU scanning took: 27.880 usec (+- 0.169 usec) Average PMU scanning took: 245.260 usec (+- 15.758 usec) Number of openat syscalls: 28,914 Over 3 runs it is a nearly 12% reduction in execution time and a 4.3% of openat calls. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Adrian Hunter <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-10-17perf arm-spe: Move PMU initialization from default config codeIan Rogers1-0/+2
Avoid setting PMU values in arm_spe_pmu_default_config, move to perf_pmu__arch_init. Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Adrian Hunter <[email protected]> Tested-by: Leo Yan <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-10-17perf pmu: Rename perf_pmu__get_default_config to perf_pmu__arch_initIan Rogers1-5/+3
Assign default_config as part of the init. perf_pmu__get_default_config was doing more than just getting the default config and so this is intended to better align with the code. Signed-off-by: Ian Rogers <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-08-23perf pmu: Avoid passing format list to perf_pmu__format_bits()Ian Rogers1-6/+6
Pass the PMU so the format list can be better abstracted and later lazily loaded. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gaosheng Cui <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ Did missing conversions in tools/perf/arch/arm*/util/cs-etm.c ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16perf parse-regs: Move out arch specific header from util/perf_regs.hLeo Yan2-0/+2
util/perf_regs.h includes another perf_regs.h: #include <perf_regs.h> Here it includes architecture specific header, for example, if we build arm64 target, the header tools/perf/arch/arm64/include/perf_regs.h is included. We use this implicit way to include architecture specific header, which is not directive; furthermore, util/perf_regs.c is coupled with the architecture specific definitions. This patch moves out arch specific header from util/perf_regs.h for generalizing the 'util' folder, as a result, the source files in 'arch' folder explicitly include architecture's perf_regs.h. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16perf parse-regs: Remove PERF_REGS_{MAX|MASK} from common codeLeo Yan1-0/+10
The macros PERF_REGS_MAX and PERF_REGS_MASK are architecture specific, let's remove them from the common file util/perf_regs.c. As a side effect, the weak functions arch__intr_reg_mask() and arch__user_reg_mask() just return zeros, every arch defines its own functions in the 'arch' folder for returning right values. Note, we don't need to return intr/user register masks dynamically, this is because these two functions are invoked during recording phase but not decoding phase, they are always invoked on the native environment, thus we don't need to parse them dynamically. Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-08-16perf parse-regs: Remove unused macros PERF_REG_{IP|SP}Leo Yan1-3/+0
The macros PERF_REG_{IP|SP} have been replaced by using functions perf_arch_reg_{ip|sp}(), remove them! Signed-off-by: Leo Yan <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Albert Ou <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Eric Lin <[email protected]> Cc: Fangrui Song <[email protected]> Cc: Guo Ren <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Palmer Dabbelt <[email protected]> Cc: Paul Walmsley <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-06-12perf thread: Add accessor functions for threadIan Rogers1-1/+1
Using accessors will make it easier to add reference count checking in later patches. Committer notes: thread->nsinfo wasn't wrapped as it is used together with nsinfo__zput(), where does a trick to set the field with a refcount being dropped to NULL, and that doesn't work well with using thread__nsinfo(thread), that loses the &thread->nsinfo pointer. When refcount checking is added to 'struct thread', later in this series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to check the thread pointer. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Brian Robbins <[email protected]> Cc: Changbin Du <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Fangrui Song <[email protected]> Cc: German Gomez <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ivan Babrou <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Naveen N. Rao <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Steinar H. Gunderson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Wenyu Liu <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Ye Xingchen <[email protected]> Cc: Yuan Can <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-31Merge tag 'perf-tools-fixes-for-v6.4-2-2023-05-30' into perf-tools-nextArnaldo Carvalho de Melo1-1/+1
perf tools fixes for v6.4: 2nd batch - Fix BPF CO-RE naming convention for checking the availability of fields on 'union perf_mem_data_src' on the running kernel. - Remove the use of llvm-strip on BPF skel object files, not needed, fixes a build breakage when the llvm package, that contains it in most distros, isn't installed. - Fix tools that use both evsel->{bpf_counter_list,bpf_filters}, removing them from a union. - Remove extra "--" from the 'perf ftrace latency' --use-nsec option, previously it was working only when using the '-n' alternative. - Don't stop building when both binutils-devel and a C++ compiler isn't available to compile the alternative C++ demangle support code, disable that feature instead. - Sync the linux/in.h and coresight-pmu.h header copies with the kernel sources. - Fix relative include path to cs-etm.h. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-27perf pmu: Separate pmu and pmusIan Rogers2-5/+6
Separate and hide the pmus list in pmus.[ch]. Move pmus functionality out of pmu.[ch] into pmus.[ch] renaming pmus functions which were prefixed perf_pmu__ to perf_pmus__. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-26perf arm: Fix include path to cs-etm.hIan Rogers1-1/+1
Change "../cs-etm.h" to just "../../../util/cs-etm.h" as ../cs-etm.h doesn't exist. Suggested-by: Leo Yan <[email protected]> Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-10perf cs-etm: Fix contextid validationJames Clark1-5/+4
Pre 5.11 kernels don't support 'contextid1' and 'contextid2' so validation would be skipped. By adding an additional check for 'contextid', old kernels will still have validation done even though contextid would either be contextid1 or contextid2. Additionally now that it's possible to override options, an existing bug in the validation is revealed. 'val' is overwritten by the contextid1 validation, and re-used for contextid2 validation causing it to always fail. '!val || val != 0x4' is the same as 'val != 0x4' because 0 is also != 4, so that expression can be simplified and the temp variable not overwritten. Fixes: 35c51f83dd1ed5db ("perf cs-etm: Validate options after applying them") Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/all/[email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-24perf cs-etm: Allow user to override timestamp and contextid settingsJames Clark2-6/+25
Timestamps and context tracking are automatically enabled in per-core mode and it's impossible to override this. Use the new utility function to set them conditionally. Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Denis Nikitin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-24perf cs-etm: Validate options after applying themJames Clark1-84/+68
Currently the cs_etm_set_option() function both validates and applies the config options. Because it's only called when they are added automatically, there are some paths where the user can apply the option on the command line and skip the validation. By moving it to the end it covers both cases. Also, options don't need to be re-applied anyway, Perf handles parsing and applying the config terms automatically. Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Denis Nikitin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-24perf cs-etm: Don't test full_auxtrace because it's always setJames Clark1-31/+25
There is no path in cs-etm where this isn't true so it doesn't need to be tested. Also re-order the beginning of cs_etm_recording_options() so that nothing is done until the early exit is passed. Signed-off-by: James Clark <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Denis Nikitin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yang Shi <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf map: Add accessor for start and endIan Rogers1-1/+1
Later changes will add reference count checking for struct map, start and end are frequently accessed variables. Add an accessor so that the reference count check is only necessary in one place. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: Hao Luo <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-04perf cs-etm: Update record event to use new Trace ID protocolMike Leach1-10/+17
Trace IDs are now dynamically allocated. Previously used the static association algorithm that is no longer used. The 'cpu * 2 + seed' was outdated and broken for systems with high core counts (>46). as it did not scale and was broken for larger core counts. Trace ID will now be sent in PERF_RECORD_AUX_OUTPUT_HW_ID record. Legacy ID algorithm renamed and retained for limited backward compatibility use. Reviewed-by: James Clark <[email protected]> Signed-off-by: Mike Leach <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Darren Hart <[email protected]> Cc: Ganapatrao Kulkarni <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-13perf cs-etm: Avoid printing warning in cs_etm_is_ete() checkJames Clark1-1/+5
When checking for the presence of ETE, a register is read that may not be present on older kernels or if ETE isn't available. cs_etm_get_ro() will print a warning if it doesn't exist, so check for the existence first before accessing it. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-03-13perf cs-etm: Reduce verbosity of ts_source warningJames Clark1-4/+4
This is printed as a warning but it is normal behavior that users shouldn't be expected to do anything about. Reduce the warning level to debug3 so it's only seen in verbose mode to avoid confusion. Reviewed-by: Leo Yan <[email protected] Signed-off-by: James Clark <[email protected]> Cc: Al Grant <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-01-27perf cs-etm: Improve missing sink warning messageJames Clark1-3/+9
Make the sink error message more similar to the event error message that reminds about missing kernel support. The available sinks are also determined by the hardware so mention that too. Also, usually it's not necessary to specify the sink, so add that as a hint. Now the error for a made up sink looks like this: $ perf record -e cs_etm/@abc/ Couldn't find sink "abc" on event cs_etm/@abc/. Missing kernel or device support? Hint: An appropriate sink will be picked automatically if one isn't is specified. For any error other than ENOENT, the same message as before is displayed. Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-01-22perf cs_etm: Record ts_source in AUXTRACE_INFO for ETMv4 and ETEGerman Gomez1-0/+48
Read the value of ts_source exposed by the driver and store it in the ETMv4 and ETE header. If the interface doesn't exist (such as in older Kernels), defaults to a safe value of -1. Signed-off-by: German Gomez <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Tested-by: Tanmay Jagdale <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bharat Bhushan <[email protected]> Cc: George Cherian <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Linu Cherian <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sunil Kovvuri Goutham <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-01-22perf cs_etm: Keep separate symbols for ETMv4 and ETE parametersGerman Gomez1-7/+36
Previously, adding a new parameter at the end of ETMv4 meant adding it somewhere in the middle of ETE, which is not supported by the current header version. Reviewed-by: Mike Leach <[email protected]> Signed-off-by: German Gomez <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Tested-by: Tanmay Jagdale <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bharat Bhushan <[email protected]> Cc: George Cherian <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Linu Cherian <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sunil Kovvuri Goutham <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: James Clark <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-01-22perf pmu: Remove duplication around EVENT_SOURCE_DEVICE_PATHJames Clark1-3/+2
The pattern for accessing EVENT_SOURCE_DEVICE_PATH is duplicated in a few places, so add two utility functions to cover it. Also just use perf_pmu__scan_file() instead of pmu_type() which already does the same thing. No functional changes. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Tested-by: Tanmay Jagdale <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bharat Bhushan <[email protected]> Cc: George Cherian <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Linu Cherian <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sunil Kovvuri Goutham <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-31perf tools: Move 'struct perf_sample' to a separate header file to ↵Arnaldo Carvalho de Melo1-1/+1
disentangle headers Some places were including event.h just to get 'struct perf_sample', move it to a separate place so that we speed up a bit the build. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-15perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driverQi Liu2-0/+66
HiSilicon PCIe tune and trace device (PTT) could dynamically tune the PCIe link's events, and trace the TLP headers). This patch add support for PTT device in perf tool, so users could use 'perf record' to get TLP headers trace data. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Qi Liu <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Acked-by: John Garry <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Liu <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zeng Prime <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-15perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()Qi Liu1-19/+34
Add find_pmu_for_event() and use to simplify logic in auxtrace_record_init(). find_pmu_for_event() will be reused in subsequent patches. Reviewed-by: John Garry <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Reviewed-by: Leo Yan <[email protected]> Signed-off-by: Qi Liu <[email protected]> Signed-off-by: Yicong Yang <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Qi Liu <[email protected]> Cc: Shameerali Kolothum Thodi <[email protected]> Cc: Shaokun Zhang <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Zeng Prime <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-10perf tools: Do not pass NULL to parse_events()Adrian Hunter1-1/+1
Many cases do not use the extra error information provided by parse_events and instead pass NULL as the struct parse_events_error pointer. Add a wrapper for those cases so that the pointer is never NULL. Signed-off-by: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-05-10perf auxtrace: Record whether an auxtrace mmap is neededAdrian Hunter1-0/+1
Add a flag needs_auxtrace_mmap to record whether an auxtrace mmap is needed, in preparation for correctly determining whether or not an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-01perf evlist: Rename cpus to user_requested_cpusIan Rogers1-4/+4
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps of all evsels. For non-task targets, cpus is set to be cpus requested from the command line, defaulting to all online cpus if no cpus are specified. For an uncore event, all_cpus may be just CPU 0 or every online CPU. This causes all_cpus to have fewer values than the cpus variable which is confusing given the 'all' in the name. To try to make the behavior clearer, rename cpus to user_requested_cpus and add comments on the two struct variables. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Antonov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: John Garry <[email protected]> Cc: KP Singh <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-15perf cs-etm: Update deduction of TRCCONFIGR register for branch broadcastJames Clark1-0/+3
Now that a config flag for branch broadcast has been added, take it into account when trying to deduce what the driver would have programmed the TRCCONFIGR register to. Reviewed-by: Leo Yan <[email protected]> Reviewed-by: Mike Leach <[email protected]> Reviewed-by: Suzuki Poulouse <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12perf cpumap: Give CPUs their own typeIan Rogers1-12/+23
A common problem is confusing CPU map indices with the CPU, by wrapping the CPU with a struct then this is avoided. This approach is similar to atomic_t. Committer notes: To make it build with BUILD_BPF_SKEL=1 these files needed the conversions to 'struct perf_cpu' usage: tools/perf/util/bpf_counter.c tools/perf/util/bpf_counter_cgroup.c tools/perf/util/bpf_ftrace.c Also perf_env__get_cpu() was removed back in "perf cpumap: Switch cpu_map__build_map to cpu function". Additionally these needed to be fixed for the ARM builds to complete: tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/pmu.c Suggested-by: John Garry <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12perf cpumap: Move 'has' function to libperfIan Rogers1-8/+8
Make the cpu map argument const for consistency with the rest of the API. Modify cpu_map__idx accordingly. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>