aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2021-09-08tools: rename bitmap_alloc() to bitmap_zalloc()Andy Shevchenko4-7/+7
Rename bitmap_alloc() to bitmap_zalloc() in tools to follow the bitmap API in the kernel. No functional changes intended. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Yury Norov <[email protected]> Suggested-by: Yury Norov <[email protected]> Acked-by: Yury Norov <[email protected]> Tested-by: Wolfram Sang <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Lobakin <[email protected]> Cc: Alexey Klimov <[email protected]> Cc: Dennis Zhou <[email protected]> Cc: Ulf Hansson <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-05Merge tag 'perf-tools-for-v5.15-2021-09-04' of ↵Linus Torvalds40-615/+994
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tool updates from Arnaldo Carvalho de Melo: "New features: - Improvements for the flamegraph python script, including: - Display perf.data header - Display PIDs of user stacks - Added option to change color scheme - Default to blue/green color scheme to improve accessibility - Correctly identify kernel stacks when debuginfo is available - Improvements for 'perf bench futex': - Add --mlockall parameter - Add --broadcast and --pi to the 'requeue' sub benchmark - Add support for PMU aliases. - Introduce an ARM Coresight ETE decoder. - Add a 'perf bench' entry for evlist open/close operations, to help quantify improvements with multithreading 'perf record'. - Allow reporting the [un]throttle PERF_RECORD_ meta event in 'perf script's python scripting. - Add a 'perf test' entry for PMU aliases. - Add a 'perf test' entry for 'perf record/perf report/perf script' pipe mode. Fixes: - perf script dlfilter (API for filtering via dynamically loaded shared object introduced in v5.14) fixes and a 'perf test' entry for it. - Fix get_current_dir_name() compilation on Android. - Fix issues with asciidoc and double dashes uses. - Fix memory leaks in the BTF handling code. - Fix leftover problems in the Documentation from the infrastructure originally lifted from the git codebase. - Fix *probe_vfs_getname.sh 'perf test' failures. - Handle fd gaps in 'perf test's test__dso_data_reopen(). - Make sure to show disasembly warnings for 'perf annotate --stdio'. - Fix output from pipe to file and vice-versa in 'perf record/report/script'. - Correct 'perf data -h' output. - Fix wrong comm in system-wide mode with 'perf record --delay'. - Do not allow --for-each-cgroup without cpu in 'perf stat' - Make 'perf test --skip' work on shell tests. - Fix libperf's verbose printing. Misc improvements: - Preparatory patches for multithreading various 'perf record' phases (synthesizing, opening, recording, etc). - Add sparse context/locking annotations in compiler-types.h, also to help with the multithreading effort. - Optimize the generation of the arch specific erno tables used in 'perf trace'. - Optimize libperf's perf_cpu_map__max(). - Improve ARM's CoreSight warnings. - Report collisions in AUX records. - Improve warnings for the LLVM 'perf test' entry. - Improve the PMU events 'perf test' codebase. - perf test: Do not compare overheads in the zstd comp test - Better support annotation on ARM. - Update 'perf trace's cmd string table to decode sys_bpf() first arg. Vendor events: - Add JSON events and metrics for Intel's Ice Lake, Tiger Lake and Elhart Lake. - Update JSON eventsand metrics for Intel's Cascade Lake and Sky Lake servers. Hardware tracing: - Improvements for the ARM hardware tracing auxtrace support" * tag 'perf-tools-for-v5.15-2021-09-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (130 commits) perf tests: Add test for PMU aliases perf pmu: Add PMU alias support perf session: Report collisions in AUX records perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta event perf build: Report failure for testing feature libopencsd perf cs-etm: Show a warning for an unknown magic number perf cs-etm: Print the decoder name perf cs-etm: Create ETE decoder perf cs-etm: Update OpenCSD decoder for ETE perf cs-etm: Fix typo perf cs-etm: Save TRCDEVARCH register perf cs-etm: Refactor out ETMv4 header saving perf cs-etm: Initialise architecture based on TRCIDR1 perf cs-etm: Refactor initialisation of decoder params. tools build: Fix feature detect clean for out of source builds perf evlist: Add evlist__for_each_entry_from() macro perf evsel: Handle precise_ip fallback in evsel__open_cpu() perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu() perf evsel: Move test_attr__open() to success path in evsel__open_cpu() perf evsel: Move ignore_missing_thread() to fallback code ...
2021-09-03perf pmu: Add PMU alias supportKan Liang3-4/+44
A perf uncore PMU may have two PMU names, a real name and an alias. The alias is exported at /sys/bus/event_source/devices/uncore_*/alias. The perf tool should support the alias as well. Add alias_name in the struct perf_pmu to store the alias. For the PMU which doesn't have an alias. It's NULL. Introduce two X86 specific functions to retrieve the real name and the alias separately. Only go through the sysfs to retrieve the mapping between the real name and the alias once. The result is cached in a list, uncore_pmu_list. Nothing changed for the other ARCHs. With the patch, the perf tool can monitor the PMU with either the real name or the alias. Use the real name, $ perf stat -e uncore_cha_2/event=1/ -x, 4044879584,,uncore_cha_2/event=1/,2528059205,100.00,, Use the alias, $ perf stat -e uncore_type_0_2/event=1/ -x, 3659675336,,uncore_type_0_2/event=1/,2287306455,100.00,, Committer notes: Rename 'struct perf_pmu_alias_name' to 'pmu_alias', the 'perf_' prefix should be used for libperf, things inside just tools/perf/ are being moved away from that prefix. Also 'pmu_alias' is shorter and reflects the abstraction. Also don't use 'pmu' as the name for variables for that type, we should use that for the 'struct perf_pmu' variables, avoiding confusion. Use 'pmu_alias' for 'struct pmu_alias' variables. Co-developed-by: Jin Yao <[email protected]> Co-developed-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Kan Liang <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Jin Yao <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf session: Report collisions in AUX recordsSuzuki K Poulose2-0/+10
Just like the other flags in the AUX records, report a summary of the Collisions if there were any. Signed-off-by: Suzuki Poulouse <[email protected]> Reviewed-by: Leo Yan <[email protected]> Reviewed-by: Mathieu Poirier <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Cc: [email protected] LPU-Reference: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf script python: Allow reporting the [un]throttle PERF_RECORD_ meta eventStephen Brennan2-0/+35
perf_events may sometimes throttle an event due to creating too many samples during a given timer tick. As of now, the perf tool will not report on throttling, which means this is a silent error. Implement a callback for the throttle and unthrottle events within the Python scripting engine, which can allow scripts to detect and report when events may have been lost due to throttling. The simplest script to report throttle events is: def throttle(*args): print("throttle" + repr(args)) def unthrottle(*args): print("unthrottle" + repr(args)) Signed-off-by: Stephen Brennan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Show a warning for an unknown magic numberJames Clark1-0/+5
Currently perf reports "Cannot allocate memory" which isn't very helpful for a potentially user facing issue. If we add a new magic number in the future, perf will be able to report unrecognised magic numbers. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Print the decoder nameJames Clark3-8/+14
Use the real name of the decoder instead of hard-coding "ETM" to avoid confusion when the trace is ETE. This also now distinguishes between ETMv3 and ETMv4. Reviewed-by: Leo Yan <[email protected]> Reviewed-by: Suzuki Poulouse <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Create ETE decoderJames Clark3-0/+50
If the magic number indicates ETE instantiate a OCSD_BUILTIN_DCD_ETE decoder instead of OCSD_BUILTIN_DCD_ETMV4I. ETE is the new trace feature for Armv9. Testing performed ================= * Old files with v0 and v1 headers for ETMv4 still open correctly * New files with new magic number open on new versions of perf * New files with new magic number fail to open on old versions of perf * Decoding with the ETE decoder results in the same output as the ETMv4 decoder as long as there are no new ETE packet types Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Update OpenCSD decoder for ETEJames Clark1-2/+0
OpenCSD v1.1.1 has a bug fix for the installation of the ETE decoder headers. This also means that including headers separately for each decoder is unnecessary so remove these. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Save TRCDEVARCH registerJames Clark2-2/+24
When ETE is present save the TRCDEVARCH register and set a new magic number. It will be used to configure the decoder in a later commit. Old versions of perf will not be able to open files with this new magic number, but old files will still work with newer versions of perf. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] [ Addressed some cosmetic suggestions by Suzuki Poulouse ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Initialise architecture based on TRCIDR1James Clark1-1/+16
Currently the architecture is hard coded as ARCH_V8, but from ETMv4.4 onwards this should be ARCH_AA64. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-03perf cs-etm: Refactor initialisation of decoder params.James Clark1-74/+25
The initialisation of the decoder params is duplicated between creation of the packet printer and packet decoder. Put them both into one function so that future changes only need to be made in one place. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Acked-by: Suzuki Poulouse <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-01perf evlist: Add evlist__for_each_entry_from() macroRiccardo Mancini1-0/+16
This patch adds a new iteration macro for evlist that resumes iteration from a given evsel in the evlist. This macro will be used in the workqueue series. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/2386505f8b598adf0dbcd04ec21804c6bcf00826.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Handle precise_ip fallback in evsel__open_cpu()Riccardo Mancini2-33/+28
This is another patch in the effort to separate the fallback mechanisms from the open itself. In case of precise_ip fallback, the original precise_ip will be stored in the evsel (it was stored in a local variable) and the open will be retried. Since the precise_ip fallback will be the first in the chain of fallbacks, there should be no functional change with this patch. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/74208c433d2024a6c4af9c0b140b54ed6b5ea810.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Move bpf_counter__install_pe() to success path in evsel__open_cpu()Riccardo Mancini1-2/+2
I don't see why bpf_counter__install_pe() should get called even if fd = -1, so I'm moving it to the success path. This will be useful in following patches to separate the actual open and the related operations from the fallback mechanisms. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Link: http://lore.kernel.org/lkml/64f8a1b0a838a6e6049cd43c1beafd432999ae57.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Move test_attr__open() to success path in evsel__open_cpu()Riccardo Mancini1-5/+5
test_attr__open() ignores the fd if -1, therefore it is safe to move it to the success path (fd >= 0). Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/b3baf11360ca96541c9631730614fd7d217496fc.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Move ignore_missing_thread() to fallback codeRiccardo Mancini2-16/+18
This patch moves ignore_missing_thread outside the perf_event_open loop. Doing so, we need to move the retry_open flag a few places higher, with minimal impact. Furthermore, thread need not be decreased since it won't get increased by the for loop (since we're jumping back inside), but we need to check that the nthreads decrease didn't put thread out of range. The goal is to have fallbacks handled in one place only, since in the future parallel code, these would be handled separately. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/4eca51443c786baaf6811b7cd8e73aafd97f7606.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Separate rlimit increase from evsel__open_cpu()Riccardo Mancini2-20/+33
This is a preparatory patch for the workqueue patches with the goal to separate from evlist__open_cpu() the actual opening (which could be performed in parallel), from the existing fallback mechanisms, which should be handled sequentially. This patch separates the rlimit increase from evsel__open_cpu(). Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/2f256de8ec37b9809a5cef73c2fa7bce416af5d3.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Separate missing feature detection from evsel__open_cpu()Riccardo Mancini2-83/+92
This is a preparatory patch for the workqueue patches with the goal to separate in evlist__open_cpu() the actual opening, which could be performed in parallel, from the existing fallback mechanisms, which should be handled sequentially. This patch separates the missing feature detection in evsel__open_cpu() into a new evsel__detect_missing_features() function. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/cba0b7d939862473662adeedb0f9c9b69566ee9a.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Add evsel__prepare_open()Riccardo Mancini2-0/+16
This function will prepare the evsel and disable the missing features. It will be used in one of the following patches. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/fa5e78bbb92c848226f044278fdcf777b3ce4583.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Separate missing feature disabling from evsel__open_cpuRiccardo Mancini1-26/+31
This is a preparatory patch for the patches in the workqueue series with the goal to separate in evlist__open_cpu() the actual opening, which could be performed in parallel, from the existing fallback mechanisms, which should be handled sequentially. This patch separates the disabling of missing features from evlist__open_cpu() into a new function evsel__disable_missing_features((). Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/48138bd2932646dde315505da733c2ca635ad2ee.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Save open flags in evsel in prepare_open()Riccardo Mancini2-12/+13
This patch caches the flags used in perf_event_open() inside evsel, so that they can be set in __evsel__prepare_open() (this will be useful in patches in the workqueue series, when the fallback mechanisms will be handled outside the open itself). This also optimizes the code, by not having to recompute them everytime. Since flags are now saved in evsel, the flags argument in perf_event_open() is removed. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/d9f63159098e56fa518eecf25171d72e6f74df37.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Separate open preparation from open itselfRiccardo Mancini1-11/+34
This is a preparatory patch for the following patches with the goal to separate in evlist__open_cpu the actual perf_event_open, which could be performed in parallel, from the existing fallback mechanisms, which should be handled sequentially. This patch separates the first lines of evsel__open_cpu into a new __evsel__prepare_open function. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/e14118b934c338dbbf68b8677f20d0d7dbf9359a.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf evsel: Remove retry_sample_id goto labelRiccardo Mancini1-2/+1
As far as I can tell, there is no good reason, apart from optimization to have the retry_sample_id separate from fallback_missing_features. Probably, this label was added to avoid reapplying patches for missing features that had already been applied. However, missing features that have been added later have not used this optimization, always jumping to fallback_missing_features and reapplying all missing features. This patch removes that label, replacing it with fallback_missing_features. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/340af0d03408d6621fd9c742e311db18b3585b3b.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf mmap: Add missing bitops.h headerRiccardo Mancini1-0/+1
MMAP_CPU_MASK_BYTES uses the BITS_TO_LONGS macro, which is defined in linux/bitops.h. However, this header is not included directly, but gets imported indirectly in files using the macro. This patch adds the missing include. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/c5b91ee432a2e28e7f16337c740b43b4d0b0e86c.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf tools: Fix LLVM download hint linkJames Clark1-1/+1
http://llvm.org/apt returns 404, it has moved to https://apt.llvm.org/ Signed-off-by: James Clark <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf tools: Fix LLVM test failure when running in verbose modeJames Clark1-1/+1
A CI system might want to run all tests in verbose mode so that there is enough information to diagnose issues. This LLVM test is the only test that uses "-v" to signify to not skip the test if the preconditions aren't met (LLVM isn't installed). This means that running the test in verbose mode without LLVM installed causes a test failure. For consistency with the other tests, remove this verbose/skip check. An alternate solution would be to make _all_ tests not skip when run in verbose mode, but I don't think that would be intuitive. Also change the search_program() call to search_program_and_warn(). Previously the hint about installing LLVM was only printed by the actual test because this check was skipped in verbose mode. To maintain the old behaviour, the precondition check must also print the full warning. Previous output: $ ./perf test llvm 40: LLVM search and compile : 40.1: Basic BPF llvm compile : Skip $ ./perf test -v llvm 40: LLVM search and compile : 40.1: Basic BPF llvm compile : --- start --- test child forked, pid 2085835 ERROR: unable to find clang. Hint: Try to install latest clang/llvm to support BPF. Check your $PATH ... test child finished with -1 ---- end ---- LLVM search and compile subtest 1: FAILED! New output (non verbose mode is identical, verbose changes from fail to skip): $ ./perf test llvm 40: LLVM search and compile : 40.1: Basic BPF llvm compile : Skip $ ./perf test -v llvm 40: LLVM search and compile : 40.1: Basic BPF llvm compile : --- start --- test child forked, pid 2087680 ERROR: unable to find clang. Hint: Try to install latest clang/llvm to support BPF. Check your $PATH ... No clang, skip this test test child finished with -2 ---- end ---- LLVM search and compile subtest 1: Skip Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf tools: Refactor LLVM test warning for missing binaryJames Clark1-15/+21
The same warning is duplicated in two places so refactor it into a single function "search_program_and_warn". This will be used a third time in a later commit. Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf auxtrace: Add compat_auxtrace_mmap__{read_head|write_tail}Leo Yan2-7/+103
When perf runs in compat mode (kernel in 64-bit mode and the perf is in 32-bit mode), the 64-bit value atomicity in the user space cannot be assured, E.g. on some architectures, the 64-bit value accessing is split into two instructions, one is for the low 32-bit word accessing and another is for the high 32-bit word. This patch introduces weak functions compat_auxtrace_mmap__read_head() and compat_auxtrace_mmap__write_tail(), as their naming indicates, when perf tool works in compat mode, it uses these two functions to access the AUX head and tail. These two functions can allow the perf tool to work properly in certain conditions, e.g. when perf tool works in snapshot mode with only using AUX head pointer, or perf tool uses the AUX buffer and the incremented tail is not bigger than 4GB. When perf tool cannot handle the case when the AUX tail is bigger than 4GB, the function compat_auxtrace_mmap__write_tail() returns -1 and tells the caller to bail out for the error. These two functions are declared as weak attribute, this allows to implement arch specific functions if any arch can support the 64-bit value atomicity in compat mode. Suggested-by: Adrian Hunter <[email protected]> Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Russell King (oracle)" <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf bpf: Fix memory leaks relating to BTF.Ian Rogers2-3/+3
BTF needs to be freed with btf__free(). Signed-off-by: Ian Rogers <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31perf header: Fix spelling mistake "cant'" -> "can't"Colin Ian King1-1/+1
There is a spelling mistake in a warning message. Fix it. Signed-off-by: Colin King <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-30perf config: Fix caching and memory leak in perf_home_perfconfig()Arnaldo Carvalho de Melo1-1/+4
Acaict, perf_home_perfconfig() is supposed to cache the result of home_perfconfig, which returns the default location of perfconfig for the user, given the HOME environment variable. However, the current implementation calls home_perfconfig every time perf_home_perfconfig() is called (so no caching is actually performed), replacing the previous pointer, thus also causing a memory leak. This patch adds a check of whether either config or failed is set and, in that case, directly returns config without calling home_perfconfig at each invocation. Fixes: f5f03e19ce14fc31 ("perf config: Add perf_home_perfconfig function") Signed-off-by: Riccardo Mancini <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] [ Removed needless double check for the 'failed' variable ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-30perf tools: Fixup get_current_dir_name() compilationAlexey Dobriyan1-1/+2
strdup() prototype doesn't live in stdlib.h . Add limits.h for PATH_MAX definition as well. This fixes the build on Android. Signed-off-by: Alexey Dobriyan (SK hynix) <[email protected]> Acked-by: Namhyung Kim <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-24perf tools: Add missing newline at the end of header fileNghia Le1-1/+1
Add missing newline at the end of file parse-sublevel-options.h. Thus removing relevant warning reported by checkpatch. Signed-off-by: Nghia Le <[email protected]> Reviewed-by: Lukas Bulwahn <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tools: Enable on a list of CPUs for hybridJin Yao5-0/+114
The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. This option needs to be supported for hybrid as well. For hybrid support, it needs to check that the cpu list are available on hybrid PMU. One example for AlderLake, cpu0-7 is 'cpu_core', cpu8-11 is 'cpu_atom'. Before: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 Performance counter stats for 'CPU(s) 11': <not supported> cpu_core/cycles/ 1.006179431 seconds time elapsed The 'perf stat' command silently returned "<not supported>" without any helpful information. It should error out pointing out that that cpu11 was not 'cpu_core'. After: # perf stat -e cpu_core/cycles/ -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) failed to use cpu list 11 We also need to support the events without pmu prefix specified. # perf stat -e cycles -C11 -- sleep 1 WARNING: 11 isn't a 'cpu_core', please use a CPU list in the 'cpu_core' range (0-7) Performance counter stats for 'CPU(s) 11': 1,067,373 cpu_atom/cycles/ 1.005544738 seconds time elapsed The perf tool creates two cycles events automatically, cpu_core/cycles/ and cpu_atom/cycles/. It checks that cpu11 is not 'cpu_core', then shows a warning for cpu_core/cycles/ and only count the cpu_atom/cycles/. If part of cpus are 'cpu_core' and part of cpus are 'cpu_atom', for example, # perf stat -e cycles -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 1,914,704 cpu_core/cycles/ 2,036,983 cpu_atom/cycles/ 1.005815641 seconds time elapsed It now automatically selects cpu0 for cpu_core/cycles/, selects cpu11 for cpu_atom/cycles/, and output with some warnings. Some more complex examples, # perf stat -e cycles,instructions -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 0 in 'cpu_core' for 'instructions', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'instructions', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 2,780,387 cpu_core/cycles/ 1,583,432 cpu_atom/cycles/ 3,957,277 cpu_core/instructions/ 1,167,089 cpu_atom/instructions/ 1.006005124 seconds time elapsed # perf stat -e cycles,cpu_atom/instructions/ -C0,11 -- sleep 1 WARNING: use 0 in 'cpu_core' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cycles', skip other cpus in list. WARNING: use 11 in 'cpu_atom' for 'cpu_atom/instructions/', skip other cpus in list. Performance counter stats for 'CPU(s) 0,11': 3,290,301 cpu_core/cycles/ 1,953,073 cpu_atom/cycles/ 1,407,869 cpu_atom/instructions/ 1.006260912 seconds time elapsed Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tools: Create hybrid flag in targetJin Yao2-1/+2
The user may count or collect only on a cpu list via '-C/--cpus' option. Previously cpus for an evsel were retrieved from PMU's sysfs. But if the target cpu list is defined, the retrieved cpus are not kept and the target cpu list is used instead. But for hybrid system, we can't directly use target cpu list. The cpu list may not be available on hybrid pmu (e.g. cpu_core or cpu_atom). So we should not set the 'has_user_cpus' flag for hybrid system. The difficulity is that we can't call perf_pmu__has_hybrid() in evlist.c to check hybrid system otherwise 'perf test python' would be failed (undefined symbol for perf_pmu__has_hybrid). If we add pmu.c to python-ext-sources, too many symbol dependencies are hard to resolve. We use an alternative method by using a new 'hybrid' flag in target for hybrid system checking. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf tests: Add dlfilter testAdrian Hunter2-2/+4
Add a perf test to test the dlfilter C API. A perf.data file is synthesized and then processed by perf script with a dlfilter named dlfilter-test-api-v0.so. Also a C file is compiled to provide a dso to match the synthesized perf.data file. Committer testing: [root@five ~]# perf test dlfilter 72: dlfilter C API : Ok [root@five ~]# perf test -v dlfilter 72: dlfilter C API : --- start --- test child forked, pid 3387712 Checking for gcc Command: gcc --version gcc (GCC) 11.1.1 20210531 (Red Hat 11.1.1-3) Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. dlfilters path: /var/home/acme/libexec/perf-core/dlfilters Command: gcc -g -o /tmp/dlfilter-test-3387712-prog /tmp/dlfilter-test-3387712-prog.c Creating new host machine structure Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 0 --dlarg last start API filter_event_early API filter_event API stop API Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 1 --dlarg last start API filter_event_early API filter_event API stop API Command: /var/home/acme/bin/perf script -i /tmp/dlfilter-test-3387712-perf-data --dlfilter /var/home/acme/libexec/perf-core/dlfilters/dlfilter-test-api-v0.so --dlarg first --dlarg 1 --dlarg 4198669 --dlarg 4198662 --dlarg 2 --dlarg last start API filter_event_early API stop API test child finished with 0 ---- end ---- dlfilter C API: Ok [root@five ~]# Signed-off-by: Adrian Hunter <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11perf build: Move perf_dlfilters.h in the source treeAdrian Hunter2-151/+1
Move perf_dlfilters.h in the source tree so that it will be found when building dlfilters as part of the perf build. Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-10perf pmu: Make pmu_add_sys_aliases() publicJohn Garry2-1/+2
Function pmu_add_sys_aliases() will be required for the PMU events test for system events aliases, so make it public. Signed-off-by: John Garry <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-10perf pmu: Check .is_uncore field in pmu_add_cpu_aliases_map()John Garry1-2/+1
Calling pmu_is_uncore() for fake PMUs does not work, as it checks sysfs for the PMU details (which won't exist). Check .is_uncore field instead, which makes sense anyway. Signed-off-by: John Garry <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf env: Track kernel 64-bit mode in environmentLeo Yan2-1/+26
It's useful to know that the kernel is running in 32-bit or 64-bit mode. E.g. We can decide if perf tool is running in compat mode based on the info. This patch adds an item "kernel_is_64_bit" into session's environment structure perf_env, its value is initialized based on the architecture string. Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Li Huafei <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: russell king <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORTLeo Yan1-5/+0
Since the __sync functions have been dropped, This patch removes unused build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()Leo Yan2-18/+5
Since the function auxtrace_mmap__read_snapshot_head() is exactly same with auxtrace_mmap__read_head(), whether the session is in snapshot mode or not, it's unified to use function auxtrace_mmap__read_head() for reading AUX buffer head. And the function auxtrace_mmap__read_snapshot_head() is unused so this patch removes it. Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Drop legacy __sync functionsLeo Yan1-19/+0
The main purpose for using __sync built-in functions is to support compat mode for 32-bit perf with 64-bit kernel. But using these built-in functions might cause potential issues. __sync functions originally support Intel Itanium processoer [1] but it cannot promise to support all 32-bit archs. Now these functions have become the legacy functions. Considering __sync functions cannot really fix the 64-bit value atomicity on 32-bit archs, thus this patch drops __sync functions. Credits to Peter for detailed analysis. [1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins Suggested-by: Peter Zijlstra <[email protected]> Signed-off-by: Leo Yan <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf auxtrace: Use WRITE_ONCE() for updating aux_tailLeo Yan1-1/+1
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory behaviour. Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Acked-by: Peter Zijlstra <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Díaz <[email protected]> Cc: Frank Ch. Eigler <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sedat Dilek <[email protected]> Cc: Song Liu <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-09perf cs-etm: Add warnings for missing DSOsJames Clark1-1/+10
Currently decode will silently fail if no binary data is available for the decode. This is made worse if only partial data is available because the decode will appear to work, but any trace from that missing DSO will silently not be generated. Add a UI popup once if there is any data missing, and then warn in the bottom left for each individual DSO that's missing. Reviewed-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http //lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski3-13/+42
Build failure in drivers/net/wwan/mhi_wwan_mbim.c: add missing parameter (0, assuming we don't want buffer pre-alloc). Conflict in drivers/net/dsa/sja1105/sja1105_main.c between: 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") 0fac6aa098ed ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode") Follow the instructions from the commit message of the former commit - removed the if conditions. When looking at commit 589918df9322 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too") note that the mask_iotag fields get removed by the following patch. Signed-off-by: Jakub Kicinski <[email protected]>
2021-08-03perf cs-etm: Improve Coresight zero timestamp warningJames Clark1-2/+5
Only show the warning if the user hasn't already set timeless mode and improve the text because there was ambiguity around the meaning of '...' Change the warning to a UI warning instead of printing straight to stderr because this corrupts the UI when perf report TUI is used. The UI warning function also handles printing to stderr when in perf script mode. Suggested-by: Leo Yan <[email protected]> Signed-off-by: James Clark <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-03perf tools: Add flag for tracking warnings of missing DSOsJames Clark1-0/+1
Auxtrace support may need DSOs for decoding (for example Arm Coresight). If one of these is missing it would make sense to warn once for each one that's missing, but not flood the output with every address as there could be thousands of lookups. This flag will allow tracking whether a warning was shown for each DSO. Signed-off-by: James Clark <[email protected]> Reviewed-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-03perf annotate: Add disassembly warnings for annotate --stdioJames Clark1-2/+18
Currently 'perf annotate --stdio' (and --stdio2) will exit without printing anything if there are disassembly errors. Apply the same error handler that's used for TUI and GTK modes. This makes comparing disassembly across the different modes more consistent. Signed-off-by: James Clark <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>