aboutsummaryrefslogtreecommitdiff
path: root/tools/lib/perf
AgeCommit message (Collapse)AuthorFilesLines
2022-04-13perf tools: Fix segfault accessing sample_id xyarrayAdrian Hunter1-2/+1
perf_evsel::sample_id is an xyarray which can cause a segfault when accessed beyond its size. e.g. # perf record -e intel_pt// -C 1 sleep 1 Segmentation fault (core dumped) # That is happening because a dummy event is opened to capture text poke events accross all CPUs, however the mmap logic is allocating according to the number of user_requested_cpus. In general, perf sometimes uses the evsel cpus to open events, and sometimes the evlist user_requested_cpus. However, it is not necessary to determine which case is which because the opened event file descriptors are also in an xyarray, the size of whch can be used to correctly allocate the size of the sample_id xyarray, because there is one ID per file descriptor. Note, in the affected code path, perf_evsel fd array is subsequently used to get the file descriptor for the mmap, so it makes sense for the xyarrays to be the same size there. Fixes: d1a177595b3a824c ("libperf: Adopt perf_evlist__mmap()/munmap() from tools/perf") Fixes: 246eba8e9041c477 ("perf tools: Add support for PERF_RECORD_TEXT_POKE") Signed-off-by: Adrian Hunter <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: [email protected] # 5.5+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-01perf cpumap: More cpu map reuse by merge.Ian Rogers1-10/+5
perf_cpu_map__merge() will reuse one of its arguments if they are equal or the other argument is NULL. The arguments could be reused if it is known one set of values is a subset of the other. For example, a map of 0-1 and a map of just 0 when merged yields the map of 0-1. Currently a new map is created rather than adding a reference count to the original 0-1 map. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Antonov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: John Garry <[email protected]> Cc: KP Singh <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-01perf cpumap: Add is_subset functionIan Rogers2-0/+21
Returns true if the second argument is a subset of the first. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Antonov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: John Garry <[email protected]> Cc: KP Singh <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-01perf evlist: Rename cpus to user_requested_cpusIan Rogers2-15/+20
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps of all evsels. For non-task targets, cpus is set to be cpus requested from the command line, defaulting to all online cpus if no cpus are specified. For an uncore event, all_cpus may be just CPU 0 or every online CPU. This causes all_cpus to have fewer values than the cpus variable which is confusing given the 'all' in the name. To try to make the behavior clearer, rename cpus to user_requested_cpus and add comments on the two struct variables. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Antonov <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: John Garry <[email protected]> Cc: KP Singh <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Will Deacon <[email protected]> Cc: Yonghong Song <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-31Merge tag 'kbuild-v5.18-v2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Add new environment variables, USERCFLAGS and USERLDFLAGS to allow additional flags to be passed to user-space programs. - Fix missing fflush() bugs in Kconfig and fixdep - Fix a minor bug in the comment format of the .config file - Make kallsyms ignore llvm's local labels, .L* - Fix UAPI compile-test for cross-compiling with Clang - Extend the LLVM= syntax to support LLVM=<suffix> form for using a particular version of LLVm, and LLVM=<prefix> form for using custom LLVM in a particular directory path. - Clean up Makefiles * tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: Make $(LLVM) more flexible kbuild: add --target to correctly cross-compile UAPI headers with Clang fixdep: use fflush() and ferror() to ensure successful write to files arch: syscalls: simplify uapi/kapi directory creation usr/include: replace extra-y with always-y certs: simplify empty certs creation in certs/Makefile certs: include certs/signing_key.x509 unconditionally kallsyms: ignore all local labels prefixed by '.L' kconfig: fix missing '# end of' for empty menu kconfig: add fflush() before ferror() check kbuild: replace $(if A,A,B) with $(or A,B) kbuild: Add environment variables for userprogs flags kbuild: unify cmd_copy and cmd_shipped
2022-03-27Merge tag 'perf-tools-for-v5.18-2022-03-26' of ↵Linus Torvalds6-23/+77
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools updates from Arnaldo Carvalho de Melo: "New features: perf ftrace: - Add -n/--use-nsec option to the 'latency' subcommand. Default: usecs: $ sudo perf ftrace latency -T dput -a sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 2098375 | ############################# | 1 - 2 us | 61 | | 2 - 4 us | 33 | | 4 - 8 us | 13 | | 8 - 16 us | 124 | | 16 - 32 us | 123 | | 32 - 64 us | 1 | | 64 - 128 us | 0 | | 128 - 256 us | 1 | | 256 - 512 us | 0 | | Better granularity with nsec: $ sudo perf ftrace latency -T dput -a -n sleep 1 # DURATION | COUNT | GRAPH | 0 - 1 us | 0 | | 1 - 2 ns | 0 | | 2 - 4 ns | 0 | | 4 - 8 ns | 0 | | 8 - 16 ns | 0 | | 16 - 32 ns | 0 | | 32 - 64 ns | 0 | | 64 - 128 ns | 1163434 | ############## | 128 - 256 ns | 914102 | ############# | 256 - 512 ns | 884 | | 512 - 1024 ns | 613 | | 1 - 2 us | 31 | | 2 - 4 us | 17 | | 4 - 8 us | 7 | | 8 - 16 us | 123 | | 16 - 32 us | 83 | | perf lock: - Add -c/--combine-locks option to merge lock instances in the same class into a single entry. # perf lock report -c Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns) rcu_read_lock 251225 0 0 0 0 0 hrtimer_bases.lock 39450 0 0 0 0 0 &sb->s_type->i_l... 10301 1 662 662 662 662 ptlock_ptr(page) 10173 2 701 1402 760 642 &(ei->i_block_re... 8732 0 0 0 0 0 &xa->xa_lock 8088 0 0 0 0 0 &base->lock 6705 0 0 0 0 0 &p->pi_lock 5549 0 0 0 0 0 &dentry->d_lockr... 5010 4 1274 5097 1844 789 &ep->lock 3958 0 0 0 0 0 - Add -F/--field option to customize the list of fields to output: $ perf lock report -F contended,wait_max -k avg_wait Name contended max wait(ns) avg wait(ns) slock-AF_INET6 1 23543 23543 &lruvec->lru_lock 5 18317 11254 slock-AF_INET6 1 10379 10379 rcu_node_1 1 2104 2104 &dentry->d_lockr... 1 1844 1844 &dentry->d_lockr... 1 1672 1672 &newf->file_lock 15 2279 1025 &dentry->d_lockr... 1 792 792 - Add --synth=no option for record, as there is no need to symbolize, lock names comes from the tracepoints. perf record: - Threaded recording, opt-in, via the new --threads command line option. - Improve AMD IBS (Instruction-Based Sampling) error handling messages. perf script: - Add 'brstackinsnlen' field (use it with -F) for branch stacks. - Output branch sample type in 'perf script'. perf report: - Add "addr_from" and "addr_to" sort dimensions. - Print branch stack entry type in 'perf report --dump-raw-trace' - Fix symbolization for chrooted workloads. Hardware tracing: Intel PT: - Add CFE (Control Flow Event) and EVD (Event Data) packets support. - Add MODE.Exec IFLAG bit support. Explanation about these features from the "Intel® 64 and IA-32 architectures software developer’s manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at: https://cdrdv2.intel.com/v1/dl/getContent/671200 At page 3951: "32.2.4 Event Trace is a capability that exposes details about the asynchronous events, when they are generated, and when their corresponding software event handler completes execution. These include: o Interrupts, including NMI and SMI, including the interrupt vector when defined. o Faults, exceptions including the fault vector. - Page faults additionally include the page fault address, when in context. o Event handler returns, including IRET and RSM. o VM exits and VM entries.¹ - VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields. INIT and SIPI events. o TSX aborts, including the abort status returned for the RTM instructions. o Shutdown. Additionally, it provides indication of the status of the Interrupt Flag (IF), to indicate when interrupts are masked" ARM CoreSight: - Use advertised caps/min_interval as default sample_period on ARM spe. - Update deduction of TRCCONFIGR register for branch broadcast on ARM's CoreSight ETM. Vendor Events (JSON): Intel: - Update events and metrics for: Alderlake, Broadwell, Broadwell DE, BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont, GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX, Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP, Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX, Tigerlake, TremontX, Westmere EP-SP, and Westmere EX. ARM: - Add support for HiSilicon CPA PMU aliasing. perf stat: - Fix forked applications enablement of counters. - The 'slots' should only be printed on a different order than the one specified on the command line when 'topdown' events are present, fix it. Miscellaneous: - Sync msr-index, cpufeatures header files with the kernel sources. - Stop using some deprecated libbpf APIs in 'perf trace'. - Fix some spelling mistakes. - Refactor the maps pointers usage to pave the way for using refcount debugging. - Only offer the --tui option on perf top, report and annotate when perf was built with libslang. - Don't mention --to-ctf in 'perf data --help' when not linking with the required library, libbabeltrace. - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by array_size.cocci. - Enhance the matching of sub-commands abbreviations: 'perf c2c rec' -> 'perf c2c record' 'perf c2c recport -> error - Set build-id using build-id header on new mmap records. - Fix generation of 'perf --version' string. perf test: - Add test for the arm_spe event. - Add test to check unwinding using fame-pointer (fp) mode on arm64. - Make metric testing more robust in 'perf test'. - Add error message for unsupported branch stack cases. libperf: - Add API for allocating new thread map array. - Fix typo in perf_evlist__open() failure error messages in libperf tests. perf c2c: - Replace bitmap_weight() with bitmap_empty() where appropriate" * tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits) perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages perf python: Add perf_env stubs that will be needed in evsel__open_strerror() perf tools: Enhance the matching of sub-commands abbreviations libperf tests: Fix typo in perf_evlist__open() failure error messages tools arm64: Import cputype.h perf lock: Add -F/--field option to control output perf lock: Extend struct lock_key to have print function perf lock: Add --synth=no option for record tools headers cpufeatures: Sync with the kernel sources tools headers cpufeatures: Sync with the kernel sources perf stat: Fix forked applications enablement of counters tools arch x86: Sync the msr-index.h copy with the kernel sources perf evsel: Make evsel__env() always return a valid env perf build-id: Fix spelling mistake "Cant" -> "Can't" perf header: Fix spelling mistake "could't" -> "couldn't" perf script: Add 'brstackinsnlen' for branch stacks perf parse-events: Move slots only with topdown perf ftrace latency: Update documentation perf ftrace latency: Add -n/--use-nsec option perf tools: Fix version kernel tag ...
2022-03-26libperf tests: Fix typo in perf_evlist__open() failure error messagesShunsuke Nakamura1-4/+4
This patch corrects typos in error messages. I should be "evlist", not "evsel" as the function that fails is perf_evlist__open(). Fixes: 3ce311afb5583cf3 ("libperf: Move to tools/lib/perf") Fixes: a7f3713f6bf207e6 ("libperf tests: Add test_stat_multiplexing test") Signed-off-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-03-24Merge tag 'flexible-array-transformations-5.18-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux Pull flexible-array transformations from Gustavo Silva: "Treewide patch that replaces zero-length arrays with flexible-array members. This has been baking in linux-next for a whole development cycle" * tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: treewide: Replace zero-length arrays with flexible-array members
2022-02-23libperf: Add API for allocating new thread map arrayTzvetomir Stoyanov (VMware)5-7/+61
The existing API perf_thread_map__new_dummy() allocates new thread map for one thread. I couldn't find a way to reallocate the map with more threads, or to allocate a new map for more than one thread. Having multiple threads in a thread map is essential for some use cases. That's why a new API is proposed, which allocates a new thread map for given number of threads: perf_thread_map__new_array() Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Link: https://lore.kernel.org/linux-perf-users/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-23libperf: Rename arguments of perf_thread_map APIsTzvetomir Stoyanov (VMware)3-12/+12
The "int thread" input arguments of some perf_thead_map APIs are index of the thread in the thread map. In order to avoid confusion and to make the APIs consistent with perf_cpu_map APIs, those arguments are renamed to "int idx". Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Tzvetomir Stoyanov (VMware) <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-17treewide: Replace zero-length arrays with flexible-array membersGustavo A. R. Silva1-1/+1
There is a regular need in the kernel to provide a way to declare having a dynamically sized set of trailing elements in a structure. Kernel code should always use “flexible array members”[1] for these cases. The older style of one-element or zero-length arrays should no longer be used[2]. This code was transformed with the help of Coccinelle: (next-20220214$ spatch --jobs $(getconf _NPROCESSORS_ONLN) --sp-file script.cocci --include-headers --dir . > output.patch) @@ identifier S, member, array; type T1, T2; @@ struct S { ... T1 member; T2 array[ - 0 ]; }; UAPI and wireless changes were intentionally excluded from this patch and will be sent out separately. [1] https://en.wikipedia.org/wiki/Flexible_array_member [2] https://www.kernel.org/doc/html/v5.16/process/deprecated.html#zero-length-and-one-element-arrays Link: https://github.com/KSPP/linux/issues/78 Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Gustavo A. R. Silva <[email protected]>
2022-02-16libperf: Fix perf_cpu_map__for_each_cpu macroJiri Olsa4-5/+18
Tzvetomir Stoyanov reported an issue with using macro perf_cpu_map__for_each_cpu using private perf_cpu object. The issue is caused by recent change that wrapped cpu in struct perf_cpu to distinguish it from cpu indexes. We need to make struct perf_cpu public. Add a simple test for using the perf_cpu_map__for_each_cpu macro. Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type") Reported-by: Tzvetomir Stoyanov (VMware) <[email protected]> Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-16libperf: Fix 32-bit build for tests uint64_t printfRob Herring1-2/+3
Commit a7f3713f6bf207e6 ("libperf tests: Add test_stat_multiplexing test") added printf's of 64-bit ints using %lu which doesn't work on 32-bit builds: tests/test-evlist.c:529:29: error: format ‘%lu’ expects argument of type \ ‘long unsigned int’, but argument 4 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=] Use PRIu64 instead which works on both 32-bit and 64-bit systems. Fixes: a7f3713f6bf207e6 ("libperf tests: Add test_stat_multiplexing test") Signed-off-by: Rob Herring <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-02-15kbuild: replace $(if A,A,B) with $(or A,B)Masahiro Yamada1-1/+1
$(or ...) is available since GNU Make 3.81, and useful to shorten the code in some places. Covert as follows: $(if A,A,B) --> $(or A,B) This patch also converts: $(if A, A, B) --> $(or A, B) Strictly speaking, the latter is not an equivalent conversion because GNU Make keeps spaces after commas; if A is not empty, $(if A, A, B) expands to " A", while $(or A, B) expands to "A". Anyway, preceding spaces are not significant in the code hunks I touched. Signed-off-by: Masahiro Yamada <[email protected]> Reviewed-by: Nicolas Schier <[email protected]>
2022-02-06libperf: Add arm64 support to perf_mmap__read_self()Rob Herring2-1/+102
Add the arm64 variants for read_perf_counter() and read_timestamp(). Unfortunately the counter number is encoded into the instruction, so the code is a bit verbose to enumerate all possible counters. Tested-by: Masayoshi Mizuma <[email protected]> Signed-off-by: Rob Herring <[email protected]> Acked-by: Jiri Olsa <[email protected]> Tested-by: John Garry <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected]
2022-01-22perf cpumap: Migrate to libperf cpumap apiIan Rogers1-2/+2
Switch from directly accessing the perf_cpu_map to using the appropriate libperf API when possible. Using the API simplifies the job of refactoring use of perf_cpu_map. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexey Bayduraev <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Andrew Morton <[email protected]> Cc: André Almeida <[email protected]> Cc: Andy Shevchenko <[email protected]> Cc: Darren Hart <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Dmitriy Vyukov <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: German Gomez <[email protected]> Cc: James Clark <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Masami Hiramatsu <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Cc: Song Liu <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Stephen Brennan <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Yury Norov <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-14libperf tests: Update a use of the new cpumap APIIan Rogers1-2/+3
Fixes a build breakage. Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type") Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: colin ian king <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12perf cpumap: Give CPUs their own typeIan Rogers9-63/+85
A common problem is confusing CPU map indices with the CPU, by wrapping the CPU with a struct then this is avoided. This approach is similar to atomic_t. Committer notes: To make it build with BUILD_BPF_SKEL=1 these files needed the conversions to 'struct perf_cpu' usage: tools/perf/util/bpf_counter.c tools/perf/util/bpf_counter_cgroup.c tools/perf/util/bpf_ftrace.c Also perf_env__get_cpu() was removed back in "perf cpumap: Switch cpu_map__build_map to cpu function". Additionally these needed to be fixed for the ARM builds to complete: tools/perf/arch/arm/util/cs-etm.c tools/perf/arch/arm64/util/pmu.c Suggested-by: John Garry <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12libperf: Sync evsel documentationIan Rogers1-5/+5
cpu was renamed cpu_map_idx, for clarity. Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12libperf: Allow NULL in perf_cpu_map__idx()Ian Rogers1-1/+6
Return -1, not found, if NULL is passed. Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12libperf: Use cpu not index for evsel mmapIan Rogers1-1/+2
Fix issue where evsel's CPU map index was being used as the mmap cpu. Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12libperf: Switch cpu to more accurate cpu_map_idxIan Rogers2-48/+50
Modify variable names and adopt perf_cpu_map__for_each_cpu() in perf_evsel__open(). Renaming is done by looking for consistency in API usage. Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12perf cpumap: Move 'has' function to libperfIan Rogers5-2/+10
Make the cpu map argument const for consistency with the rest of the API. Modify cpu_map__idx accordingly. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-01-12libperf: Add comments to 'struct perf_cpu_map'Ian Rogers1-0/+9
A particular observed problem is confusing the index with the CPU value, documentation should hopefully reduce this type of problem. Reviewed-by: James Clark <[email protected]> Reviewed-by: John Garry <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Vineet Singh <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-16libperf tests: Fix a spelling mistake "Runnnig" -> "Running"Colin Ian King1-1/+1
There is a spelling mistake in a __T_VERBOSE message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07perf evlist: Allow setting arbitrary leaderIan Rogers2-7/+10
The leader of a group is the first, but allow it to be an arbitrary list member so that for Intel topdown events slots may always be the group leader. Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Clarke <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Riccardo Mancini <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Vineet Singh <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf tests: Add test_stat_multiplexing testShunsuke Nakamura1-0/+157
Adds a test for a counter obtained using read() system call during multiplexing. $ sudo make tests -C ./tools/lib/perf/ V=1 make: Entering directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=libperf make -C /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/api/ O= libapi.a make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fd obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./fs obj=libapi make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=. obj=tests make -f /home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/build/Makefile.build dir=./tests obj=tests running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... Event 0 -- Raw count = 298049842, run = 270269503, enable = 456262127 Scaled count = 503160191 (59.24%, 270269503/456262127) Event 1 -- Raw count = 299134173, run = 271075173, enable = 456257234 Scaled count = 503484435 (59.41%, 271075173/456257234) Event 2 -- Raw count = 300461996, run = 272069283, enable = 456253417 Scaled count = 503867290 (59.63%, 272069283/456253417) Event 3 -- Raw count = 301308704, run = 273063387, enable = 456249352 Scaled count = 503443183 (59.85%, 273063387/456249352) Event 4 -- Raw count = 302531164, run = 274102932, enable = 456244712 Scaled count = 503563543 (60.08%, 274102932/456244712) Event 5 -- Raw count = 303710254, run = 275406214, enable = 456228165 Scaled count = 503115633 (60.37%, 275406214/456228165) Event 6 -- Raw count = 304531302, run = 276396076, enable = 456221130 Scaled count = 502661313 (60.58%, 276396076/456221130) Event 7 -- Raw count = 304486460, run = 276601890, enable = 456213754 Scaled count = 502205212 (60.63%, 276601890/456213754) Event 8 -- Raw count = 304116681, run = 276631326, enable = 456205562 Scaled count = 501532936 (60.64%, 276631326/456205562) Event 9 -- Raw count = 303567766, run = 276188567, enable = 456196839 Scaled count = 501420666 (60.54%, 276188567/456196839) Event 10 -- Raw count = 302238014, run = 275144001, enable = 456185300 Scaled count = 501106833 (60.31%, 275144001/456185300) Event 11 -- Raw count = 300805716, run = 273824589, enable = 456175608 Scaled count = 501124573 (60.03%, 273824589/456175608) Event 12 -- Raw count = 299959051, run = 272834556, enable = 456166593 Scaled count = 501517477 (59.81%, 272834556/456166593) Event 13 -- Raw count = 299037090, run = 271820805, enable = 456157086 Scaled count = 501830195 (59.59%, 271820805/456157086) Event 14 -- Raw count = 298327042, run = 270784311, enable = 456147546 Scaled count = 502544433 (59.36%, 270784311/456147546) Expected: 501614268 High: 503867290 Low: 298049842 Average: 502438527 Average Error = 0.16% OK - running tests/test-evsel.c... loop = 65536, count = 328182 loop = 131072, count = 660214 loop = 262144, count = 1315534 loop = 524288, count = 2635364 loop = 1048576, count = 5271971 loop = 65536, count = 491952 loop = 131072, count = 850061 loop = 262144, count = 1648608 loop = 524288, count = 3162059 loop = 1048576, count = 6353393 OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c... Event 0 -- Raw count = 300218292, run = 297528154, enable = 496789343 Scaled count = 501281125 (59.89%, 297528154/496789343) Event 1 -- Raw count = 301438606, run = 298515328, enable = 496784768 Scaled count = 501649643 (60.09%, 298515328/496784768) Event 2 -- Raw count = 302342618, run = 298798983, enable = 496782015 Scaled count = 502673648 (60.15%, 298798983/496782015) Event 3 -- Raw count = 303132319, run = 299230407, enable = 496778508 Scaled count = 503256412 (60.23%, 299230407/496778508) Event 4 -- Raw count = 302758195, run = 299218047, enable = 496774243 Scaled count = 502651743 (60.23%, 299218047/496774243) Event 5 -- Raw count = 303158458, run = 299204274, enable = 496769146 Scaled count = 503334281 (60.23%, 299204274/496769146) Event 6 -- Raw count = 303471397, run = 299197479, enable = 496763124 Scaled count = 503859189 (60.23%, 299197479/496763124) Event 7 -- Raw count = 303583387, run = 299196861, enable = 496756458 Scaled count = 504039405 (60.23%, 299196861/496756458) Event 8 -- Raw count = 303096897, run = 299186924, enable = 496748667 Scaled count = 503240507 (60.23%, 299186924/496748667) Event 9 -- Raw count = 301424173, run = 297845086, enable = 496739994 Scaled count = 502709122 (59.96%, 297845086/496739994) Event 10 -- Raw count = 300876415, run = 296851339, enable = 496729034 Scaled count = 503464297 (59.76%, 296851339/496729034) Event 11 -- Raw count = 300239338, run = 296547963, enable = 496719538 Scaled count = 502902612 (59.70%, 296547963/496719538) Event 12 -- Raw count = 299751948, run = 296547195, enable = 496710036 Scaled count = 502077926 (59.70%, 296547195/496710036) Event 13 -- Raw count = 299341883, run = 296549981, enable = 496700423 Scaled count = 501376663 (59.70%, 296549981/496700423) Event 14 -- Raw count = 299145476, run = 296561684, enable = 496690949 Scaled count = 501018366 (59.71%, 296561684/496690949) Expected: 501669431 High: 504039405 Low: 300218292 Average: 502635662 Average Error = 0.19% OK - running tests/test-evsel.c... loop = 65536, count = 329275 loop = 131072, count = 664638 loop = 262144, count = 1315367 loop = 524288, count = 2629617 loop = 1048576, count = 5273657 loop = 65536, count = 459641 loop = 131072, count = 978402 loop = 262144, count = 1581219 loop = 524288, count = 3774908 loop = 1048576, count = 7694417 OK make: Leaving directory '/home/nakamura/build_work/build_kernel/linux_kernel/linux/tools/lib/perf' Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf: Remove scaling process from perf_mmap__read_self()Shunsuke Nakamura1-2/+0
Remove the scaling process from perf_mmap__read_self(), and unify the counters that can be obtained from perf_evsel__read() to "no scaling". Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-07libperf: Adopt perf_counts_values__scale() from tools/perf/utilShunsuke Nakamura3-0/+24
Move perf_counts_values__scale() from tools/perf/util to tools/lib/perf so that it can be used with libperf. Committer notes: As noted by Jiri, use __s8 instead of s8 on the exported function. Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-26Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo2-6/+7
To pick up the fixes from upstream. Fix simple conflict on session.c related to the file position fix that went upstream and is touched by the active decomp changes in perf/core. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-20perf tools: Add support for PERF_RECORD_AUX_OUTPUT_HW_IDAdrian Hunter1-0/+6
The PERF_RECORD_AUX_OUTPUT_HW_ID event provides a way to match AUX output data like Intel PT PEBS-via-PT back to the event that it came from, by providing a hardware ID that is present in the AUX output. Reviewed-by: Alexander Shishkin <[email protected]> Reviewed-by: Andi Kleen <[email protected]> Signed-off-by: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-14libperf tests: Fix test_stat_cpuShunsuke Nakamura2-6/+6
The `cpu` argument of perf_evsel__read() must specify the cpu index. perf_cpu_map__for_each_cpu() is for iterating the cpu number (not index) and is thus not appropriate for use with perf_evsel__read(). So, if there is an offline CPU, the cpu number specified in the argument may point out of range because the cpu number and the cpu index are different. Fix test_stat_cpu(). Testing it: # make tests -C tools/lib/perf/ make: Entering directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf' running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK make: Leaving directory '/home/nakamura/kernel_src/linux-5.15-rc4_fix/tools/lib/perf' Signed-off-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-14libperf test evsel: Fix build error on !x86 architecturesShunsuke Nakamura1-0/+1
In test_stat_user_read, following build error occurs except i386 and x86_64 architectures: tests/test-evsel.c:129:31: error: variable 'pc' set but not used [-Werror=unused-but-set-variable] struct perf_event_mmap_page *pc; Fix build error. Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-08libperf cpumap: Use binary search in perf_cpu_map__idx() as array are sortedRiccardo Mancini1-4/+12
Since 7074674e7338863e ("perf cpumap: Maintain cpumaps ordered and without dups") perf_cpu_map elements are sorted in ascending order. This patch improves perf_cpu_map__idx() by using a binary search. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/f1543c15797169c21e8b205a4a6751159180580d.1629490974.git.rickyman7@gmail.com [ Removed 'else' after if + return, declared variables where needed ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-09-18libperf evsel: Make use of FD robust.Ian Rogers1-23/+41
FD uses xyarray__entry that may return NULL if an index is out of bounds. If NULL is returned then a segv happens as FD unconditionally dereferences the pointer. This was happening in a case of with perf iostat as shown below. The fix is to make FD an "int*" rather than an int and handle the NULL case as either invalid input or a closed fd. $ sudo gdb --args perf stat --iostat list ... Breakpoint 1, perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50 50 { (gdb) bt #0 perf_evsel__alloc_fd (evsel=0x5555560951a0, ncpus=1, nthreads=1) at evsel.c:50 #1 0x000055555585c188 in evsel__open_cpu (evsel=0x5555560951a0, cpus=0x555556093410, threads=0x555556086fb0, start_cpu=0, end_cpu=1) at util/evsel.c:1792 #2 0x000055555585cfb2 in evsel__open (evsel=0x5555560951a0, cpus=0x0, threads=0x555556086fb0) at util/evsel.c:2045 #3 0x000055555585d0db in evsel__open_per_thread (evsel=0x5555560951a0, threads=0x555556086fb0) at util/evsel.c:2065 #4 0x00005555558ece64 in create_perf_stat_counter (evsel=0x5555560951a0, config=0x555555c34700 <stat_config>, target=0x555555c2f1c0 <target>, cpu=0) at util/stat.c:590 #5 0x000055555578e927 in __run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0) at builtin-stat.c:833 #6 0x000055555578f3c6 in run_perf_stat (argc=1, argv=0x7fffffffe4a0, run_idx=0) at builtin-stat.c:1048 #7 0x0000555555792ee5 in cmd_stat (argc=1, argv=0x7fffffffe4a0) at builtin-stat.c:2534 #8 0x0000555555835ed3 in run_builtin (p=0x555555c3f540 <commands+288>, argc=3, argv=0x7fffffffe4a0) at perf.c:313 #9 0x0000555555836154 in handle_internal_command (argc=3, argv=0x7fffffffe4a0) at perf.c:365 #10 0x000055555583629f in run_argv (argcp=0x7fffffffe2ec, argv=0x7fffffffe2e0) at perf.c:409 #11 0x0000555555836692 in main (argc=3, argv=0x7fffffffe4a0) at perf.c:539 ... (gdb) c Continuing. Error: The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (uncore_iio_0/event=0x83,umask=0x04,ch_mask=0xF,fc_mask=0x07/). /bin/dmesg | grep -i perf may provide additional information. Program received signal SIGSEGV, Segmentation fault. 0x00005555559b03ea in perf_evsel__close_fd_cpu (evsel=0x5555560951a0, cpu=1) at evsel.c:166 166 if (FD(evsel, cpu, thread) >= 0) v3. fixes a bug in perf_evsel__run_ioctl where the sense of a branch was backward. Signed-off-by: Ian Rogers <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-31libperf cpumap: Take into advantage it is sorted to optimize perf_cpu_map__max()Riccardo Mancini1-8/+2
From commit 7074674e7338863e ("perf cpumap: Maintain cpumaps ordered and without dups"), perf_cpu_map elements are sorted in ascending order. This patch improves the perf_cpu_map__max function by returning the last element. Committer notes: Do it as a ternary to keep it in just one return line, add a comment explaining it is sorted and what functions does it. Signed-off-by: Riccardo Mancini <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/fb79f02e7b86ea8044d563adb1e9890c906f982f.1629490974.git.rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-24libperf tests: Fix verbose printingShunsuke Nakamura1-0/+2
libperf's verbose printing checks the -v option every time the macro _T_ START is called. Since there are currently four libperf tests registered, the macro _T_ START is called four times, but verbose printing after the second time is not output. Resets the index of the element processed by getopt() and fix verbose printing so that it prints in all tests. Signed-off-by: Shunsuke Nakamura <[email protected]> Acked-by: Rob Herring <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-08-11libperf: Add perf_cpu_map__default_new()Jin Yao2-0/+6
libperf already has a static function called 'cpu_map__default_new()'. Add a new API perf_cpu_map__default_new() to export the function. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jin Yao <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https //lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Add tests for perf_evlist__set_leader()Jiri Olsa1-6/+21
Add a test for the newly added perf_evlist__set_leader() function. Committer testing: $ cd tools/lib/perf/ $ sudo make tests [sudo] password for acme: running static: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK running dynamic: - running tests/test-cpumap.c...OK - running tests/test-threadmap.c...OK - running tests/test-evlist.c...OK - running tests/test-evsel.c...OK $ Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Remove BUG_ON() from library code in get_group_fd()Arnaldo Carvalho de Melo1-7/+16
We shouldn't just panic, return a value that doesn't clash with what perf_evsel__open() was already returning in case of error, i.e. errno when sys_perf_event_open() fails. Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Add group support to perf_evsel__open()Jiri Olsa1-2/+24
Add support to set group_fd in perf_evsel__open() and make it follow the group setup. Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()Jiri Olsa4-0/+24
Move the implementation of evlist__set_leader() to a new libperf perf_evlist__set_leader() function with the same functionality make it a libperf exported API. Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Move 'nr_groups' from tools/perf to evlist::nr_groupsJiri Olsa1-0/+1
Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group interface to libperf. Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Move 'leader' from tools/perf to perf_evsel::leaderJiri Olsa2-0/+2
Move evsel::leader to perf_evsel::leader, so we can move the group interface to libperf. Also add several evsel helpers to ease up the transition: struct evsel *evsel__leader(struct evsel *evsel); - get leader evsel bool evsel__has_leader(struct evsel *evsel, struct evsel *leader); - true if evsel has leader as leader bool evsel__is_leader(struct evsel *evsel); - true if evsel is itw own leader void evsel__set_leader(struct evsel *evsel, struct evsel *leader); - set leader for evsel Committer notes: Fix this when building with 'make BUILD_BPF_SKEL=1' tools/perf/util/bpf_counter.c - if (evsel->leader->core.nr_members > 1) { + if (evsel->core.leader->nr_members > 1) { Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-09libperf: Move 'idx' from tools/perf to perf_evsel::idxJiri Olsa3-3/+8
Move evsel::idx to perf_evsel::idx, so we can move the group interface to libperf. Committer notes: Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that appeared in my tree in my local tree. Also fixed up these: $ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx' tools/perf/ui/gtk/annotate.c: evsel->idx + i); tools/perf/ui/gtk/annotate.c: evsel->idx); $ That running 'make -C tools/perf build-test' caught. Signed-off-by: Jiri Olsa <[email protected]> Requested-by: Shunsuke Nakamura <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-07-07libperf: Change tests to single static and shared binariesJiri Olsa11-51/+67
Make tests to be two binaries 'tests_static' and 'tests_shared', so the maintenance is easier. Adding tests under libperf build system, so we define all the flags just once. Adding make-tests tule to just compile tests without running them. Signed-off-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Petlan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Shunsuke Nakamura <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf jit: Let convert_timestamp() to be backwards-compatibleLeo Yan1-0/+2
Commit d110162cafc80dad ("perf tsc: Support cap_user_time_short for event TIME_CONV") supports the extended parameters for event TIME_CONV, but it broke the backwards compatibility, so any perf data file with old event format fails to convert timestamp. This patch introduces a helper event_contains() to check if an event contains a specific member or not. For the backwards-compatibility, if the event size confirms the extended parameters are supported in the event TIME_CONV, then copies these parameters. Committer notes: To make this compiler backwards compatible add this patch: - struct perf_tsc_conversion tc = { 0 }; + struct perf_tsc_conversion tc = { .time_shift = 0, }; Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf tools: Change fields type in perf_record_time_convLeo Yan1-2/+3
C standard claims "An object declared as type _Bool is large enough to store the values 0 and 1", bool type size can be 1 byte or larger than 1 byte. Thus it's uncertian for bool type size with different compilers. This patch changes the bool type in structure perf_record_time_conv to __u8 type, and pads extra bytes for 8-byte alignment; this can give reliable structure size. Fixes: d110162cafc8 ("perf tsc: Support cap_user_time_short for event TIME_CONV") Suggested-by: Adrian Hunter <[email protected]> Signed-off-by: Leo Yan <[email protected]> Acked-by: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steve MacLean <[email protected]> Cc: Yonatan Goldschmidt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-29perf util: Move bpf_perf definitions to a libperf headerSong Liu1-0/+31
By following the same protocol, other tools can share hardware PMCs with perf. Move perf_event_attr_map_entry and BPF_PERF_DEFAULT_ATTR_MAP_PATH to bpf_perf.h for other tools to use. Signed-off-by: Song Liu <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-04-20libperf xyarray: Add bounds checks to xyarray__entry()Rob Herring1-1/+8
xyarray__entry() is missing any bounds checking yet often the x and y parameters come from external callers. Add bounds checks and an unchecked __xyarray__entry(). Committer notes: Make the 'x' and 'y' arguments to the new xyarray__entry() that does bounds check to be of type 'size_t', so that we cover also the case where 'x' and 'y' could be negative, which is needed anyway as having them as 'int' breaks the build with: /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h: In function ‘xyarray__entry’: /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h:28:8: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 28 | if (x >= xy->max_x || y >= xy->max_y) | ^~ /home/acme/git/perf/tools/lib/perf/include/internal/xyarray.h:28:26: error: comparison of integer expressions of different signedness: ‘int’ and ‘size_t’ {aka ‘long unsigned int’} [-Werror=sign-compare] 28 | if (x >= xy->max_x || y >= xy->max_y) | ^~ cc1: all warnings being treated as errors Signed-off-by: Rob Herring <[email protected]> Suggested-by: Arnaldo Carvalho de Melo <[email protected]> Suggested-by: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>