path: root/tools
Age | Commit message | Author | Files | Lines
2022-10-06 | perf mem/c2c: Avoid printing empty lines for unsupported events | Ravi Bangoria | 1 | -5/+6
The 'perf mem' and 'perf c2c' tools can be used with three different events: load, store and combined load-store. Some architectures support only a subset of these events, in which case perf prints an empty line for each unsupported event. Avoid that.

For example, AMD Zen CPUs support only the combined load-store event and do not support the individual load and store events.

Before patch:

  $ perf mem record -e list


  mem-ldst : available
  $

After patch:

  $ perf mem record -e list
  mem-ldst : available
  $

Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
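The idea of skipping unsupported events while listing can be sketched in C roughly as follows. This is only an illustration and not the code from the patch: `struct mem_event`, `list_mem_events()`, the NULL-tag convention and the output padding are all hypothetical names and choices made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a per-architecture memory event table entry. */
struct mem_event {
    const char *tag;   /* NULL when the architecture lacks this event */
    bool available;    /* whether the event could be opened */
};

/*
 * Format one line per supported event into buf and return the number of
 * lines written. Entries without a tag are skipped entirely, so no empty
 * line is produced for them.
 */
int list_mem_events(const struct mem_event *ev, int n, char *buf, size_t len)
{
    size_t off = 0;
    int printed = 0;

    assert(ev && buf && len > 0);
    buf[0] = '\0';

    for (int i = 0; i < n; i++) {
        int ret;

        if (!ev[i].tag)
            continue;   /* unsupported event: print nothing at all */

        ret = snprintf(buf + off, len - off, "%-13s: %s\n", ev[i].tag,
                       ev[i].available ? "available" : "unavailable");
        if (ret < 0 || (size_t)ret >= len - off) {
            buf[off] = '\0';   /* drop the partially written line */
            break;
        }
        off += (size_t)ret;
        printed++;
    }
    return printed;
}
```

With a table that carries only a combined load-store entry, as described for AMD Zen above, the function yields a single line and no blank ones.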
2022-10-06 | perf mem/c2c: Add load store event mappings for AMD | Ravi Bangoria | 3 | -7/+41
The 'perf mem' and 'perf c2c' tools are wrappers around 'perf record' with mem load/store events. IBS tagged load/store sample provides most of the information needed for these tools. Wire in the "ibs_op//" event as mem-ldst event for AMD.

There are some limitations though:

Only load/store micro-ops provide mem/c2c information. Whereas, IBS does not have a way to choose a particular type of micro-op to tag. This results in many non-LS micro-ops being tagged which appear as N/A in the perf report.

IBS, being an uncore pmu from kernel point of view[1], does not support per process monitoring. Thus, perf mem/c2c on AMD are currently supported in per-cpu mode only.

Example:

  $ sudo perf mem record -- -c 10000
  ^C[ perf record: Woken up 227 times to write data ]
  [ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]

  $ sudo perf mem report -F mem,sample,snoop
  Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762

  Memory access                  Samples  Snoop
  N/A                             700620  N/A
  L1 hit                          126675  N/A
  L2 hit                             424  N/A
  L3 hit                             664  HitM
  L3 hit                              10  N/A
  Local RAM hit                        2  N/A
  Remote RAM (1 hop) hit            8558  N/A
  Remote Cache (1 hop) hit             3  N/A
  Remote Cache (1 hop) hit             2  HitM
  Remote Cache (2 hops) hit           10  HitM
  Remote Cache (2 hops) hit            6  N/A
  Uncached hit                         4  N/A
  $

[1]: https://lore.kernel.org/lkml/[email protected]

Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf mem/c2c: Set PERF_SAMPLE_WEIGHT for LOAD_STORE events | Ravi Bangoria | 3 | -0/+3
Currently perf sets PERF_SAMPLE_WEIGHT flag only for mem load events. Set it for combined load-store event as well which will enable recording of load latency by default on arch that does not support independent mem load event. Also document missing -W in perf-record man page. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO} | Ravi Bangoria | 1 | -0/+2
Add support for printing these new fields in perf mem report. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf amd ibs: Sync arch/x86/include/asm/amd-ibs.h header with the kernel | Ravi Bangoria | 1 | -0/+16
Although the new details added to this header are currently used by the kernel only, the tools copy needs to be in sync with the kernel file to avoid tools/perf/check-headers.sh warnings. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | tools headers UAPI: Sync include/uapi/linux/perf_event.h header with the kernel | Ravi Bangoria | 1 | -1/+3
Two new fields for mem_lvl_num have been introduced: PERF_MEM_LVLNUM_IO and PERF_MEM_LVLNUM_CXL, which are required to support perf mem/c2c on AMD platforms. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux | Linus Torvalds | 32 | -168/+1555
Pull arm64 updates from Catalin Marinas:

- arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation.
- SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously).
- More conversions to automatic system registers generation.
- vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it.
- arm64 stacktrace cleanups and improvements.
- arm64 atomics improvements: always inline assembly, remove LL/SC trampolines.
- Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting.
- Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result.
- arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm.
- arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions).
- arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function.
- kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups.
- arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap.
- Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses.
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits)
  arm64: alternatives: Use vdso/bits.h instead of linux/bits.h
  arm64/kprobe: Optimize the performance of patching single-step slot
  arm64: defconfig: Add Coresight as module
  kselftest/arm64: Handle EINTR while reading data from children
  kselftest/arm64: Flag fp-stress as exiting when we begin finishing up
  kselftest/arm64: Don't repeat termination handler for fp-stress
  ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs
  arm64/mm: fold check for KFENCE into can_set_direct_map()
  arm64: ftrace: fix module PLTs with mcount
  arm64: module: Remove unused plt_entry_is_initialized()
  arm64: module: Make plt_equals_entry() static
  arm64: fix the build with binutils 2.27
  kselftest/arm64: Don't enable v8.5 for MTE selftest builds
  arm64: uaccess: simplify uaccess_mask_ptr()
  arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header
  kselftest/arm64: Fix typo in hwcap check
  arm64: mte: move register initialization to C
  arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate()
  arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()
  arm64/sve: Add Perf extensions documentation
  ...
2022-10-06 | perf stat: Fix cpu check to use id.cpu.cpu in aggr_printout() | Athira Rajeev | 1 | -2/+2
'perf stat' has options to aggregate the counts in different modes, such as per socket and per core. The function "aggr_printout" in util/stat-display.c, which is used to print the aggregates, has a check for cpu in the AGGR_NONE case. This check originally used the condition "if (id.cpu.cpu > -1)". That changed with commit df936cadfb58 ("perf stat: Add JSON output option"), which added an option to output JSON format for the different aggregation modes. After that commit, the check in "aggr_printout" uses "if (id.core > -1)". Since the value printed is id.cpu.cpu, fix this check to use cpu and not core. Suggested-by: Ian Rogers <[email protected]> Suggested-by: James Clark <[email protected]> Signed-off-by: Athira Rajeev <[email protected]> Tested-by: Ian Rogers <[email protected]> Cc: Disha Goel <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nageswara R Sastry <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
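A reduced C sketch of the point made above: the guard should test the same field that is printed. The struct and function here are simplified, hypothetical stand-ins (only the field names `core` and `cpu.cpu` come from the commit message); the real aggregation id in perf has more members and the real function prints more than a CPU prefix.

```c
#include <stddef.h>
#include <stdio.h>

/* Simplified, hypothetical mirror of the fields named in the commit message. */
struct cpu_ref {
    int cpu;
};

struct aggr_id_sketch {
    int core;            /* -1 when not set */
    struct cpu_ref cpu;  /* cpu.cpu is -1 when not set */
};

/*
 * Write a "CPU<n>" prefix into buf only when the CPU number itself is valid.
 * Guarding on id.core instead would test a different field from the one
 * being printed, which is the mismatch the commit describes.
 * Returns the number of characters written (0 if nothing was printed).
 */
int format_cpu_prefix(struct aggr_id_sketch id, char *buf, size_t len)
{
    if (id.cpu.cpu > -1)
        return snprintf(buf, len, "CPU%d", id.cpu.cpu);

    if (len > 0)
        buf[0] = '\0';
    return 0;
}
```

In this sketch an id with a valid `cpu.cpu` but an unset `core` still gets its prefix, while an id with a valid `core` but no CPU number prints nothing.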
2022-10-06 | perf test coresight: Add relevant documentation about ARM64 CoreSight testing | Carsten Haitzler | 1 | -0/+5
Add/improve documentation helping people get started with CoreSight and perf as well as describe the testing and how it works. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add git ignore for tmp and output files of ARM CoreSight tests | Carsten Haitzler | 1 | -0/+2
Ignore other output files of the new CoreSight tests so they don't fill git status with noise we don't need or want. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add unroll thread test shell script | Carsten Haitzler | 1 | -0/+18
This adds scripts to drive the unroll thread tests to compare perf output against a minimum bar of content/quality. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add unroll thread test tool | Carsten Haitzler | 4 | -1/+110
Add test tool to be driven by further test scripts. This is a simple C based test that is for arm64 with some inline ASM to manually unroll a lot of code to have a very long sequence of commands. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add thread loop test shell scripts | Carsten Haitzler | 2 | -0/+38
Add a script to drive the thread loop test that gathers data so it passes a minimum bar (in this case do we get any perf context data for every thread). Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add thread loop test tool | Carsten Haitzler | 4 | -1/+122
Add test tool to be driven by further test scripts. This is a simple C based loop with threads test to drive from scripts that can output TIDs for tracking/checking. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add memcpy thread test shell script | Carsten Haitzler | 1 | -0/+18
Add a script to drive the threaded memcpy test that gathers data so it passes a minimum bar for amount and quality of content that we extract from the kernel's perf support. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test coresight: Add memcpy thread test tool | Carsten Haitzler | 4 | -1/+115
Add test tool to be driven by further test scripts. This is a simple C based memcpy with threads test to drive from scripts. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add git ignore for perf data generated by the ARM CoreSight tests | Carsten Haitzler | 1 | -2/+2
Ignore perf output data files generated by perf tests for cleaner git status. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add arm64 asm pureloop test shell script | Carsten Haitzler | 2 | -1/+20
Add a script to drive the asm pureloop test for arm64/CoreSight that gathers data so it passes a minimum bar for amount and quality of content that we extract from the kernel's perf support. Committer notes: Add the install of tests/shell/coresight/*.sh to tools/perf/Makefile.perf as we're starting to populate that dir. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add asm pureloop test tool | Carsten Haitzler | 4 | -1/+65
Add test tool to be driven by further test scripts. This tool is pure arm64 ASM with no libc usage to ensure it is the same exact binary/code every time so it can also be re-used for many uses. It just loops for a given fixed number of loops. Reviewed-by: James Clark <[email protected]> Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add build infra for perf test tools for ARM CoreSight tests | Carsten Haitzler | 4 | -3/+55
This adds the initial build infrastructure (makefiles, maintainers information) for adding follow-on tests for CoreSight.

Committer notes:

Remove the installation of tests/shell/coresight/*.sh, as there are no files there yet and thus, at this point, make install fails.

Use $(QUIET_CLEAN) to avoid having extraneous output in the 'make clean' output. Also use @$(MAKE) in tools/perf/tests/shell/coresight/Makefile as $(Q) is not turning into @ when V=1 isn't used, i.e. in the default case it is not being quiet.

The >/dev/null in the 'all' target of tools/perf/tests/shell/coresight/Makefile is to avoid this:

  make[4]: Nothing to be done for 'all'.
  make[4]: Nothing to be done for 'all'.
  make[4]: Nothing to be done for 'all'.
  DESCEND plugins
  GEN /tmp/build/perf/python/perf.so
  make[4]: Nothing to be done for 'all'.
  INSTALL trace_plugins

This happens on !arm64, where nothing is done on the main target for tools/perf/tests/shell/coresight/*/Makefile.

Signed-off-by: Carsten Haitzler <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Add CoreSight shell lib shared code for future tests | Carsten Haitzler | 1 | -0/+132
This adds a library of shell "code" to be shared and used by future tests that target quality testing for Arm CoreSight support in perf and the Linux kernel. Signed-off-by: Carsten Haitzler <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mathieu Poirier <[email protected]> Cc: Mike Leach <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf test: Introduce script for data symbol testing | Leo Yan | 1 | -0/+93
The test is designed with a data structure with 64-byte alignment, it has two fields "data1" and "data2", and other fields are reserved.

Using the "perf mem" command, we can record and report memory samples for a self-contained workload with 1 second duration. If no samples are obtained for the data structure "buf1", it reports failure; and by checking the offset in structure "buf1", if the memory samples aren't for the "data1" and "data2" fields, it means wrong data symbol parsing and returns failure.

Committer testing:

  [root@quaco ~]# grep -m1 "model name" /proc/cpuinfo
  model name : Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz
  [root@quaco ~]#
  [root@quaco ~]# perf test -v "data symbol"
  104: Test data symbol :
  --- start ---
  test child forked, pid 192318
  Compiling test program...
  Recording workload...
  [ perf record: Woken up 2 times to write data ]
  [ perf record: Captured and wrote 0.389 MB /tmp/__perf_test.perf.data.LIuQl (5570 samples) ]
  Cleaning up files...
  test child finished with 0
  ---- end ----
  Test data symbol: Ok
  [root@quaco ~]#

Signed-off-by: Leo Yan <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
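The kind of workload described above might look like the following C sketch. Only the names "buf1", "data1" and "data2" and the 64-byte alignment come from the commit message; the size of the reserved area, the helper names and the use of the GCC/Clang `aligned` attribute are assumptions made for illustration, and the actual test program may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * A 64-byte aligned structure with two fields of interest. The size of the
 * reserved gap (55 bytes) is an assumption for this sketch.
 */
struct buf {
    unsigned char data1;
    unsigned char reserved[55];
    unsigned char data2;
} __attribute__((aligned(64)));

static volatile struct buf buf1;

/* Touch only data1 and data2 so memory samples should land on those fields. */
void run_workload(unsigned long iterations)
{
    for (unsigned long i = 0; i < iterations; i++) {
        buf1.data1++;
        buf1.data2 += buf1.data1;
    }
}

/*
 * Hypothetical checker: a sampled offset into buf1 is acceptable only if it
 * points at data1 or data2. Any other offset would suggest that the data
 * symbol was resolved incorrectly.
 */
bool offset_is_expected(size_t off)
{
    return off == offsetof(struct buf, data1) ||
           off == offsetof(struct buf, data2);
}
```

Under these assumptions the two fields sit at offsets 0 and 56 of a 64-byte object, so a report that attributes samples to any other offset in "buf1" points to a parsing problem rather than to the workload.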
2022-10-06 | perf record: Save DSO build-ID for synthesizing | Namhyung Kim | 1 | -3/+22
When synthesizing MMAP2 with build-id, it'd read the same file repeatedly as it has no idea if it's done already. Maintain a dsos to check that and skip the file access if possible. Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
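The caching idea (remember which files were already read and skip the file access on repeat lookups) can be illustrated with a small C sketch. Everything named here is hypothetical: `struct dso_cache`, `cached_build_id()` and `fake_reader()` are not perf APIs, the linear search is chosen only for brevity, and the fake reader merely derives bytes from the path instead of parsing an ELF note.

```c
#include <string.h>

#define MAX_DSOS     64
#define BUILD_ID_LEN 20

/* One remembered file: its path and the build-id read from it. */
struct dso_entry {
    char path[256];
    unsigned char build_id[BUILD_ID_LEN];
};

/* Hypothetical cache; file_reads counts how often the reader was invoked. */
struct dso_cache {
    struct dso_entry entries[MAX_DSOS];
    int nr;
    int file_reads;
};

typedef int (*read_build_id_fn)(const char *path, unsigned char *id);

/* Stand-in for the expensive file access: derives fake bytes from the path. */
int fake_reader(const char *path, unsigned char *id)
{
    size_t len = strlen(path);

    if (len == 0)
        return -1;
    for (size_t i = 0; i < BUILD_ID_LEN; i++)
        id[i] = (unsigned char)(path[i % len] + i);
    return 0;
}

/* Return the build-id for path, calling the reader only on a cache miss. */
int cached_build_id(struct dso_cache *c, const char *path,
                    read_build_id_fn reader, unsigned char *out)
{
    for (int i = 0; i < c->nr; i++) {
        if (strcmp(c->entries[i].path, path) == 0) {
            memcpy(out, c->entries[i].build_id, BUILD_ID_LEN);
            return 0;   /* hit: no file access needed */
        }
    }

    if (reader(path, out) != 0)
        return -1;
    c->file_reads++;

    /* Remember the result if there is room and the path fits. */
    if (c->nr < MAX_DSOS && strlen(path) < sizeof(c->entries[0].path)) {
        strcpy(c->entries[c->nr].path, path);
        memcpy(c->entries[c->nr].build_id, out, BUILD_ID_LEN);
        c->nr++;
    }
    return 0;
}
```

Looking up the same path twice triggers a single read in this sketch, which is the saving the commit message describes for repeated MMAP2 synthesis of the same file.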
2022-10-06 | perf stat: Rename to aggr_cpu_id.thread_idx | Namhyung Kim | 4 | -16/+16
The aggr_cpu_id has a thread value but it's actually an index to the thread_map. To reduce possible confusion, rename it to thread_idx. Suggested-by: Ian Rogers <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Don't compare runtime stat for shadow stats | Namhyung Kim | 1 | -12/+0
Now it always uses the global rt_stat. Let's get rid of the field from the saved_value. When the both evsels are NULL, it'd return 0 so remove the block in the saved_value_cmp. Reviewed-by: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Kill unused per-thread runtime stats | Namhyung Kim | 2 | -56/+0
Now it's using the global rt_stat, so there is no need to use per-thread stats. Let's get rid of them. Reviewed-by: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Use thread map index for shadow stat | Namhyung Kim | 3 | -18/+12
When AGGR_THREAD is active, it aggregates the values for each thread. Previously it used cpu map index which is invalid for AGGR_THREAD so it had to use separate runtime stats with index 0. But it can just use the rt_stat with thread_map_index. Rename the first_shadow_map_idx() and make it return the thread index. Reviewed-by: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Rename saved_value->cpu_map_idx | Namhyung Kim | 2 | -157/+157
The cpu_map_idx field is just to differentiate values from other entries. It doesn't need to be strictly a cpu map index. Actually we can pass a thread map index or an aggr map index. So rename the field first. No functional change intended. Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Don't call perf_stat_evsel_id_init() repeatedly | Namhyung Kim | 1 | -1/+1
evsel__reset_stat_priv() is called more than once if user gave -r option for multiple runs. But it doesn't need to re-initialize the id. Reviewed-by: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf stat: Convert perf_stat_evsel.res_stats array | Namhyung Kim | 3 | -9/+5
It uses only one member, no need to have it as an array. Reviewed-by: James Clark <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf tools: Remove special handling of system-wide evsel | Namhyung Kim | 4 | -19/+2
For system-wide evsels, the thread map should be dummy - i.e. it has a single entry of -1. But the code guarantees such a thread map, so no need to handle it specially. No functional change intended. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf tools: Add evlist__add_sched_switch() | Namhyung Kim | 4 | -20/+28
Add a helper to create a system-wide sched_switch event. One merit is that it sets the system-wide bit before adding it to the evlist so that libperf can handle the cpu and thread maps correctly. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06 | perf tools: Get rid of evlist__add_on_all_cpus() | Namhyung Kim | 1 | -27/+2
The cpu and thread maps are properly handled in libperf now. No need to do it in the perf tools anymore. Let's remove the logic. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06libperf: Propagate maps only if necessaryNamhyung Kim2-7/+5
The current code propagates an evsel's cpu map settings to the evlist when it's added to an evlist. But the evlist->all_cpus and each evsel's cpus will be updated in perf_evlist__set_maps() later. There is no need to do it before the evlist's cpus are actually set. In fact it discards this intermediate all_cpus map at the beginning of perf_evlist__set_maps(). Let's not do this. It's only needed when an evsel is added after the evlist cpu/thread maps are set. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
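The ordering described in this commit message can be pictured with a small toy model. This is only a sketch based on the message text, not the libperf implementation: the `Evlist` class, its dict-based evsels, and the `propagations` counter are all hypothetical names introduced for illustration.

```python
class Evlist:
    """Toy model of an event list; NOT the libperf data structure."""

    def __init__(self):
        self.evsels = []
        self.cpus = None        # stays None until set_maps() is called
        self.propagations = 0   # how many times maps were pushed to an evsel

    def _propagate(self, evsel):
        # Copy the evlist-level cpu map down to a single evsel.
        evsel["cpus"] = list(self.cpus)
        self.propagations += 1

    def add(self, evsel):
        self.evsels.append(evsel)
        # Propagate only if the evlist maps already exist. Otherwise
        # set_maps() will run later and redo the work anyway.
        if self.cpus is not None:
            self._propagate(evsel)

    def set_maps(self, cpus):
        self.cpus = list(cpus)
        for evsel in self.evsels:
            self._propagate(evsel)


evlist = Evlist()
early = {"name": "cycles", "cpus": None}
evlist.add(early)            # maps not set yet: nothing is propagated
evlist.set_maps([0, 1])      # now every existing evsel gets the maps
late = {"name": "instructions", "cpus": None}
evlist.add(late)             # maps already set: propagate immediately
```

In this model the early evsel is touched once (by `set_maps()`), while the late evsel is handled at add time, which matches the "only needed when an evsel is added after the maps are set" case.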
2022-10-06libperf: Populate system-wide evsel mapsNamhyung Kim1-6/+9
Set proper cpu and thread maps for system-wide evsels in __perf_evlist__propagate_maps(), regardless of the user-requested cpus. Those evsels always need to be active on all cpus. Do it in libperf so that we can guarantee it has proper maps. Reviewed-by: Adrian Hunter <[email protected]> Signed-off-by: Namhyung Kim <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
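As a rough illustration of the rule in this message, the sketch below models the decision in plain Python. It is an assumption-laden simplification, not the body of __perf_evlist__propagate_maps(): the function name `propagate_maps`, the dict layout, and the `online_cpus` parameter are hypothetical. The single `-1` entry for the dummy thread map is taken from the "Remove special handling of system-wide evsel" commit above.

```python
DUMMY_THREAD = -1  # a dummy thread map holds a single entry of -1


def propagate_maps(evsel, user_cpus, user_threads, online_cpus):
    """Assign cpu/thread maps to one evsel (illustrative model only)."""
    if evsel["system_wide"]:
        # System-wide evsels ignore the user-requested cpus: they must be
        # active on every online cpu and use the dummy thread map.
        evsel["cpus"] = list(online_cpus)
        evsel["threads"] = [DUMMY_THREAD]
    else:
        # Ordinary evsels follow what the user asked for.
        evsel["cpus"] = list(user_cpus)
        evsel["threads"] = list(user_threads)
    return evsel


online = [0, 1, 2, 3]
sched_switch = propagate_maps({"system_wide": True}, [1], [4242], online)
cycles = propagate_maps({"system_wide": False}, [1], [4242], online)
```

Here `sched_switch` ends up on all four cpus with the dummy thread map, while `cycles` stays restricted to cpu 1 and thread 4242 (both made-up values).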
2022-10-06perf vendor events: Update Intel broadwelldeIan Rogers1-134/+577
Events remain at v23, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Switch for core metrics from BDX to BDW. - Switch for Page_Walks_Utilization to BDX version. - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel tigerlakeIan Rogers1-48/+762
Events remain at v1.07, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
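The naming convention described in the update list (the `tma_` prefix with lower-case names, and a `_group` suffix on the parent's name) can be sketched as below. This is a guess at the convention as stated in the message, not the converter script's actual code: `tma_name`, `group_name`, and `group_children` are hypothetical helpers, and the child metrics `Fetch_Latency` and `Fetch_Bandwidth` are used purely as illustrative inputs.

```python
def tma_name(old_name):
    """Map an old-style topdown metric name to the new tma_ style."""
    return "tma_" + old_name.lower()


def group_name(parent_metric):
    """Name of the group that holds the children of parent_metric."""
    return parent_metric + "_group"


def group_children(parent_of):
    """Build {group name: sorted child metrics} from {child: parent}.

    Keys and values of parent_of are old-style names (assumed input shape).
    """
    groups = {}
    for child, parent in parent_of.items():
        key = group_name(tma_name(parent))
        groups.setdefault(key, []).append(tma_name(child))
    return {key: sorted(children) for key, children in groups.items()}


# Illustrative input only; the real hierarchy comes from the TMA data files.
example = {
    "Fetch_Latency": "Frontend_Bound",
    "Fetch_Bandwidth": "Frontend_Bound",
}
groups = group_children(example)
```

Under these assumptions, the children of `tma_frontend_bound` are collected under the single group name `tma_frontend_bound_group`, which is the name a user would pass when asking for all children of that metric.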
2022-10-06perf vendor events: Update Intel skylakeIan Rogers1-182/+679
Events remain at v53, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update silvermont cpuidsIan Rogers1-1/+1
Add cpuid that was added to https://download.01.org/perfmon/mapfile.csv Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel sapphirerapidsIan Rogers5-353/+917
Events are updated to v1.06, and the core metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Latest metrics from: https://github.com/intel/perfmon-metrics Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel sandybridgeIan Rogers1-75/+240
Events remain at v17, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel jaketownIan Rogers1-81/+246
Events remain at v21, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel ivytownIan Rogers10-189/+625
Events are updated to v22, and the core metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel ivybridgeIan Rogers1-91/+503
Events remain at v22, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel icelakexIan Rogers5-334/+833
Events are updated to v1.16, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Latest metrics from: https://github.com/intel/perfmon-metrics Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel icelakeIan Rogers4-52/+766
Events are updated to v1.15, the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel haswellxIan Rogers6-356/+615
Events are updated to v26, the core metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Uncore event updates by Zhengjun Xing <[email protected]>. - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Previously metrics involving topdown events were dropped. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. 
- Latest metrics from: https://github.com/intel/perfmon-metrics Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel haswellIan Rogers4-90/+498
Events are updated to v32, the core metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] 
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update elkhartlake cpuidsIan Rogers1-1/+1
Add cpuid that was added to https://download.01.org/perfmon/mapfile.csv Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf vendor events: Update Intel cascadelakexIan Rogers3-526/+787
Events remain at v1.16, and the metrics are based on TMA 4.4 full. Use script at: https://github.com/intel/event-converter-for-linux-perf/blob/master/download_and_gen.py with updates at: https://github.com/captain5050/event-converter-for-linux-perf Updates include: - Removal of ScaleUnit from uncore events by Zhengjun Xing <[email protected]>. - Rename of topdown TMA metrics from Frontend_Bound to tma_frontend_bound. - _SMT suffix metrics are dropped as the #SMT_On and #EBS_Mode are correctly expanded in the single main metric. - Addition of all 6 levels of TMA metrics. Child metrics are placed in a group named after their parent allowing children of a metric to be easily measured using the metric name with a _group suffix. - ## and ##? operators are correctly expanded. - The locate-with column is added to the long description describing a sampling event. - Metrics are written in terms of other metrics to reduce the expression size and increase readability. - Latest metrics from: https://github.com/intel/perfmon-metrics Tested with 'perf test': 10: PMU events : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok Signed-off-by: Ian Rogers <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kshipra Bopardikar <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Miaoqian Lin <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>