aboutsummaryrefslogtreecommitdiff
path: root/tools/perf/util/mem-events.c
AgeCommit message (Collapse)AuthorFilesLines
2024-09-06perf mem: Fix missed p-core mem events on ADL and RPLKan Liang1-2/+4
The p-core mem events are missed when launching 'perf mem record' on ADL and RPL. root@number:~# perf mem record sleep 1 Memory events are enabled on a subset of CPUs: 16-27 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB perf.data ] root@number:~# perf evlist cpu_atom/mem-loads,ldlat=30/P cpu_atom/mem-stores/P dummy:u A variable 'record' in the 'struct perf_mem_event' is to indicate whether a mem event in a mem_events[] should be recorded. The current code only configure the variable for the first eligible PMU. It's good enough for a non-hybrid machine or a hybrid machine which has the same mem_events[]. However, if a different mem_events[] is used for different PMUs on a hybrid machine, e.g., ADL or RPL, the 'record' for the second PMU never get a chance to be set. The mem_events[] of the second PMU are always ignored. 'perf mem' doesn't support the per-PMU configuration now. A per-PMU mem_events[] 'record' variable doesn't make sense. Make it global. That could also avoid searching for the per-PMU mem_events[] via perf_pmu__mem_events_ptr every time. Committer testing: root@number:~# perf evlist -g cpu_atom/mem-loads,ldlat=30/P cpu_atom/mem-stores/P {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/} cpu_core/mem-stores/P dummy:u root@number:~# The :S for '{cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}' is not being added by 'perf evlist -g', to be checked. Fixes: abbdd79b786e036e ("perf mem: Clean up perf_mem_events__name()") Reported-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Kan Liang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Closes: https://lore.kernel.org/lkml/Zthu81fA3kLC2CS2@x1/ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-09-06perf mem: Check mem_events for all eligible PMUsKan Liang1-1/+13
The current perf_pmu__mem_events_init() only checks the availability of the mem_events for the first eligible PMU. It works for non-hybrid machines and hybrid machines that have the same mem_events. However, it may bring issues if a hybrid machine has a different mem_events on different PMU, e.g., Alder Lake and Raptor Lake. A mem-loads-aux event is only required for the p-core. The mem_events on both e-core and p-core should be checked and marked. The issue was not found, because it's hidden by another bug, which only records the mem-events for the e-core. The wrong check for the p-core events didn't yell. Fixes: abbdd79b786e036e ("perf mem: Clean up perf_mem_events__name()") Signed-off-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-07-12perf mem: Warn if memory events are not supported on all CPUsLeo Yan1-0/+14
It is possible that memory events are not supported on all CPUs. Prints a warning by dumping the enabled CPU maps in this case. Signed-off-by: Leo Yan <[email protected]> Reviewed-by: James Clark <[email protected]> Cc: Suzuki K Poulose <[email protected]> Cc: Will Deacon <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kajol Jain <[email protected]> Cc: John Garry <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-06-25perf mem: Fix a segfault with NULL event->nameNamhyung Kim1-1/+1
Guilherme reported a crash in perf mem record. It's because the perf_mem_event->name was NULL on his machine. It should just return a NULL string when it has no format string in the name. The backtrace at the crash is below: Program received signal SIGSEGV, Segmentation fault. __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67 67 vmovdqu (%rdi), %ymm2 (gdb) bt #0 __strchrnul_avx2 () at ../sysdeps/x86_64/multiarch/strchr-avx2.S:67 #1 0x00007ffff6c982de in __find_specmb (format=0x0) at printf-parse.h:82 #2 __printf_buffer (buf=buf@entry=0x7fffffffc760, format=format@entry=0x0, ap=ap@entry=0x7fffffffc880, mode_flags=mode_flags@entry=0) at vfprintf-internal.c:649 #3 0x00007ffff6cb7840 in __vsnprintf_internal (string=<optimized out>, maxlen=<optimized out>, format=0x0, args=0x7fffffffc880, mode_flags=mode_flags@entry=0) at vsnprintf.c:96 #4 0x00007ffff6cb787f in ___vsnprintf (string=<optimized out>, maxlen=<optimized out>, format=<optimized out>, args=<optimized out>) at vsnprintf.c:103 #5 0x00005555557b9391 in scnprintf (buf=0x555555fe9320 <mem_loads_name> "", size=100, fmt=0x0) at ../lib/vsprintf.c:21 #6 0x00005555557b74c3 in perf_pmu__mem_events_name (i=0, pmu=0x555556832180) at util/mem-events.c:106 #7 0x00005555557b7ab9 in perf_mem_events__record_args (rec_argv=0x55555684c000, argv_nr=0x7fffffffca20) at util/mem-events.c:252 #8 0x00005555555e370d in __cmd_record (argc=3, argv=0x7fffffffd760, mem=0x7fffffffcd80) at builtin-mem.c:156 #9 0x00005555555e49c4 in cmd_mem (argc=4, argv=0x7fffffffd760) at builtin-mem.c:514 #10 0x000055555569716c in run_builtin (p=0x555555fcde80 <commands+672>, argc=8, argv=0x7fffffffd760) at perf.c:349 #11 0x0000555555697402 in handle_internal_command (argc=8, argv=0x7fffffffd760) at perf.c:402 #12 0x0000555555697560 in run_argv (argcp=0x7fffffffd59c, argv=0x7fffffffd590) at perf.c:446 #13 0x00005555556978a6 in main (argc=8, argv=0x7fffffffd760) at perf.c:562 Reported-by: Guilherme Amadio <[email protected]> Reviewed-by: Kan Liang <[email protected]> Closes: https://lore.kernel.org/linux-perf-users/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2024-05-07perf mem-info: Add reference count checkingIan Rogers1-10/+10
Add reference count checking and switch 'struct mem_info' usage to use accessor functions. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-05-07perf mem-info: Move mem-info out of mem-events and symbolIan Rogers1-7/+9
Move mem-info to its own header rather than having it split between mem-events and symbol. Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2024-01-24perf mem: Clean up perf_pmus__num_mem_pmus()Kan Liang1-0/+14
The number of mem PMUs can be calculated by searching the perf_pmus__scan_mem(). Remove the ARCH specific perf_pmus__num_mem_pmus() Tested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Clean up perf_mem_events__record_args()Kan Liang1-22/+12
The current code iterates all memory PMUs. It doesn't matter if the system has only one memory PMU or multiple PMUs. The check of perf_pmus__num_mem_pmus() is not required anymore. The rec_tmp is not used in c2c and mem. Removing them as well. Suggested-by: Leo Yan <[email protected]> Tested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Reviewed-by: Ian Rogers <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Clean up is_mem_loads_aux_event()Kan Liang1-2/+12
The aux_event can be retrieved from the perf_pmu now. Implement a generic support. Reviewed-by: Ian Rogers <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Clean up perf_mem_event__supported()Kan Liang1-10/+12
For some ARCHs, e.g., ARM and AMD, to get the availability of the mem-events, perf checks the existence of a specific PMU. For the other ARCHs, e.g., Intel and Power, perf has to check the existence of some specific events. The current perf only iterates the mem-events-supported PMUs. It's not required to check the existence of a specific PMU anymore. Rename sysfs_name to event_name, which stores the specific mem-events. Perf only needs to check those events for the availability of the mem-events. Rename perf_mem_event__supported to perf_pmu__mem_events_supported. Reviewed-by: Ian Rogers <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Clean up perf_mem_events__name()Kan Liang1-17/+43
Introduce a generic perf_mem_events__name(). Remove the ARCH-specific one. The mem_load events may have a different format. Add ldlat and aux_event in the struct perf_mem_event to indicate the format and the extra aux event. Add perf_mem_events_intel_aux[] to support the extra mem_load_aux event. Rename perf_mem_events__name to perf_pmu__mem_events_name. Reviewed-by: Ian Rogers <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Clean up perf_mem_events__ptr()Kan Liang1-53/+50
The mem_events can be retrieved from the struct perf_pmu now. An ARCH specific perf_mem_events__ptr() is not required anymore. Remove all of them. The Intel hybrid has multiple mem-events-supported PMUs. But they share the same mem_events. Other ARCHs only support one mem-events-supported PMU. In the configuration, it's good enough to only configure the mem_events for one PMU. Add perf_mem_events_find_pmu() which returns the first mem-events-supported PMU. In the perf_mem_events__init(), the perf_pmus__scan() is not required anymore. It avoids checking the sysfs for every PMU on the system. Make the perf_mem_events__record_args() more generic. Remove the perf_mem_events__print_unsupport_hybrid(). Since pmu is added as a new parameter, rename perf_mem_events__ptr() to perf_pmu__mem_events_ptr(). Several other functions also do a similar rename. Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Kajol jain <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-01-24perf mem: Add mem_events into the supported perf_pmuKan Liang1-1/+1
With the mem_events, perf doesn't need to read sysfs for each PMU to find the mem-events-supported PMU. The patch also makes it possible to clean up the related __weak functions later. The patch is only to add the mem_events into the perf_pmu for all ARCHs. It will be used in the later cleanup patches. Reviewed-by: Ian Rogers <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Tested-by: Ravi Bangoria <[email protected]> Tested-by: Leo Yan <[email protected]> Tested-by: Kajol Jain <[email protected]> Suggested-by: Leo Yan <[email protected]> Signed-off-by: Kan Liang <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-12-05perf mem: Fix error on hybrid related to availability of mem event in a PMUKan Liang1-11/+14
The below error can be triggered on a hybrid machine. $ perf mem record -t load sleep 1 event syntax error: 'breakpoint/mem-loads,ldlat=30/P' \___ Bad event or PMU Unable to find PMU or event on a PMU of 'breakpoint' In the perf_mem_events__record_args(), the current perf never checks the availability of a mem event on a given PMU. All the PMUs will be added to the perf mem event list. Perf errors out for the unsupported PMU. Extend perf_mem_event__supported() and take a PMU into account. Check the mem event for each PMU before adding it to the perf mem event list. Optimize the perf_mem_events__init() a little bit. The function is to check whether the mem events are supported in the system. It doesn't need to scan all PMUs. Just return with the first supported PMU is good enough. Fixes: 5752c20f3787c9bc ("perf mem: Scan all PMUs instead of just core ones") Reported-by: Ammy Yi <[email protected]> Signed-off-by: Kan Liang <[email protected]> Tested-by: Ammy Yi <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-10-12perf mem-events: Avoid uninitialized readIan Rogers1-1/+2
pmu should be initialized to NULL before perf_pmus__scan loop. Fix and shrink the scope of pmu at the same time. Issue detected by clang-tidy. Fixes: 5752c20f3787 ("perf mem: Scan all PMUs instead of just core ones") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Nick Desaulniers <[email protected]> Cc: Yang Jihong <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Nathan Chancellor <[email protected]> Cc: Kan Liang <[email protected]> Cc: [email protected] Cc: Ming Wang <[email protected]> Cc: Tom Rix <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2023-08-25perf pmu: Remove logic for PMU name being NULLIan Rogers1-8/+8
The PMU name could be NULL in the case of the fake_pmu. Initialize the name for the fake_pmu to "fake" so that all other logic can assume it is initialized. Add a const to the type of name so that a literal can be used to avoid additional initialization code. Propagate the cost through related routines and remove now unnecessary "(char *)" casts. Doing this located a bug in builtin-list for the pmu_glob that was missing a strdup. Signed-off-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/[email protected] Cc: K Prateek Nayak <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: James Clark <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Wei Li <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Will Deacon <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mike Leach <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Kan Liang <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: [email protected] Cc: Ming Wang <[email protected]> Cc: John Garry <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-06-16perf mem: Scan all PMUs instead of just core onesRavi Bangoria1-4/+9
Scanning only core PMUs is not sufficient on platforms like AMD since perf mem on AMD uses IBS OP PMU, which is independent of core PMU. Scan all PMUs instead of just core PMUs. There should be negligible performance overhead because of scanning all PMUs, so we should be okay. Reviewed-by: Ian Rogers <[email protected]> Signed-off-by: Ravi Bangoria <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-27perf pmus: Remove perf_pmus__has_hybridIan Rogers1-13/+5
perf_pmus__has_hybrid was used to detect when there was >1 core PMU, this can be achieved with perf_pmus__num_core_pmus that doesn't depend upon is_pmu_hybrid and PMU name comparisons. When modifying the function calls take the opportunity to improve comments, enable/simplify tests that were previously failing for hybrid but now pass and to simplify generic code. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-27perf pmus: Allow just core PMU scanningIan Rogers1-11/+3
Scanning all PMUs is expensive as all PMUs sysfs entries are loaded, benchmarking shows more than 4x the cost: ``` $ perf bench internals pmu-scan -i 1000 Computing performance of sysfs PMU event scan for 1000 times Average core PMU scanning took: 989.231 usec (+- 1.535 usec) Average PMU scanning took: 4309.425 usec (+- 74.322 usec) ``` Add new perf_pmus__scan_core routine that scans just core PMUs. Replace perf_pmus__scan calls with perf_pmus__scan_core when non-core PMUs are being ignored. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-27perf pmu: Separate pmu and pmusIan Rogers1-5/+6
Separate and hide the pmus list in pmus.[ch]. Move pmus functionality out of pmu.[ch] into pmus.[ch] renaming pmus functions which were prefixed perf_pmu__ to perf_pmus__. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-27perf mem: Avoid hybrid PMU listIan Rogers1-6/+14
Add perf_pmu__num_mem_pmus that scans/counts the number of PMUs for mem events. Switch perf_pmu__for_each_hybrid_pmu to iterating all PMUs and only handling is_core ones. Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Dmitrii Dolgov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jing Zhang <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Ming Wang <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Sean Christopherson <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Will Deacon <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-10perf mem: Refactor perf_mem__lvl_scnprintf() to process 'union ↵Ravi Bangoria1-42/+47
perf_mem_data_src' more intuitively Interpretation of 'union perf_mem_data_src' by perf_mem__lvl_scnprintf() is non-intuitive. For ex, it ignores 'mem_lvl' when 'mem_hops' is set but considers it otherwise. It prints both 'mem_lvl_num' and 'mem_lvl' when 'mem_hops' is not set. Refactor this function such that it behaves more intuitively: Use new API 'mem_lvl_num'|'mem_remote'|'mem_hops' if 'mem_lvl_num' contains value other than PERF_MEM_LVLNUM_NA. Otherwise, fallback to old API 'mem_lvl'. Since new API has no way to indicate MISS, use it from old api, otherwise don't club old and new APIs while parsing as well as printing. Before: $ sudo ./perf mem report -F sample,mem --stdio # Samples Memory access # ............ ........................ # 250097 N/A 188907 L1 hit 4116 L2 hit 3496 Remote Cache (1 hop) hit 3271 Remote Cache (2 hops) hit 873 L3 hit 598 Local RAM hit 438 Remote RAM (1 hop) hit 1 Uncached hit After: $ sudo ./perf mem report -F sample,mem --stdio # Samples Memory access # ............ ....................................... # 255517 N/A 189989 L1 hit 4541 L2 hit 3363 Remote core, same node Any cache hit 3336 Remote node, same socket Any cache hit 1275 L3 hit 743 RAM hit 545 Remote node, same socket RAM hit 4 Uncached hit Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-04-10perf mem: Add support for printing PERF_MEM_LVLNUM_UNCRavi Bangoria1-0/+1
Add support for printing PERF_MEM_LVLNUM_UNC in perf mem report. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Leo Yan <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf mem: Print "LFB/MAB" for PERF_MEM_LVLNUM_LFBRavi Bangoria1-2/+2
A hw component to track outstanding L1 Data Cache misses is called LFB (Line Fill Buffer) on Intel and Arm. However similar component exists on other arch with different names, for ex, it's called MAB (Miss Address Buffer) on AMD. Use 'LFB/MAB' instead of just 'LFB'. Signed-off-by: Ravi Bangoria <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf mem/c2c: Avoid printing empty lines for unsupported eventsRavi Bangoria1-5/+6
The 'perf mem' and 'perf c2c' tools can be used with 3 different events: load, store and combined load-store. Some architectures might support only partial set of events in which case, perf prints an empty line for unsupported events. Avoid that. Ex, AMD Zen cpus supports only combined load-store event and does not support individual load and store event. Before patch: $ perf mem record -e list mem-ldst : available $ After patch: $ perf mem record -e list mem-ldst : available $ Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-10-06perf mem: Add support for printing PERF_MEM_LVLNUM_{CXL|IO}Ravi Bangoria1-0/+2
Add support for printing these new fields in perf mem report. Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Dave Hansen <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kan Liang <[email protected]> Cc: Kim Phillips <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Santosh Shukla <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-11perf mem: Add statistics for peer snoopingLeo Yan1-3/+25
Since the flag PERF_MEM_SNOOPX_PEER is added to support cache snooping from peer cache line, it can come from a peer core, a peer cluster, or a remote NUMA node. This patch adds statistics for the flag PERF_MEM_SNOOPX_PEER. Note, we take PERF_MEM_SNOOPX_PEER as an affiliated info, it needs to cooperate with cache level statistics. Therefore, we account the load operations for both the cache level's metrics (e.g. ld_l2hit, ld_llchit, etc.) and peer related metrics when flag PERF_MEM_SNOOPX_PEER is set. So three new metrics are introduced: 'lcl_peer' is for local cache access, the metric 'rmt_peer' is for remote access (includes remote DRAM and any caches in remote node), and the metric 'tot_peer' is accounting the sum value of 'lcl_peer' and 'rmt_peer'. Reviewed-by: Ali Saidi <[email protected]> Signed-off-by: Leo Yan <[email protected]> Tested-by: Ali Saidi <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: German Gomez <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Like Xu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Timothy Hayes <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-08-11perf mem: Print snoop peer flagLeo Yan1-3/+15
Since PERF_MEM_SNOOPX_PEER flag is a new snoop type, print this flag if it is set. Before: memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP N/A|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) After: memstress 3603 [020] 122.463754: 1 l1d-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 l1d-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-miss: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 llc-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 tlb-access: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) memstress 3603 [020] 122.463754: 1 memory: 8688000842 |OP LOAD|LVL L3 or L3 hit|SNP Peer|TLB Walker hit|LCK No|BLK N/A aaaac17c3e88 [unknown] (/home/ubuntu/memstress) Reviewed-by: Ali Saidi <[email protected]> Reviewed-by: Kajol Jain <[email protected]> Signed-off-by: Leo Yan <[email protected]> Tested-by: Ali Saidi <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: German Gomez <[email protected]> Cc: Gustavo A. R. Silva <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Like Xu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Timothy Hayes <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-05-23perf mem: Add stats for store operation with no available memory levelLeo Yan1-0/+3
Sometimes we don't know memory store operations happen on exactly which memory (or cache) level, the memory level flag is set to PERF_MEM_LVL_NA in this case; a practical example is Arm SPE AUX trace sets this flag for all store operations due to absent info for cache level. This patch is to add a new item "st_na" in structure c2c_stats to add statistics for store operations with no available cache level. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Adam Li <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Alyssa Ross <[email protected]> Cc: German Gomez <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Joe Mario <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: Li Huafei <[email protected]> Cc: Like Xu <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2022-04-18perf mem: Print memory operation typeLeo Yan1-1/+28
The memory operation types are not only for load and store, for easier reviewing the memory operation type, this patch prints out it. Before: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) After: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) Signed-off-by: Leo Yan <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ali Saidi <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Li Huafei <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-12-22tools headers UAPI: Add new macros for mem_hops field to perf_event.hKajol Jain1-11/+18
Add new macros for mem_hops field which can be used to represent remote-node, socket and board level details. Currently the code had macro for HOPS_0 which, corresponds to data coming from another core but same node. Add new macros for HOPS_1 to HOPS_3 to represent remote-node, socket and board level data. Also add corresponding strings in the mem_hops array to represent mem_hop field data in perf_mem__lvl_scnprintf function Incase mem_hops field is used, PERF_MEM_LVLNUM field also need to be set inorder to represent the data source. Hence printing data source via PERF_MEM_LVL field can be skip in that scenario. For ex: Encodings for mem_hops fields with L2 cache: L2 - local L2 L2 | REMOTE | HOPS_0 - remote core, same node L2 L2 | REMOTE | HOPS_1 - remote node, same socket L2 L2 | REMOTE | HOPS_2 - remote socket, same board L2 L2 | REMOTE | HOPS_3 - remote board L2 Signed-off-by: Kajol Jain <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Jajeev <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Nageswara R Sastry <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-10-19tools/perf: Add mem_hops field in perf_mem_data_src structureKajol Jain1-1/+18
Going forward, future generation systems can have more hierarchy within the node/package level but currently we don't have any data source encoding field in perf, which can be used to represent this level of data. Add a new field called 'mem_hops' in the perf_mem_data_src structure which can be used to represent intra-node/package or inter-node/off-package details. This field is of size 3 bits where PERF_MEM_HOPS_{NA, 0..6} value can be used to present different hop levels data. Also add corresponding macros to define mem_hop field values and shift value. Currently we define macro for HOPS_0 which corresponds to data coming from another core but same node. Add functionality to represent mem_hop field data in perf_mem__lvl_scnprintf function with the help of added string array called mem_hops. For ex: Encodings for mem_hops fields with L2 cache: L2 - local L2 L2 | REMOTE | HOPS_0 - remote core, same node L2 Since with the addition of HOPS field, now remote can be used to denote cache access from the same node but different core, a check is added in the c2c_decode_stats function to set mrem only when HOPS is zero along with set remote field. Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-10-19perf: Add comment about current state of PERF_MEM_LVL_* namespace and remove ↵Kajol Jain1-1/+0
an extra line Add a comment about PERF_MEM_LVL_* namespace being depricated to some extent in favour of added PERF_MEM_{LVLNUM_,REMOTE_,SNOOPX_} fields. Remove an extra line present in perf_mem__lvl_scnprintf function. Signed-off-by: Kajol Jain <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2021-06-16perf mem-events: Remove duplicate #undefLi Huafei1-2/+0
Remove duplicate '#undef E'. Signed-off-by: Li Huafei <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Zhang Jinhao <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-06-01perf mem: Support record for hybrid platformJin Yao1-0/+65
Support 'perf mem record' for hybrid platform. On hybrid platform, such as Alderlake, when executing 'perf mem record', it actually calls: record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P -e cpu_atom/mem-loads,ldlat=30/P -e cpu_core/mem-stores/P -e cpu_atom/mem-stores/P Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-06-01perf tools: Check if mem_events is supported for hybrid platformJin Yao1-6/+26
Check if the mem_events ('mem-loads' and 'mem-stores') exist in the sysfs path. For Alderlake, the hybrid cpu pmu are "cpu_core" and "cpu_atom". Check the existing of following paths: /sys/devices/cpu_atom/events/mem-loads /sys/devices/cpu_atom/events/mem-stores /sys/devices/cpu_core/events/mem-loads /sys/devices/cpu_core/events/mem-stores If the patch exists, the mem_event is supported. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-06-01perf tools: Support pmu prefix for mem-load eventJin Yao1-2/+2
The perf_mem_events__name() can generate the mem-load event name. It uses a variable 'mem_loads_name__init' to avoid generating the event name every time (because perf_pmu__scan takes some time). The perf_mem_events__name() assumes the pmu is "cpu" but it's not correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or "cpu_atom" Introduce a new parameter 'pmu_name' in perf_mem_events__name to let the caller specify a pmu name. Considering such event name is x86 specific, so move perf_mem_events[] to arch/x86/util/mem-events.c. We still keep the variable 'mem_loads_name__init' but it's only used when pmu_name is NULL (compatible for original behavior). When pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have optimization. That can be implemented in follow up patch. Signed-off-by: Jin Yao <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Kan Liang <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-08perf c2c: Support data block and addr blockKan Liang1-0/+6
'perf c2c' is also a memory profiling tool. Apply the two new data source fields to 'perf c2c' as well. Extend 'perf c2c' to display the number of loads which blocked by data or address conflict. Signed-off-by: Kan Liang <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Don Zickus <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Joe Mario <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-08perf tools: Support data block and addr blockKan Liang1-0/+25
Two new data source fields, to indicate the block reasons of a load instruction, are introduced on the Intel Sapphire Rapids server. The fields can be used by the memory profiling. Add a new sort function, SORT_MEM_BLOCKED, for the two fields. For the previous platforms or the block reason is unknown, print "N/A" for the block reason. Add blocked as a default mem sort key for perf report and perf mem report. Committer testing: So in machines without this capability we get a "N/A" filling the new "Blocked" column: $ perf mem record ls arch certs CREDITS Documentation include ipc Kconfig lib MAINTAINERS mm samples security usr block COPYING crypto drivers fs init Kbuild kernel LICENSES Makefile net README scripts sound tools virt [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.008 MB perf.data (17 samples) ] $ $ perf mem report --stdio # To display the perf.data header info, please use --header/--header-only options. # # Total Lost Samples: 0 # # Samples: 6 of event 'cpu/mem-loads,ldlat=30/Pu' # Total weight : 1381 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access Locked Blocked # ........ ....... ............ .................... ....................... ............. ...................... ............ ..... ............ ...... ....... # 32.87% 1 454 Local RAM or RAM hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91cef3078 libc-2.31.so Hit L1 or L2 hit No N/A 25.56% 1 353 LFB or LFB hit [.] strcmp ld-2.31.so [.] 0x00005586973855ca ls None L1 or L2 hit No N/A 22.59% 1 312 LFB or LFB hit [.] _dl_cache_libcmp ld-2.31.so [.] 0x00007fe91d0e3b18 ld.so.cache None L1 or L2 hit No N/A 8.47% 1 117 LFB or LFB hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91ceee570 libc-2.31.so None L1 or L2 hit No N/A 6.88% 1 95 LFB or LFB hit [.] _dl_relocate_object ld-2.31.so [.] 0x00007fe91ceed490 libc-2.31.so None L1 or L2 hit No N/A 3.62% 1 50 LFB or LFB hit [.] _dl_cache_libcmp ld-2.31.so [.] 0x00007fe91d0ebe60 ld.so.cache None L1 or L2 hit No N/A # Samples: 11 of event 'cpu/mem-stores/Pu' # Total weight : 11 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol Data Object Snoop TLB access Locked Blocked # ........ ....... ............ ............. ....................... ............. ...................... ........... ..... .......... ...... ....... # 9.09% 1 0 L1 hit [.] __strcoll_l libc-2.31.so [.] 0x00007fffe5648fc8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_lookup_symbol_x ld-2.31.so [.] 0x00007fffe56490b8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_name_match_p ld-2.31.so [.] 0x00007fffe56487d8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_start ld-2.31.so [.] start_time+0x0 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] _dl_sysdep_start ld-2.31.so [.] 0x00007fffe56494b8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5648ff8 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5649064 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 hit [.] do_lookup_x ld-2.31.so [.] 0x00007fffe5649130 [stack] N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] _rtld_global+0xaf8 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] _rtld_global+0xc28 ld-2.31.so N/A N/A N/A N/A 9.09% 1 0 L1 miss [.] _dl_start ld-2.31.so [.] 0x00007fffe56495b8 [stack] N/A N/A N/A N/A # (Tip: Show user configuration overrides: perf config --user --list) $ Signed-off-by: Kan Liang <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2021-02-08perf tools: Support the auxiliary eventKan Liang1-0/+5
On the Intel Sapphire Rapids server, an auxiliary event has to be enabled simultaneously with the load latency event to retrieve complete Memory Info. Add X86 specific perf_mem_events__name() to handle the auxiliary event. - Users are only interested in the samples of the mem-loads event. Sample read the auxiliary event. - The auxiliary event must be in front of the load latency event in a group. Assume the second event to sample if the auxiliary event is the leader. - Add a weak is_mem_loads_aux_event() to check the auxiliary event for X86. For other ARCHs, it always return false. Parse the unique event name, mem-loads-aux, for the auxiliary event. Committer notes: According to 61b985e3e775a3a7 ("perf/x86/intel: Add perf core PMU support for Sapphire Rapids"), ENODATA is only returned by sys_perf_event_open() when used with these auxiliary events, with this in evsel__open_strerror(): case ENODATA: return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. " "Please add an auxiliary event in front of the load latency event."); This is Ok at this point in time, but fragile long term, I pointed this out in the e-mail thread, requesting a follow up patch to check if ENODATA is really for this specific case. Fixed up sizeof(MEM_LOADS_AUX_NAME) bug pointed out by Namhyung. Signed-off-by: Kan Liang <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jin Yao <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11perf mem: Support new memory event PERF_MEM_EVENTS__LOAD_STORELeo Yan1-1/+12
On the architectures with perf memory profiling, two types of hardware events have been supported: load and store; if want to profile memory for both load and store operations, the tool will use these two events at the same time, the usage is: # perf mem record -t load,store -- uname But this cannot be applied for AUX tracing event, the same PMU event can be used to only trace memory load, or only memory store, or trace for both memory load and store. This patch introduces a new event PERF_MEM_EVENTS__LOAD_STORE, which is used to support the event which can record both memory load and store operations. When user specifies memory operation type as 'load,store', or doesn't set type so use 'load,store' as default, if the arch supports the event PERF_MEM_EVENTS__LOAD_STORE, the tool will convert the required operations to this single event; otherwise, if the arch doesn't support PERF_MEM_EVENTS__LOAD_STORE, the tool rolls back to enable both events PERF_MEM_EVENTS__LOAD and PERF_MEM_EVENTS__STORE, which keeps the same behaviour with before. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11perf mem: Introduce weak function perf_mem_events__ptr()Leo Yan1-7/+19
Different architectures might use different event or different event parameters for memory profiling, this patch introduces a weak perf_mem_events__ptr() function which allows to return back a architecture specific memory event. Since the variable 'perf_mem_events' can be only accessed by the perf_mem_events__ptr() function, mark the variable as 'static', this allows the architectures to define its own memory event array. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-11-11perf mem: Search event name with more flexible pathLeo Yan1-3/+3
The perf tool searches a memory event name under the folder '/sys/devices/cpu/events/', this leads to the limitation for the selection of a memory profiling event which must be under this folder. Thus it's impossible to use any other event as memory event which is not under this specific folder, e.g. Arm SPE hardware event is not located in '/sys/devices/cpu/events/' so it cannot be enabled for memory profiling. This patch changes to search folder from '/sys/devices/cpu/events/' to '/sys/devices', so it give flexibility to find events which can be used for memory profiling. Signed-off-by: Leo Yan <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2020-05-28perf c2c: Fix 'perf c2c record -e list' to show the default events usedIan Rogers1-0/+15
When the event is passed as list, the default events should be listed as per 'perf mem record -e list'. Previous behavior is: $ perf c2c record -e list failed: event 'list' not found, use '-e list' to get list of available events Usage: perf c2c record [<options>] [<command>] or: perf c2c record [<options>] -- <command> [<options>] -e, --event <event> event selector. Use 'perf mem record -e list' to list available events $ New behavior: $ perf c2c record -e list ldlat-loads : available ldlat-stores : available v3: is a rebase. v2: addresses review comments by Jiri Olsa. https://lore.kernel.org/lkml/20191127081844.GH32367@krava/ Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-11-12pref tools: Make 'struct addr_map_symbol' contain 'struct map_symbol'Arnaldo Carvalho de Melo1-1/+1
So that we pass that substructure around and with it consolidate lots of functions that receive a (map, symbol) pair and now can receive just a 'struct map_symbol' pointer. This further paves the way to add 'struct map_groups' to 'struct map_symbol' so that we can have all we need for annotation so that we can ditch 'struct map'->groups, i.e. have the map_groups pointer in a more central place, avoiding the pointer in the 'struct map' that have tons of instances. Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf symbols: Move mem_info and branch_info out of symbol.hArnaldo Carvalho de Melo1-0/+1
The mem_info struct goes to mem-events.h and branch_info goes to branch.h, where they belong, this way we can remove several headers from symbols.h and trim the include dependency tree more. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-08-31perf tools: Remove needless sort.h include directivesArnaldo Carvalho de Melo1-1/+0
Now that sort.h isn't included by any other header, we can check where it is really needed, i.e. we can remove it and be sure that it isn't being obtained indirectly. Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2019-02-04perf mem/c2c: Fix perf_mem_events to support powerpcRavi Bangoria1-1/+1
PowerPC hardware does not have a builtin latency filter (--ldlat) for the "mem-load" event and perf_mem_events by default includes "/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to support "perf mem/c2c" on PowerPC. This patch depends on kernel side changes done my Madhavan: https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html Signed-off-by: Ravi Bangoria <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Dick Fowles <[email protected]> Cc: Don Zickus <[email protected]> Cc: Joe Mario <[email protected]> Cc: Madhavan Srinivasan <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <[email protected]> Reviewed-by: Philippe Ombredanne <[email protected]> Reviewed-by: Thomas Gleixner <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-08-28perf c2c: Fix remote HITM detection for SkylakeJiri Olsa1-2/+9
Skylake introduced new mem_remote bit in union perf_mem_data_src [1]. It applies to any other memory level to express Remote unknown level, as is reported by Skylake. Adding this extra check to c2c_decode_stats to properly decode remote HITMs on Skylake. [1] http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]> Acked-by: Andi Kleen <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: David Ahern <[email protected]> Cc: Joe Mario <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>