aboutsummaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)AuthorFilesLines
2023-05-23pert tests: Add tests for new "perf stat --per-cache" aggregation optionK Prateek Nayak3-1/+30
Add tests for the new "--per-cache" option in 'perf stat' for CSV and JSON generation as well as for the JSON linting. Suggested-by: Gautham Shenoy <[email protected]> Signed-off-by: K Prateek Nayak <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wen Pu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf stat: Add "--per-cache" aggregation option and document itK Prateek Nayak2-0/+72
This patch adds support for "--per-cache" option for aggregation at a particular cache level and documents the same. Following is the output of 'perf stat' with aggregation at L3 for the event "ls_dmnd_fills_from_sys.ext_cache_remote" on a dual socket 3rd Generation EPYC Processor (2 x 64C/128T - 16 LLCs) when running hackbench pinned to 4 LLCs: $ sudo perf stat --per-cache=L3 -a -e ls_dmnd_fills_from_sys.ext_cache_remote -- \ taskset -c 0-15,64-79,128-143,192-207 \ perf bench sched messaging -p -t -l 100000 -g 8 ... Performance counter stats for 'system wide': S0-D0-L3-ID0 16 9,500,803 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID8 16 6,338,099 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID16 16 355,005 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID24 16 22,067 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID32 16 16,321 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID40 16 11,619 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID48 16 4,238 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID56 16 31,158 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID64 16 28,242,452 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID72 16 22,906,973 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID80 16 72,898 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID88 16 56,907 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID96 16 20,456 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID104 16 40,913 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID112 16 78,113 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID120 16 37,897 ls_dmnd_fills_from_sys.ext_cache_remote Also support 'perf stat record' and 'perf stat report' with the ability to specify a different cache level to aggregate data at when running 'perf stat report'. $ sudo perf stat record --per-cache=L2 -a -e ls_dmnd_fills_from_sys.ext_cache_remote -- \ taskset -c 0-15,64-79,128-143,192-207 \ perf bench sched messaging -p -t -l 100000 -g 8 ... Performance counter stats for 'system wide': S0-D0-L2-ID0 2 1,442,061 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID1 2 1,548,994 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID2 2 1,553,557 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID3 2 1,420,122 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID4 2 1,465,461 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID5 2 1,455,153 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID6 2 1,595,237 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID7 2 1,499,321 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L2-ID8 2 1,919,025 ls_dmnd_fills_from_sys.ext_cache_remote ... S1-D1-L2-ID127 2 21,295 ls_dmnd_fills_from_sys.ext_cache_remote $ sudo perf stat report --per-cache=L3 Performance counter stats for 'perf stat record --per-cache=L2 -a -e ls_dmnd_fills_from_sys.ext_cache_remote --\ taskset -c 0-15,64-79,128-143,192-207 \ perf bench sched messaging -p -t -l 100000 -g 8': S0-D0-L3-ID0 16 11,979,906 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID8 16 14,257,202 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID16 16 377,484 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID24 16 27,224 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID32 16 26,816 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID40 16 14,461 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID48 16 10,499 ls_dmnd_fills_from_sys.ext_cache_remote S0-D0-L3-ID56 16 53,817 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID64 16 27,361,987 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID72 16 37,299,024 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID80 16 84,125 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID88 16 64,561 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID96 16 13,403 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID104 16 20,138 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID112 16 93,220 ls_dmnd_fills_from_sys.ext_cache_remote S1-D1-L3-ID120 16 35,465 ls_dmnd_fills_from_sys.ext_cache_remote On the above system, the domain covered by S0-D0-L3-ID0 contains S0-D0-L2-ID0 to S0-D0-L2-ID7, the corresponding count for L3-ID0 is equal to the sum of counts for L2-ID0 to L2-ID7. Add documentation for the newly introduced "--per-cache" option. Suggested-by: Gautham Shenoy <[email protected]> Signed-off-by: K Prateek Nayak <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wen Pu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf stat record: Save cache level informationK Prateek Nayak2-3/+5
When aggregating based on cache-topology, in addition to the aggregation mode, knowing the cache level at which data is aggregated is necessary to ensure consistency when running 'perf stat record' and later 'perf stat report'. Save the cache level for aggregation as a part of the env data that can be later retrieved when running perf stat report. Suggested-by: Gautham Shenoy <[email protected]> Signed-off-by: K Prateek Nayak <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wen Pu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf stat: Setup the foundation to allow aggregation based on cache topologyK Prateek Nayak5-1/+244
Processors based on chiplet architecture, such as AMD EPYC and Hygon do not expose the chiplet details in the sysfs CPU topology information. However, this information can be derived from the per CPU cache level information from the sysfs. 'perf stat' has already supported aggregation based on topology information using core ID, socket ID, etc. It'll be useful to aggregate based on the cache topology to detect problems like imbalance and cache-to-cache sharing at various cache levels. This patch lays the foundation for aggregating data in 'perf stat' based on the processor's cache topology. The cmdline option to aggregate data based on the cache topology is added in Patch 4 of the series while this patch sets up all the necessary functions and variables required to support the new aggregation option. The patch also adds support to display per-cache aggregation, or save it as a JSON or CSV, as splitting it into a separate patch would break builds when compiling with "-Werror=switch-enum" where the compiler will complain about the lack of handling for the AGGR_CACHE case in the output functions. Committer notes: Don't use perf_stat_config in tools/perf/util/cpumap.c, this would make code that is in util/, thus not really specific to a single builtin, use a specific builtin config structure. Move the functions introduced in this patch from tools/perf/util/cpumap.c since it needs access to builtin specific and is not strictly needed to live in the util/ directory. With this 'perf test python' is back building. Suggested-by: Gautham Shenoy <[email protected]> Signed-off-by: K Prateek Nayak <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wen Pu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf: Extract building cache level for a CPU into separate functionK Prateek Nayak2-23/+43
build_caches() builds the complete cache topology of the system by iterating over all CPU, building and comparing cache levels of each CPU, keeping only the unique ones at the end. Extract the unit that build the cache levels for a single CPU into a separate function. Expose this function, and the MAX_CACHE_LVL value to be used elsewhere in perf too. Signed-off-by: K Prateek Nayak <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ananth Narayan <[email protected]> Cc: Gautham Shenoy <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sandipan Das <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Wen Pu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update tigerlake events/metricsIan Rogers4-486/+505
Update tigerlake events to v1.12 including the new events MEM_LOAD_MISC_RETIRED.UC and SQ_MISC.BUS_LOCK. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update snowridgex eventsIan Rogers9-23/+36
Update snowridgex to v1.21 that marks deprecated a number of events and adds improves descriptions. The events data was generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update skylake/skylakex events/metricsIan Rogers7-917/+1216
Update skylake events to v60 and skylakex events to v1.30, adding the events FP_ARITH_INST_RETIRED.4_FLOPS, FP_ARITH_INST_RETIRED.8_FLOPS, FP_ARITH_INST_RETIRED.SCALAR, FP_ARITH_INST_RETIRED.VECTOR and INT_MISC.CLEARS_COUNT. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update sapphirerapids events/metricsIan Rogers5-552/+823
Update sapphirerapids events to v1.13 improving event descriptions. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update sandybridge metricsIan Rogers1-111/+111
Metrics are updated to make TMA info metric names synchronized. Metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update jaketown metricsIan Rogers1-112/+112
Metrics are updated to make TMA info metric names synchronized. Metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update ivybridge/ivytown metricsIan Rogers2-530/+530
Metrics are updated to make TMA info metric names synchronized. Metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update icelake/icelakex events/metricsIan Rogers4-993/+1269
Update icelake events to v1.18 including the new events MEM_LOAD_MISC_RETIRED.UC and SQ_MISC.BUS_LOCK. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update haswell(x) metricsIan Rogers2-488/+696
Metrics are updated to make TMA info metric names synchronized. Metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update elkhartlake eventsIan Rogers5-1/+23
Update elkhartlake to v1.04 that marks deprecated a number of events and adds additional description to MEM_BOUND_STALLS.IFETCH. The events data was generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update cascadelakex events/metricsIan Rogers4-482/+783
Update cascadelakex to v1.18 including the new events INT_MISC.CLEARS_COUNT, FP_ARITH_INST_RETIRED.VECTOR, FP_ARITH_INST_RETIRED.SCALAR, FP_ARITH_INST_RETIRED.8_FLOPS and FP_ARITH_INST_RETIRED.4_FLOPS. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update broadwell variant events/metricsIan Rogers7-865/+1118
Update broadwell events to v28, broadwellde to v10, broadwellx to v21. Including the new events FP_ARITH_INST_RETIRED.VECTOR, and FP_ARITH_INST_RETIRED.4_FLOPS. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-23perf vendor events intel: Update alderlake events/metricsIan Rogers5-824/+783
Update events to v21 including the new event SQ_MISC.BUS_LOCK and improved comments. Metrics are updated to make TMA info metric names synchronized. Events and metrics were generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Reviewed-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Xing Zhengjun <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-22perf test: Add test validating JSON generated by 'perf data convert --to-json'Anup Sharma1-0/+72
This commit adds support for testing the JSON output generated by the 'perf data' command's conversion to JSON functionality. The test script now includes a step to ensure that the resulting JSON file contains valid data. Changes: V1 -> V2: Added a check for the existence of the result output file. Replaced the usage of jq with json.load for validating the JSON format. Checks using ShellCheck and checkpatch, addressing and resolving warnings. Removed the unnecessary root permission check. Modified the 'perf record' command to avoid requiring root permissions. Committer testing: $ perf test to-json 115: 'perf data convert --to-json' command test : Ok $ perf test -v to-json Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc 115: 'perf data convert --to-json' command test : --- start --- test child forked, pid 1746867 Testing Perf Data Convertion Command to JSON Perf Data Converter Command to JSON [SUCCESS] Validating Perf Data Converted JSON file The file contains valid JSON format [SUCCESS] test child finished with 0 ---- end ---- 'perf data convert --to-json' command test: Ok $ Signed-off-by: Anup Sharma <[email protected]> Acked-by: Ian Rogers <[email protected]> Link: https://lore.kernel.org/r/ZGcoJBAGlknjsA/n@yoga Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Anup Sharma <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] [ Fixup indentation to use consistently tabs, not a mixture of spaces and tabs, have 'if ... ; then' on the same line ] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-22Merge remote-tracking branch 'acme/perf-tools' into perf-tools-nextArnaldo Carvalho de Melo14-189/+259
To pick up fixes that were already merged upstream. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-19perf test attr: Fix python SafeConfigParser() deprecation warningIan Rogers1-3/+3
Address the warning: ``` tests/attr.py:155: DeprecationWarning: The SafeConfigParser class has been renamed to ConfigParser in Python 3.2. This alias will be removed in Python 3.12. Use ConfigParser directly instead. parser = configparser.SafeConfigParser() ``` by removing the word 'Safe'. Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kan Liang <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-19perf test attr: Update no event/metric expectationsIan Rogers5-174/+249
Previously hard coded events/metrics were used, update for the use of the TopdownL1 json metric group. Reported-by: Arnaldo Carvalho de Melo <[email protected]> Fixes: 94b1a603fca78388 ("perf stat: Add TopdownL1 metric as a default if present") Reviewed-by: James Clark <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Richter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-17tools headers UAPI: Sync arch prctl headers with the kernel sourcesArnaldo Carvalho de Melo2-0/+3
To pick the changes in this cset: a03c376ebaf38394 ("x86/arch_prctl: Add AMX feature numbers as ABI constants") 23e5d9ec2bab53c4 ("x86/mm/iommu/sva: Make LAM and SVA mutually exclusive") 2f8794bd087e7958 ("x86/mm: Provide arch_prctl() interface for LAM") This picks these new prctls in a third range, that was also added to the tools/perf/trace/beauty/arch_prctl.c beautifier. $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/before $ cp arch/x86/include/uapi/asm/prctl.h tools/arch/x86/include/uapi/asm/prctl.h $ tools/perf/trace/beauty/x86_arch_prctl.sh > /tmp/after $ diff -u /tmp/before /tmp/after @@ -20,3 +20,11 @@ [0x2003 - 0x2001]= "MAP_VDSO_64", }; +#define x86_arch_prctl_codes_3_offset 0x4001 +static const char *x86_arch_prctl_codes_3[] = { + [0x4001 - 0x4001]= "GET_UNTAG_MASK", + [0x4002 - 0x4001]= "ENABLE_TAGGED_ADDR", + [0x4003 - 0x4001]= "GET_MAX_TAG_BITS", + [0x4004 - 0x4001]= "FORCE_TAGGED_SVA", +}; + $ With this 'perf trace' can translate those numbers into strings and use the strings in filter expressions: # perf trace -e prctl 0.000 ( 0.011 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9c014b7df5) = 0 0.032 ( 0.002 ms): DOM Worker/3722622 prctl(option: SET_NAME, arg2: 0x7f9bb6b51580) = 0 5.452 ( 0.003 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfeb70) = 0 5.468 ( 0.002 ms): StreamT~ns #30/3722623 prctl(option: SET_NAME, arg2: 0x7f9bdbdfea70) = 0 24.494 ( 0.009 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f562a32ae28) = 0 24.540 ( 0.002 ms): IndexedDB #556/3722624 prctl(option: SET_NAME, arg2: 0x7f563c6d4b30) = 0 670.281 ( 0.008 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30805c8) = 0 670.293 ( 0.002 ms): systemd-userwo/3722339 prctl(option: SET_NAME, arg2: 0x564be30800f0) = 0 ^C# This addresses this perf build warning: Warning: Kernel ABI header at 'tools/arch/x86/include/uapi/asm/prctl.h' differs from latest version at 'arch/x86/include/uapi/asm/prctl.h' diff -u tools/arch/x86/include/uapi/asm/prctl.h arch/x86/include/uapi/asm/prctl.h Cc: Adrian Hunter <[email protected]> Cc: Chang S. Bae <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Namhyung Kim <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-17tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'Arnaldo Carvalho de Melo4-10/+2
This is to get the changes from: 68674f94ffc9dddc ("x86: don't use REP_GOOD or ERMS for small memory copies") 20f3337d350c4e1b ("x86: don't use REP_GOOD or ERMS for small memory clearing") This also make the 'perf bench mem' files stop referring to the erms versions that gone away with the above patches. That addresses these perf tools build warning: Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S' diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S' diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-16tools headers UAPI: Sync s390 syscall table file that wires up the ↵Arnaldo Carvalho de Melo1-1/+1
memfd_secret syscall To pick the changes in these csets: 7608f70adcb1ea69 ("s390: wire up memfd_secret system call") That add support for this new syscall in tools such as 'perf trace'. For instance, this is now possible (adapted from the x86_64 test output): # perf trace -v -e memfd_secret event qualifier tracepoint filter: (common_pid != 13375 && common_pid != 3713) && (id == 447) ^C# That is the filter expression attached to the raw_syscalls:sys_{enter,exit} tracepoints. $ grep memfd_secret tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 447 common memfd_secret sys_memfd_secret $ This addresses this perf build warnings: Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl' diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl Cc: Adrian Hunter <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Vasily Gorbik <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-16perf metrics: Avoid segv with --topdown for metrics without a groupIan Rogers1-1/+1
Some metrics may not have a metric_group which can result in segvs with "perf stat --topdown". Add a condition for the no metric_group case. Fixes: 1647cd5b8802698f ("perf stat: Implement --topdown using json metrics") Reported-by: Kan Liang <[email protected]> Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf vendor events arm64: Add AmpereOne core PMU eventsIlkka Koskinen11-0/+1080
Add JSON files for AmpereOne core PMU events. Reviewed-by: John Garry <[email protected]> Signed-off-by: Doug Rady <[email protected]> Signed-off-by: Ilkka Koskinen <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Mike Leach <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf ftrace: Flush output after each writingChangbin Du1-0/+2
The pager will result stdout in full buffering mode instead of line buffering. We need to make the trace visible timely. Signed-off-by: Changbin Du <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf annotate browser: Add '<' and '>' keys for navigationNamhyung Kim2-2/+6
hists__find_annotations() allows to move to next or previous symbols for annotation using the arrow keys. But TUI annotate_browser__run() uses the RIGHT key as ENTER to handle jump/call instructions. That makes the navigation to the next function impossible. I'd like to change it back to move the next symbol but I'm afraid if some users get confused. So I added a new pair of keys to handle that. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf annotate: Parse x86 SIB addressing properlyNamhyung Kim1-0/+13
When the source argument of the "mov" instruction looks like below, it didn't parse the whole operand and just stopped at the first comma. mov (%rbx,%rax,1),%rcx Fix it by checking the parentheses and move it to the closing one. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf annotate: Handle "decq", "incq", "testq", "tzcnt" instructions on x86Namhyung Kim1-0/+4
I found that the "decq", "incq", "testq", "tzcnt" instructions didn't parse the operands properly. Add them to the "x86__instructions" table to fix the issue. Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf doc: Add support for KBUILD_BUILD_TIMESTAMPBen Hutchings1-3/+7
When building man pages from a Git checkout, we consistently set the man page date based on when the input was last changed. Otherwise, it defaults to the build time, which is not reproducible. Allow the date to be set through the KBUILD_BUILD_TIMESTAMP variable, as for timestamps in the kernel itself. Signed-off-by: Ben Hutchings <[email protected]> Acked-by: Ian Rogers<[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Salvatore Bonaccorso <[email protected]> Link: https://lore.kernel.org/r/ZF/1F1P+b9qZ/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf doc: Define man page date when using asciidoctorBen Hutchings1-4/+7
When building perf documentation with asciidoc, we use "git log" to find the last commit date of each doc source and pass that to asciidoc to use as the man page date. When using asciidoctor, however, the current date is always used instead. Defining perf_date like we do for asciidoc also doesn't work because we're not using DocBook as an intermediate format. The asciidoctor man page backend looks for the variable "docdate", so set that instead. Signed-off-by: Ben Hutchings <[email protected]> Acked-by: Ian Rogers<[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Salvatore Bonaccorso <[email protected]> Link: https://lore.kernel.org/r/ZF/1BOahN/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf test: Add cputype testing to perf statIan Rogers1-0/+44
Check a bogus PMU fails and that a known PMU succeeds. Limit to PMUs known cpu, cpu_atom and armv8_pmuv3_0 ones. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf build: Don't use -ftree-loop-distribute-patterns and ↵Arnaldo Carvalho de Melo1-0/+4
-gno-variable-location-views in the python feature test when building with clang-13 Using -ftree-loop-distribute-patterns and -gno-variable-location-views in the python feature test when building with clang-16 results in: 16 80.04 clearlinux:latest : FAIL clang version 16.0.1 clang-16: error: unknown argument: '-gno-variable-location-views' clang-16: error: unknown argument: '-gno-variable-location-views' clang-16: error: optimization flag '-ftree-loop-distribute-patterns' is not supported [-Werror,-Wignored-optimization-argument] clang-16: error: optimization flag '-ftree-loop-distribute-patterns' is not supported [-Werror,-Wignored-optimization-argument] error: command '/usr/sbin/clang' failed with exit code 1 Noticed when building on a docker.io/library/clearlinux:latest container. Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Reduce scope of is_event_supportedIan Rogers3-41/+39
Move to print-events.c and make static. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf stat: Don't disable TopdownL1 metric on hybridIan Rogers1-6/+1
Now that hybrid bugs are fixed sufficient to run TopdownL1 metrics, don't implicitly disable them for hybrid. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf metrics: Be PMU specific in event matchIan Rogers1-3/+7
Ids/events from a metric are turned into an event string and parsed; setup_metric_events matches the id back to the parsed evsel. With hybrid the same event may exist on both PMUs with the same name and be being used by metrics at the same time. A metric on cpu_core therefore shouldn't match against evsels on cpu_atom, or the metric will compute the wrong value. Make the matching sensitive to the PMU being parsed. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf jevents: Don't rewrite metrics across PMUsIan Rogers3-16/+22
Don't rewrite metrics across PMUs as the result events likely won't be found. Identify metrics with a pair of PMU name and metric name. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf vendor events intel: Correct alderlake metricsIan Rogers2-122/+122
Fix the metrics tma_memory_bound on alderlake cpu_core and tma_microcode_sequencer on alderlake cpu_atom, where metrics had be rewritten across PMUs. Fix MEM_BOUND_STALLS_AT_RET_CORRECTION which is an aux metric but lacks a hash prefix. Add PMU prefixes for cpu_core/cpu_atom events to avoid wildcard opening the events. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf stat: Command line PMU metric filteringIan Rogers3-9/+15
Wire up the --cputype value to limit which metrics are parsed. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf metrics: Be PMU specific for referenced metrics.Ian Rogers5-32/+75
Hybrid systems may define the same metric for different PMUs, this can cause confusion of events. To avoid this make the referenced metric searches PMU specific, matching that in the table. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Don't reorder atom cpu eventsIan Rogers1-2/+2
On hybrid systems the topdown events don't share a fixed counter on the atom core, so they don't require the sorting the perf metric supporting PMUs do. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Don't auto merge hybrid wildcard eventsIan Rogers4-2/+13
Bring back the behavior of not auto-merging hybrid events by delegating to a test in pmu. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Avoid error when assigning a legacy cache termIan Rogers2-4/+27
Avoid the parser error: ''' $ perf stat -e 'cycles/name=l1d/' true event syntax error: 'cycles/name=l1d/' \___ parser error ''' by combining the name and legacy cache cases in the parser. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Avoid error when assigning a termIan Rogers4-0/+41
Avoid the parser error: ''' $ perf stat -e 'cycles/name=name/' true event syntax error: 'cycles/name=name/' \___ parser error ''' by turning the term back to a string if it is on the right. Add PMU and generic parsing tests. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Support hardware events as termsIan Rogers5-33/+187
An event like "cpu/instructions/" typically parses due to there being a sysfs event called instructions. On hybrid recursive parsing means that the hardware event is encoded in the attribute, with the PMU being placed in the high bits of the config: ''' $ perf stat -vv -e 'cpu_core/cycles/' true ... ------------------------------------------------------------ perf_event_attr: size 136 config 0x400000000 sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ''' Make this behavior the default by adding a new term type and token for hardware events. The token gathers both the numeric config and the parsed name, so that if the token appears like "cycles/name=cycles/" then the token can be handled like a name. The numeric value isn't sufficient to distinguish say "cpu-cycles" from "cycles". Extend the parse-events test so that all current non-PMU hardware parsing tests, also test with the PMU cpu - more than half the change. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf test: Fix parse-events tests for >1 core PMUIan Rogers1-72/+105
Remove assumptions of just 1 core PMU. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf stat: Make cputype filter genericIan Rogers6-35/+47
Rather than limit the --cputype argument for "perf list" and "perf stat" to hybrid PMUs of just cpu_atom and cpu_core, allow any PMU. Note, that if cpu_atom isn't mounted but a filter of cpu_atom is requested, then this will now fail. As such a filter would never succeed, no events can come from that unmounted PMU, then this behavior could never have been useful and failing is clearer. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
2023-05-15perf parse-events: Add pmu filterIan Rogers11-32/+90
To support the cputype argument added to "perf stat" for hybrid it is necessary to filter events during wildcard matching. Add a scanner argument for the filter and checking it when wildcard matching. Signed-off-by: Ian Rogers <[email protected]> Tested-by: Kan Liang <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Ahmad Yasin <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Caleb Biggers <[email protected]> Cc: Edward Baker <[email protected]> Cc: Florian Fischer <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Garry <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kang Minchul <[email protected]> Cc: Leo Yan <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Perry Taylor <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Rob Herring <[email protected]> Cc: Samantha Alt <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Sumanth Korikkar <[email protected]> Cc: Suzuki Poulouse <[email protected]> Cc: Thomas Richter <[email protected]> Cc: Tiezhu Yang <[email protected]> Cc: Weilin Wang <[email protected]> Cc: Xing Zhengjun <[email protected]> Cc: Yang Jihong <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>