|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.12:
https://download.01.org/perfmon/ICL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on an Icelake; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 30:
https://download.01.org/perfmon/HSW
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Haswell; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Events are still at version 1.01:
https://download.01.org/perfmon/GLP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a GoldmontPlus; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Events are still at version 13:
https://download.01.org/perfmon/GLM
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
The addition of a floating-point.json is due to events having
their topic better identified by the converter script.
Tested:
Not tested on a Goldmont; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 17:
https://download.01.org/perfmon/BDX
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a BroadwellX; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 26:
https://download.01.org/perfmon/BDW
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Broadwell; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Events are still at version 4:
https://download.01.org/perfmon/BNL
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
Not tested on a Bonnell; tested on a SkylakeX:
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.11:
https://download.01.org/perfmon/ICX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed due to MEM_PMM_Read_Latency as the test machine
lacks optane memory, and the divide by 0 causes the metric not to
print - which is intended behavior.
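A minimal sketch of why a zero denominator leaves nothing to print, assuming the metric is a plain ratio. The function and parameter names are hypothetical, not perf's, and the real metric expression is not shown in this message:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical ratio metric: summed latency divided by number of reads. */
double ratio_metric(double latency_sum, double read_count)
{
	assert(read_count >= 0.0);	/* event counts are never negative */
	if (read_count == 0.0)
		return NAN;	/* nothing was counted, so the ratio is undefined */
	return latency_sum / read_count;
}

/* Only a well-defined ratio is worth printing. */
bool metric_is_printable(double value)
{
	return !isnan(value);
}
```

On a machine without the relevant memory the read count stays at zero, so under this assumption the metric is simply omitted rather than printed as a bogus number.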
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Events are still at version 2:
https://download.01.org/perfmon/WSM-EP-DP
Json files generated by the latest code at:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Ok
91: perf all PMU test : Ok
...
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 21:
https://download.01.org/perfmon/IVB
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed for Load_Miss_Real_Latency with <not counted> events:
Performance counter stats for 'perf bench internals synthesize':
<not counted> mem_load_uops_retired.hit_lfb (0.00%)
<not counted> MEM_LOAD_UOPS_RETIRED.L1_MISS (0.00%)
<not counted> L1D_PEND_MISS.PENDING (0.00%)
558185217 ns duration_time
This is exposing a somewhat known issue with weak groups that can
be worked around with:
$ perf stat --metric-no-group -M Load_Miss_Real_Latency -a sleep 1
Performance counter stats for 'system wide':
14935022 mem_load_uops_retired.hit_lfb # 23.55 Load_Miss_Real_Latency (83.23%)
4716714 MEM_LOAD_UOPS_RETIRED.L1_MISS (66.68%)
462705675 L1D_PEND_MISS.PENDING (83.22%)
1001548340 ns duration_time
1.001548340 seconds time elapsed
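The 23.55 in the output above is consistent with dividing L1D_PEND_MISS.PENDING by the sum of the two load counts. A small sketch of that arithmetic, assuming this is the metric's formula (it is not spelled out in this message); the function name is hypothetical:

```c
#include <assert.h>
#include <math.h>

/*
 * Assumed formula:
 *   L1D_PEND_MISS.PENDING /
 *   (MEM_LOAD_UOPS_RETIRED.L1_MISS + mem_load_uops_retired.hit_lfb)
 */
double load_miss_real_latency(double pending, double l1_miss, double hit_lfb)
{
	double loads = l1_miss + hit_lfb;

	assert(loads >= 0.0);	/* counts are never negative */
	/* With no loads counted the ratio is undefined. */
	return loads == 0.0 ? NAN : pending / loads;
}
```

Called with the counts shown above (462705675, 4716714, 14935022), this yields a value between 23.54 and 23.55, in line with the printed figure.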
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 22:
https://download.01.org/perfmon/HSX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed for Load_Miss_Real_Latency with <not counted> events:
Performance counter stats for 'system wide':
<not counted> mem_load_uops_retired.hit_lfb (0.00%)
<not counted> MEM_LOAD_UOPS_RETIRED.L1_MISS (0.00%)
<not counted> L1D_PEND_MISS.PENDING (0.00%)
1002638743 ns duration_time
This is exposing a somewhat known issue with weak groups that can
be worked around with:
$ perf stat --metric-no-group -M Load_Miss_Real_Latency -a sleep 1
Performance counter stats for 'system wide':
9539883 mem_load_uops_retired.hit_lfb # 25.87 Load_Miss_Real_Latency (83.24%)
10876212 MEM_LOAD_UOPS_RETIRED.L1_MISS (66.68%)
528172960 L1D_PEND_MISS.PENDING (83.26%)
1001964165 ns duration_time
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.14:
https://download.01.org/perfmon/CLX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : FAILED!
91: perf all PMU test : Ok
...
Test 90 failed due to MEM_PMM_Read_Latency as the test machine lacks
optane memory, and the divide by 0 causes the metric not to print -
which is intended behavior.
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are still at version 7:
https://download.01.org/perfmon/BDW-DE
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
This adds TopdownL1_SMT metrics to bdwde-metrics.json. As discussed in:
https://lore.kernel.org/all/[email protected]/
The TMA_Metrics-full.csv was modified so that BDW-DE is in the server
column with BDX, the Page_Walks_Utilization and
Page_Walks_Utilization_SMT metrics are then copied from BDW.
Tested:
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Skip
91: perf all PMU test : Ok
...
Test 90 is skipped due to a lack of floating point samples, which is
understandable.
Suggested-by: Kan Liang <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Based on TMA_metrics-full.csv version 4.3 at 01.org:
https://download.01.org/perfmon/
Events are updated to version 1.26:
https://download.01.org/perfmon/SKX
Json files generated by:
https://github.com/intel/event-converter-for-linux-perf
Fixes were made that allow the skx-metrics.json to successfully
generate, bringing back TopdownL1 metrics.
Tested:
$ perf test
...
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
...
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
...
68: Parse and process metrics : Ok
...
88: perf stat metrics (shadow stat) test : Ok
89: perf all metricgroups test : Ok
90: perf all metrics test : Skip
91: perf all PMU test : Ok
...
Test 90 is skipped due to a lack of floating point samples, which is
understandable.
Fixes: c4ad8fabd03f76ed ("perf vendor events: Update metrics for SkyLake Server")
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Some Intel TMA metrics compute a ratio that may divide by 0, which
causes the metric not to print. This happens for metrics with FP_ARITH
events. If we see these events in the result and would otherwise fail,
then switch to a skip.
Also, don't early exit when processing metrics.
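The decision described above can be sketched as follows. This is only an illustration in C with hypothetical names and result codes; the actual test is not reproduced here, and matching on the substring "FP_ARITH" is an assumption taken from the text:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum result { RESULT_OK, RESULT_SKIP, RESULT_FAIL };

struct metric_run {
	const char *output;	/* captured output for one metric (hypothetical) */
	bool metric_printed;	/* did the metric value appear in the output? */
};

/* A missing metric is a skip if FP_ARITH events were involved, else a failure. */
enum result classify_metric(const struct metric_run *run)
{
	assert(run != NULL);
	if (run->metric_printed)
		return RESULT_OK;
	if (strstr(run->output, "FP_ARITH"))
		return RESULT_SKIP;
	return RESULT_FAIL;
}

/* Process every metric instead of returning at the first problem. */
enum result run_all_metrics(const struct metric_run *runs, size_t n)
{
	enum result overall = RESULT_OK;

	for (size_t i = 0; i < n; i++) {
		enum result r = classify_metric(&runs[i]);

		if (r == RESULT_FAIL)
			overall = RESULT_FAIL;
		else if (r == RESULT_SKIP && overall == RESULT_OK)
			overall = RESULT_SKIP;
	}
	return overall;
}
```

A failure outranks a skip, so one skipped metric cannot hide a genuine failure elsewhere in the list.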
Reviewed-by: John Garry <[email protected]>
Reviewed-by: Kan Liang <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Maxime Coquelin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Zhengjun Xing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now unmap_ip is const, make contains symbol const.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The maps code has its own header, move the corresponding C function
definitions to their own C file. In the process tidy and minimize
includes.
Committer notes:
Add back the 'static' for maps__init() and maps__exit().
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Now purely accessed through new and delete, so reduce to file scope.
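A sketch of the pattern, using a hypothetical struct rather than the real one: once callers only go through new and delete, the init and exit helpers can be static.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real structure. */
struct sketch_maps {
	unsigned int nr_maps;
};

/* File scope: nothing outside this file sets up or tears down in place. */
static void sketch_maps__init(struct sketch_maps *maps)
{
	assert(maps != NULL);
	maps->nr_maps = 0;
}

static void sketch_maps__exit(struct sketch_maps *maps)
{
	maps->nr_maps = 0;
}

/* The only public entry points. */
struct sketch_maps *sketch_maps__new(void)
{
	struct sketch_maps *maps = malloc(sizeof(*maps));

	if (maps)
		sketch_maps__init(maps);
	return maps;
}

void sketch_maps__delete(struct sketch_maps *maps)
{
	if (!maps)
		return;
	sketch_maps__exit(maps);
	free(maps);
}
```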
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
struct maps is reference counted, using a pointer is more idiomatic.
Committer notes:
Check maps__new() return.
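A simplified sketch of holding a reference counted object through a pointer, including the return check mentioned in the committer notes. All names are hypothetical and the counter is a plain integer, not an atomic type:

```c
#include <assert.h>
#include <stdlib.h>

struct rc_maps {
	unsigned int refcnt;
};

struct rc_maps *rc_maps__new(void)
{
	struct rc_maps *maps = calloc(1, sizeof(*maps));

	if (maps)
		maps->refcnt = 1;	/* the creator holds the first reference */
	return maps;
}

struct rc_maps *rc_maps__get(struct rc_maps *maps)
{
	if (maps)
		maps->refcnt++;
	return maps;
}

void rc_maps__put(struct rc_maps *maps)
{
	if (!maps)
		return;
	assert(maps->refcnt > 0);
	if (--maps->refcnt == 0)
		free(maps);
}

/* The owner stores a pointer instead of embedding the struct. */
struct owner {
	struct rc_maps *maps;
};

int owner__init(struct owner *o)
{
	o->maps = rc_maps__new();
	return o->maps ? 0 : -1;	/* allocation can fail, so report it */
}

void owner__exit(struct owner *o)
{
	rc_maps__put(o->maps);
	o->maps = NULL;
}
```

With a pointer, another user can take its own reference via the get function and outlive the owner, which an embedded struct cannot express.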
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
struct maps is reference counted, using a pointer is more idiomatic.
Committer notes:
Delay:
maps = machine__kernel_maps(&vmlinux);
To after:
machine__init(&vmlinux, "", HOST_KERNEL_ID);
To avoid this on f34:
In file included from /var/home/acme/git/perf/tools/perf/util/build-id.h:10,
from /var/home/acme/git/perf/tools/perf/util/dso.h:13,
from tests/vmlinux-kallsyms.c:8:
In function ‘machine__kernel_maps’,
inlined from ‘test__vmlinux_matches_kallsyms’ at tests/vmlinux-kallsyms.c:122:22:
/var/home/acme/git/perf/tools/perf/util/machine.h:86:23: error: ‘vmlinux.kmaps’ is used uninitialized [-Werror=uninitialized]
86 | return machine->kmaps;
| ~~~~~~~^~~~~~~
tests/vmlinux-kallsyms.c: In function ‘test__vmlinux_matches_kallsyms’:
tests/vmlinux-kallsyms.c:121:34: note: ‘vmlinux’ declared here
121 | struct machine kallsyms, vmlinux;
| ^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Libbpf has deprecated the ability to keep track of the object list inside
libbpf; it now requires applications to track usage of multiple bpf objects
directly. Remove usage of the bpf_object__next() API and hoist the tracking
logic to perf.
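A rough sketch of application-side tracking, assuming a simple singly linked list of opaque handles. The names are hypothetical and no libbpf calls are shown; the data structure perf actually uses is not described in this message:

```c
#include <assert.h>
#include <stdlib.h>

struct tracked_object {
	void *obj;			/* opaque handle owned by the application */
	struct tracked_object *next;
};

/* The application, not the library, owns the list of live objects. */
static struct tracked_object *tracked_objects;

int track_object(void *obj)
{
	struct tracked_object *node;

	assert(obj != NULL);
	node = malloc(sizeof(*node));
	if (!node)
		return -1;
	node->obj = obj;
	node->next = tracked_objects;
	tracked_objects = node;
	return 0;
}

size_t count_tracked_objects(void)
{
	size_t n = 0;

	for (struct tracked_object *it = tracked_objects; it; it = it->next)
		n++;
	return n;
}

/* Walk the list, optionally releasing each handle, and free the nodes. */
void untrack_all_objects(void (*release)(void *obj))
{
	while (tracked_objects) {
		struct tracked_object *node = tracked_objects;

		tracked_objects = node->next;
		if (release)
			release(node->obj);
		free(node);
	}
}
```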
Signed-off-by: Christy Lee <[email protected]>
Acked-by: Song Liu <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Jiri Olsa <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The bpf_load_program() API is deprecated; remove perf's usage of the
deprecated function. Add a __weak function declaration for libbpf
version compatibility.
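One common way to use weak symbols for this kind of compatibility is a weak fallback definition. The sketch below assumes GCC or Clang, uses hypothetical function names, and may differ from what the patch itself does:

```c
#include <assert.h>

/* Stand-in for the older, deprecated entry point (hypothetical). */
int legacy_load(int insn_cnt)
{
	assert(insn_cnt >= 0);
	return insn_cnt;	/* pretend this is a file descriptor */
}

/*
 * Weak fallback for the newer entry point (hypothetical name). It is used
 * when nothing else linked into the program defines the same symbol.
 */
__attribute__((weak)) int compat_prog_load(int insn_cnt)
{
	return legacy_load(insn_cnt);
}
```

A strong definition of the same symbol in another object file linked into the program takes precedence over the weak one, so callers can always use the newer name.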
Signed-off-by: Christy Lee <[email protected]>
Cc: Alexei Starovoitov <[email protected]>
Cc: Daniel Borkmann <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Having functions to access nsinfo reduces the places where reference
counting checking needs to be added.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Functions purely determine a value from the map and don't need to modify
it. Move functions to C file as they are most commonly used via a
function pointer.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Improve readability in perf_pmu__cpus_match() by using
perf_cpu_map__for_each_cpu().
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.
Signed-off-by: Ian Rogers <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexey Bayduraev <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: André Almeida <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Cc: Darren Hart <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Dmitriy Vyukov <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: German Gomez <[email protected]>
Cc: Hao Luo <[email protected]>
Cc: James Clark <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: John Garry <[email protected]>
Cc: Kajol Jain <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Madhavan Srinivasan <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Miaoqian Lin <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Cc: Shunsuke Nakamura <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Stephane Eranian <[email protected]>
Cc: Stephen Brennan <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Thomas Richter <[email protected]>
Cc: Yury Norov <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Print path and name of a data file into raw dump (-D)
<file_offset>@<path/file>:
[email protected] [0x30]: event: 9
or
[email protected]/data.7 [0x30]: event: 9
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/e8378fd4910c10751b001be880705653989283c2.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Load data directory files and provide basic raw dump and aggregated
analysis support of data directories in report mode, still with no
memory consumption optimizations.
READER_MAX_SIZE is chosen based on measurements taken on different
machines with perf.data directory sizes >1GB. On machines with a big
core count (192 cores) the difference between 1MB and 2MB is about 4%.
Larger sizes (>2MB) perform about the same as 2MB. On machines with a
small core count (4-24) there is no difference across sizes from 1MB
to 16MB. So this constant is set to 2MB.
Suggested-by: Jiri Olsa <[email protected]>
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/3f10c13a226c0ceb53e88a082f847b91c1ae2c25.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Implement compatibility checks for other modes and related command line
options: asynchronous (--aio) trace streaming and affinity (--affinity)
modes, pipe mode, AUX area tracing --snapshot and --aux-sample options,
--switch-output, --switch-output-event, --switch-max-files and
--timestamp-filename options. Parallel data streaming is compatible with
Zstd compression (--compression-level) and external control commands
(--control). CPU mask provided via -C option filters --threads
specification masks.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/fadc1cf74057af4d5766248fcfe5cdde40732aa9.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Extend the --threads option in the perf record command line interface.
The option can take a value in the form of masks that specify the
CPUs to be monitored by data streaming threads and the layout of
those threads in the system topology. The masks can be filtered using
the CPU mask provided via the -C option.
The specification value can be a user-defined list of masks. Each
thread is described by the mask of CPUs it monitors, followed by a
slash and the affinity mask of that thread; thread descriptions are
separated by colons. For example:
<cpus mask 1>/<affinity mask 1>:<cpus mask 2>/<affinity mask 2>
specifies a parallel threads layout that consists of two threads,
each with its assigned CPUs to be monitored.
The specification value can be a string e.g. "cpu", "core" or
"package" meaning creation of data streaming thread for every
CPU or core or package to monitor distinct CPUs or CPUs grouped
by core or package.
The option provided with no or empty value defaults to per-cpu
parallel threads layout creating data streaming thread for every
CPU being monitored.
Document --threads option syntax and parallel data streaming modes
in Documentation/perf-record.txt.
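The mask list syntax above can be sketched in Python as follows. This is
only an illustration, not the perf implementation: it assumes each mask is
written as a CPU list such as "0,2-4", and the helper names
(parse_cpu_list, parse_threads_spec, filter_by_cpu_list) are hypothetical.
How the real tool treats a thread whose mask becomes empty after -C
filtering is not stated here; the sketch simply drops such threads.

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0,2-4" into a sorted list of CPU numbers."""
    cpus = set()
    for part in text.split(","):
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def parse_threads_spec(spec):
    """Split "<cpus>/<affinity>:<cpus>/<affinity>" into one entry per thread."""
    layout = []
    for thread_spec in spec.split(":"):
        maps, affinity = thread_spec.split("/", 1)
        layout.append({
            "maps": parse_cpu_list(maps),          # CPUs this thread monitors
            "affinity": parse_cpu_list(affinity),  # CPUs the thread may run on
        })
    return layout


def filter_by_cpu_list(layout, allowed):
    """Keep only monitored CPUs that also appear in a -C style CPU list."""
    allowed = set(allowed)
    filtered = []
    for entry in layout:
        maps = [cpu for cpu in entry["maps"] if cpu in allowed]
        if maps:  # assumption: drop threads left with nothing to monitor
            filtered.append({"maps": maps, "affinity": entry["affinity"]})
    return filtered
```

For instance, parse_threads_spec("0,2-4/2-4:1,5-7/5-7") describes two
threads, the first monitoring CPUs 0,2,3,4 while running on CPUs 2,3,4.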
Suggested-by: Jiri Olsa <[email protected]>
Suggested-by: Namhyung Kim <[email protected]>
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/079e2619be70c465317cf7c9fdaf5fa069728c32.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Provide --threads option in perf record command line interface.
The option creates a data streaming thread for each CPU in the system.
Document --threads option in Documentation/perf-record.txt.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/01aeae43b047f428596c4ef9f9342ab94865cedd.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce bytes_transferred and bytes_compressed stats so they
would capture statistics for the related data buffer transfers.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/b5d598034c507dfb7544d2125500280b7d434764.1642440724.git.alexey.v.bayduraev@linux.intel.com
[ Use PRiu64 to print u64 values, fixing the build on 32-bit architectures ]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce compressor object into mmap object so it could be used to
pack the data stream from the corresponding kernel data buffer.
Initialize and make use of the introduced per mmap compressor.
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Link: https://lore.kernel.org/r/80edc286cf6543139a7d5a91217605123aa0b50d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce a function to calculate the total amount of data written
and use it to support the --max-size option.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/3e2c69186641446f8ab003ec209bccc762b3394d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce data file objects into the mmap object so they can be used to
process and store the data stream from the corresponding kernel data buffer.
Initialize data files located at mmap buffer objects so trace data
can be written into several data files located in the data directory.
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Riccardo Mancini <[email protected]>
Link: https://lore.kernel.org/r/177077f7734b63e5c999ccd75ac6dc3c694f0d0d.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Start the thread in detached state because its management is implemented
via messaging to avoid any scaling issues. Block signals prior to thread
start so that only the main tool thread is notified of external async
signals during data collection. The thread affinity mask is used to assign
eligible CPUs for the thread to run on. Wait and sync on thread start using
the thread ack pipe.
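The start-up sequence can be sketched in Python as below. This is an
illustration under stated assumptions, not the perf code: Python has no
detached threads, so a daemon thread stands in for one, the one-byte READY
message and the names worker and start_worker are hypothetical, and the
affinity step is left out.

```python
import os
import signal
import threading

READY = b"R"  # hypothetical one-byte ack message


def worker(ack_wfd):
    # Tell the starter that this thread is up and running.
    os.write(ack_wfd, READY)
    os.close(ack_wfd)


def start_worker():
    ack_rfd, ack_wfd = os.pipe()
    # Block signals before creating the thread: the new thread inherits the
    # mask, so asynchronous signals are left to the main thread.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signal.valid_signals())
    try:
        thread = threading.Thread(target=worker, args=(ack_wfd,), daemon=True)
        thread.start()
    finally:
        # Restore the main thread's original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    # Wait and sync on thread start using the ack pipe.
    ack = os.read(ack_rfd, 1)
    os.close(ack_rfd)
    return ack
```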
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/95784dd9f7c81ee408eab27b50b4c09ad4cf7be6.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Signal the thread to terminate by closing the write fd of the msg pipe.
Receive the THREAD_MSG__READY message as confirmation of the
thread's termination. Stop threads created for parallel trace
streaming prior to processing their stats.
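A minimal Python sketch of this shutdown handshake follows. It is not the
perf code: the one-byte READY value stands in for THREAD_MSG__READY, and
the names worker and run_and_stop are hypothetical. It relies on read()
returning an empty result once the write end of a pipe is closed.

```python
import os
import threading

READY = b"R"  # hypothetical stand-in for THREAD_MSG__READY


def worker(msg_rfd, ack_wfd):
    # read() returns b"" once every write end of the msg pipe is closed.
    while os.read(msg_rfd, 1):
        pass  # a real worker would act on control messages here
    os.close(msg_rfd)
    os.write(ack_wfd, READY)  # confirm termination
    os.close(ack_wfd)


def run_and_stop():
    msg_rfd, msg_wfd = os.pipe()
    ack_rfd, ack_wfd = os.pipe()
    thread = threading.Thread(target=worker, args=(msg_rfd, ack_wfd),
                              daemon=True)
    thread.start()
    os.close(msg_wfd)          # signal the thread to terminate
    ack = os.read(ack_rfd, 1)  # wait for the confirmation
    os.close(ack_rfd)
    return ack
```

Only after the confirmation arrives is it safe to read the per-thread
statistics, which is why the threads are stopped before stats processing.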
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/55ef8cc5ec3a96360660d9dc1763573225325f8c.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce thread local variable and use it for threaded trace streaming.
Use thread affinity mask instead of record affinity mask in affinity
modes. Use evlist__ctlfd_update() to propagate control commands from
thread object to global evlist object to enable evlist__ctlfd_*
functionality. Move waking and sample statistic to struct record_thread
and introduce record__waking function to calculate the total number of
wakes.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/0d127555219991c1dcd6c6bb76b24fa6b78d2932.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce evlist__ctlfd_update() function to propagate external control
commands to global evlist object.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/7df52c9816b13c74897b9e518128b29a391462fe.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce thread specific data object and array of such objects
to store and manage thread local data. Implement functions to
allocate, initialize, finalize and release thread specific data.
Thread local maps and overwrite_maps arrays keep pointers to
mmap buffer objects to serve according to maps thread mask.
Thread local pollfd array keeps event fds connected to mmaps
buffers according to maps thread mask.
Thread control commands are delivered via thread local comm pipes
and ctlfd_pos fd. External control commands (--control option)
are delivered via evlist ctlfd_pos fd and handled by the main
tool thread.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/fc9f74af6f822d9c0fa0e145c3564a760dbe3d4b.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Introduce affinity and mmap thread masks. Thread affinity mask
defines CPUs that a thread is allowed to run on. Thread maps
mask defines mmap data buffers the thread serves to stream
profiling data from.
Reviewed-by: Riccardo Mancini <[email protected]>
Signed-off-by: Alexey Bayduraev <[email protected]>
Tested-by: Jiri Olsa <[email protected]>
Tested-by: Riccardo Mancini <[email protected]>
Acked-by: Andi Kleen <[email protected]>
Acked-by: Namhyung Kim <[email protected]>
Cc: Adrian Hunter <[email protected]>
Cc: Alexander Antonov <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Alexei Budankov <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Jiri Olsa <[email protected]>
Cc: Namhyung Kim <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/9042bf7daf988e17e17e6acbf5d29590bde869cd.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Stats from discarded entries should be omitted.
But a lock class may have both good and bad entries.
If the first entry was bad, we can zero-fill the stats and only add good
stats if any.
The entry can remove the discard state if it finds a good entry later.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The -c or --combine-locks option is to merge lock instances in the
same class into a single entry. It compares the name of the locks
and marks duplicated entries using lock_stat->combined.
# perf lock report
Name                  acquired  contended  avg wait (ns)  total wait (ns)  max wait (ns)  min wait (ns)
rcu_read_lock           251225          0              0                0              0              0
&(ei->i_block_re...       8731          0              0                0              0              0
&sb->s_type->i_l...       8731          0              0                0              0              0
hrtimer_bases.lock        5261          0              0                0              0              0
hrtimer_bases.lock        2626          0              0                0              0              0
hrtimer_bases.lock        1953          0              0                0              0              0
hrtimer_bases.lock        1382          0              0                0              0              0
cpu_hotplug_lock          1350          0              0                0              0              0
hrtimer_bases.lock        1273          0              0                0              0              0
hrtimer_bases.lock        1269          0              0                0              0              0
# perf lock report -c
Name                  acquired  contended  avg wait (ns)  total wait (ns)  max wait (ns)  min wait (ns)
rcu_read_lock           251225          0              0                0              0              0
hrtimer_bases.lock       39450          0              0                0              0              0
&sb->s_type->i_l...      10301          1            662              662            662            662
ptlock_ptr(page)         10173          2            701             1402            760            642
&(ei->i_block_re...       8732          0              0                0              0              0
&xa->xa_lock              8088          0              0                0              0              0
&base->lock               6705          0              0                0              0              0
&p->pi_lock               5549          0              0                0              0              0
&dentry->d_lockr...       5010          4           1274             5097           1844            789
&ep->lock                 3958          0              0                0              0              0
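The merge can be sketched in Python as below. The message only says that
entries are matched by name; how the individual counters are folded
together (sums for counts and total wait, max/min for the extremes, the
minimum taken only from contended entries) is an assumption of this
sketch, and LockStat and combine_locks are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass
class LockStat:
    name: str
    acquired: int
    contended: int
    total_wait: int
    max_wait: int
    min_wait: int


def combine_locks(stats):
    """Merge entries that share a name, keeping first-seen order."""
    combined = {}
    for st in stats:
        first = combined.get(st.name)
        if first is None:
            # Copy so the caller's entries are left untouched.
            combined[st.name] = replace(st)
            continue
        if st.contended:
            # Assumption: only contended entries carry a meaningful minimum.
            if first.contended:
                first.min_wait = min(first.min_wait, st.min_wait)
            else:
                first.min_wait = st.min_wait
        first.acquired += st.acquired
        first.contended += st.contended
        first.total_wait += st.total_wait
        first.max_wait = max(first.max_wait, st.max_wait)
    return list(combined.values())
```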
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The name column is 20 characters wide, so lock names shorter than 20
characters should be printed without an ellipsis.
Before:
# perf lock report
Name                  acquired  contended  avg wait (ns)  total wait (ns)  max wait (ns)  min wait (ns)
rcu_read_lock           251225          0              0                0              0              0
&(ei->i_block_re...       8731          0              0                0              0              0
&sb->s_type->i_l...       8731          0              0                0              0              0
hrtimer_bases.lo...       5261          0              0                0              0              0
hrtimer_bases.lo...       2626          0              0                0              0              0
hrtimer_bases.lo...       1953          0              0                0              0              0
hrtimer_bases.lo...       1382          0              0                0              0              0
cpu_hotplug_lock...       1350          0              0                0              0              0
After:
# perf lock report
Name                  acquired  contended  avg wait (ns)  total wait (ns)  max wait (ns)  min wait (ns)
rcu_read_lock           251225          0              0                0              0              0
&(ei->i_block_re...       8731          0              0                0              0              0
&sb->s_type->i_l...       8731          0              0                0              0              0
hrtimer_bases.lock        5261          0              0                0              0              0
hrtimer_bases.lock        2626          0              0                0              0              0
hrtimer_bases.lock        1953          0              0                0              0              0
hrtimer_bases.lock        1382          0              0                0              0              0
cpu_hotplug_lock          1350          0              0                0              0              0
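The intended rule can be sketched in Python as below. The 20-character
width comes from the description above; the 16-character cut is inferred
from the sample output (16 characters followed by "..."), and
format_lock_name is a hypothetical name.

```python
NAME_WIDTH = 20  # column width stated in the message
CUT_LEN = 16     # inferred from the sample output: 16 characters plus "..."


def format_lock_name(name):
    """Return the name as printed: whole if it fits, else cut with "..."."""
    if len(name) < NAME_WIDTH:
        return name
    return name[:CUT_LEN] + "..."
```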
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Instead of the random order, sort it by lock class name.
Before:
# perf lock info -m
Address of instance: name of class
0xffffa0d940ac5310: &dentry->d_lockref.lock
0xffffa0c20b0e1cb0: &dentry->d_lockref.lock
0xffffa0d8e051cc48: &base->lock
0xffffa0d94f992110: &anon_vma->rwsem
0xffffa0d947a4f278: (null)
0xffffa0c208f6e108: &map->lock
0xffffa0c213ad32c8: &cfs_rq->removed.lock
0xffffa0c20d695888: &parent->list_lock
0xffffa0c278775278: (null)
0xffffa0c212ad4690: &dentry->d_lockref.lock
After:
# perf lock info -m
Address of instance: name of class
0xffffa0c20d538800: &(&sig->stats_lock)->lock
0xffffa0c216d4ec40: &(&sig->stats_lock)->lock
0xffffa1fe4cb04610: &(__futex_data.queues)[i].lock
0xffffa1fe4cb07750: &(__futex_data.queues)[i].lock
0xffffa1fe4cb07b50: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0b850: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0bcd0: &(__futex_data.queues)[i].lock
0xffffa1fe4cb0e5d0: &(__futex_data.queues)[i].lock
0xffffa1fe4cb11ad0: &(__futex_data.queues)[i].lock
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
As evsel__intval() returns u64, we can just use it as is.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
The hlist_head has a single entry so we can save some memory.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Likewise, it should use a proper name in case the task runs under
chroot.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
When reading build-id from a DSO, it should consider if it's from a
chroot task. In that case, the path is different so it needs to prepend
the root directory to access the file correctly.
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|
|
Currently perf doesn't handle tasks in chroot properly. As filenames in
MMAP records are relative to the task's root directory, they differ from
what the perf tool can see from outside.
Add filename_with_chroot() helper to deal with those cases. The
function returns a new filename only if it's in a different root
directory. Since it needs to access /proc for the process, it only
works until the task exits.
With this change, I can see symbols in my program like below.
# perf record -o- chroot myroot myprog 3 | perf report -i-
...
#
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  .............................
#
    99.83%  myprog   myprog             [.] loop
     0.04%  chroot   [kernel.kallsyms]  [k] fxregs_fixup
     0.04%  chroot   [kernel.kallsyms]  [k] rsm_load_seg_32
...
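The idea behind the helper can be sketched in Python as below. This is not
the perf implementation: it assumes the task's root directory is found by
reading the /proc/<pid>/root link, and the readlink parameter is a
hypothetical hook added only so the sketch can be exercised without a real
chroot task.

```python
import os


def filename_with_chroot(pid, filename, readlink=os.readlink):
    """Return the path as seen from outside the task's root, or None."""
    try:
        root = readlink(f"/proc/{pid}/root")
    except OSError:
        return None  # task already exited, or /proc is not readable
    if root == "/":
        return None  # same root directory: the filename is usable as-is
    # Prepend the task's root directory to the recorded filename.
    return root.rstrip("/") + "/" + filename.lstrip("/")
```

A None result means the caller keeps using the original filename, which
matches the note that a new filename is returned only for a different root
directory.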
Signed-off-by: Namhyung Kim <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Ian Rogers <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
|