|
Disable CC3/CC7/PC2/PC3/PC6/PC7 for is_jvl() models.
Delete is_jvl() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Disable PC2/PC3/PC7 and enable PC6 for has_slv_msrs() models.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Enable PC7 for has_snb_msrs() models.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Enable PC3/PC6 for platforms with .cst_limit set because package cstates
are guarded by pkg_cstate_limit.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Enable CC7 and PC2 for has_snb_msrs() models.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Enable CC1/CC3/CC6 for platforms with .has_nhm_msrs set.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Add skeleton support for cstate enumeration.
Note that the previous logic may override the cstate setting multiple
times for different reasons. The conversion to the new cstate
enumeration must be done step by step, strictly following the order of
the previous code.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
On some models, the CPU base frequency is different from the TSC
frequency, and the aperf/mperf counters are running at CPU base
frequency instead of TSC frequency.
Abstract support for TSC tweak.
Given that tsc_tweak depends on base_hz, move the code to probe_bclk()
after base_hz is available.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
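The relationship between base_hz and the TSC frequency described above can be sketched as follows. This is a minimal illustration, not the turbostat implementation: the helper name compute_tsc_tweak() and the needs_tweak parameter are hypothetical, and the sketch assumes the tweak is simply the ratio of the two frequencies.

```c
/*
 * Hypothetical helper: return the factor used to rescale TSC deltas so
 * that they are comparable with aperf/mperf deltas counted at base_hz.
 * needs_tweak stands in for a per-platform flag (an assumption here).
 */
double compute_tsc_tweak(int needs_tweak, double base_hz, double tsc_hz)
{
	/* No tweak requested, or no usable TSC frequency: leave deltas alone */
	if (!needs_tweak || tsc_hz <= 0.0)
		return 1.0;

	/* base_hz must already be known, which is why probing comes first */
	return base_hz / tsc_hz;
}
```

For example, a platform with a 1.5 GHz base frequency and a 3 GHz TSC would get a factor of 0.5 under this assumption.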
|
|
RAPL probing can be done without family/model checking. Remove these
parameters in rapl probe functions.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Different hardcoded TDP values are used when the TDP cannot be retrieved
from the hardware.
Abstract hardcoded TDP value.
Delete CPU model checks in get_tdp_intel().
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for fixed Dram domain energy unit.
Delete rapl_dram_energy_units_probe() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
INTEL_FAM6_ATOM_SILVERMONT model needs a divisor to convert the raw
Energy Units value from MSR_RAPL_POWER_UNIT.
Abstract the support for RAPL divisor.
Delete CPU model check in rapl_probe_intel().
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
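A rough sketch of what a per-platform RAPL divisor flag could look like is shown below. The function name and the needs_divisor parameter are hypothetical. The bit layout (Energy Status Units in bits 12:8) and the microjoule scaling for the divisor case reflect my reading of the existing behavior and should be treated as assumptions.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decode the energy unit, in Joules, from the raw
 * MSR_RAPL_POWER_UNIT value.
 * Assumption: the Energy Status Units field (ESU) lives in bits 12:8.
 */
double rapl_energy_unit_joules(uint64_t power_unit_msr, int needs_divisor)
{
	unsigned int esu = (power_unit_msr >> 8) & 0x1F;

	/* Assumed divisor case: the unit is 2^ESU microjoules */
	if (needs_divisor)
		return (double)(1ULL << esu) / 1000000.0;

	/* Common case: the unit is 1 / 2^ESU Joules */
	return 1.0 / (double)(1ULL << esu);
}
```

With this shape, the caller passes a feature flag taken from a platform description instead of comparing the CPU model directly.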
|
|
Abstract the support for Per Core RAPL.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for RAPL MSRs.
Delete CPU model checks in rapl_probe_intel().
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Support for each RAPL domain, as well as support for the perf status of
each RAPL domain, can be detected by checking the availability of the
corresponding RAPL MSRs.
Change the code accordingly and remove the hardcoded logic for each
model.
Note that this also fixes the INTEL_FAM6_ATOM_TREMONT model, which has
RAPL_PKG_PERF_STATUS and MSR_DRAM_PERF_STATUS but doesn't have BIC_PKG__
and BIC_RAM__ set.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Redefine RAPL macros to make the code more readable.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for hardcoded Crystal Clock frequency, which is
used when crystal clock is not available from CPUID.15.
Delete CPU model checks in process_cpuid().
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for AUTOMATIC_CSTATE_CONVERSION bit in
MSR_PKG_CST_CONFIG_CONTROL.
Delete automatic_cstate_conversion_probe() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for MSR_CORE/GFX/RING_PERF_LIMIT_REASONS MSRs.
Delete perf_limit_reasons_probe() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for different TCC Offset bits in
MSR_IA32_TEMPERATURE_TARGET.
Delete check_tcc_offset() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for MSR_CONFIG_TDP_NOMINAL/LEVEL_1/LEVEL_2/CONTROL
and MSR_TURBO_ACTIVATION_RATIO.
Delete has_config_tdp() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Rename dump_hsw_turbo_ratio_limits() and dump_ivt_turbo_ratio_limits()
to dump_turbo_ratio_limit2() and dump_turbo_ratio_limit1() because they
dump MSR_TURBO_RATIO_LIMIT2 and MSR_TURBO_RATIO_LIMIT1 respectively, and
the MSRs' behavior is consistent when they are available.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for MSR_TURBO_RATIO_LIMIT, MSR_TURBO_RATIO_LIMIT1,
MSR_TURBO_RATIO_LIMIT2, MSR_SECONDARY_TURBO_RATIO_LIMIT,
MSR_ATOM_CORE_RATIOS and MSR_ATOM_CORE_TURBO_RATIOS.
Delete has_turbo_ratio_group_limits(), has_turbo_ratio_limit(),
has_atom_turbo_ratio_limit(), has_ivt_turbo_ratio_limit(),
has_hsw_turbo_ratio_limit(), has_knl_turbo_ratio_limit() and
has_glm_turbo_ratio_limit() CPU model checks.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Rename dump_nhm_platform_info() and dump_nhm_cst_cfg() to
dump_platform_info() and dump_cst_cfg() because these MSRs' behavior is
consistent when they're available.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Platforms with has_msr_misc_pwr_mgmt set are a subset of platforms with
has_nhm_msrs set.
Thus remove the redundant check for platform->has_nhm_msrs.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
MSR_PLATFORM_INFO, MSR_IA32_TEMPERATURE_TARGET, MSR_SMI_COUNT,
MSR_PKG_CST_CONFIG_CONTROL, and the TRL MSRs are always available on
platforms since Nehalem. Support for these MSRs can be described
together.
Abstract the support for these MSRs.
Delete probe_nhm_msrs() CPU model check.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract the support for decoding package cstate limit from
MSR_PKG_CST_CONFIG_CONTROL.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract CPU base clock frequency support.
Note that bclk is used to:
1. calculate base_hz using MSR_PLATFORM_INFO, which is guarded by
probe_nhm_msrs().
2. dump MSR_PLATFORM_INFO and the Turbo Ratio Limit MSRs, which are also
guarded by probe_nhm_msrs().
Thus probe_bclk() works for probe_nhm_msrs() models only.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract MSR_MISC_PWR_MGMT support.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Abstract MSR_MISC_FEATURE_CONTROL support.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Turbostat supports a series of features that may diverge among different
CPU models.
The current code uses various CPU model checks in different places to
handle this, which makes the code hard to maintain.
Add skeleton support for table-driven feature enumeration to replace the
current error-prone CPU model checks and global variables.
Note: comparing the CPU models with intel-family.h shows that turbostat
support is missing for the following four models:
INTEL_FAM6_ICELAKE, INTEL_FAM6_ATOM_SILVERMONT_MID,
INTEL_FAM6_ATOM_AIRMONT_MID and INTEL_FAM6_ATOM_AIRMONT_NP. Adding
support for these models is separate work, so it is not covered in
this patch set.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
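The idea of table-driven feature enumeration can be illustrated with a small, self-contained sketch. Everything below is hypothetical: the struct layout, the flag selection, the EXAMPLE_MODEL_* identifiers and lookup_platform() are made up for illustration and are not taken from the turbostat source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-platform feature flags (names are illustrative) */
struct platform_features {
	bool has_msr_misc_feature_control;
	bool has_msr_misc_pwr_mgmt;
	bool has_nhm_msrs;
};

/* Placeholder model numbers, not real CPU model IDs */
enum { EXAMPLE_MODEL_A = 0x01, EXAMPLE_MODEL_B = 0x02 };

static const struct platform_features example_a_features = {
	.has_msr_misc_pwr_mgmt = true,
	.has_nhm_msrs = true,
};

static const struct platform_features example_b_features = {
	.has_msr_misc_feature_control = true,
	.has_nhm_msrs = true,
};

struct platform_data {
	unsigned int model;
	const struct platform_features *features;
};

/* One row per supported model; a NULL features pointer ends the table */
static const struct platform_data platform_table[] = {
	{ EXAMPLE_MODEL_A, &example_a_features },
	{ EXAMPLE_MODEL_B, &example_b_features },
	{ 0, NULL },
};

/* Return the feature set for a model, or NULL if the model is unknown */
const struct platform_features *lookup_platform(unsigned int model)
{
	for (size_t i = 0; platform_table[i].features; i++)
		if (platform_table[i].model == model)
			return platform_table[i].features;
	return NULL;
}
```

With such a table, a feature probe tests a flag on the looked-up entry once instead of repeating a switch over CPU models in every probe function, and adding a model means adding one row.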
|
|
INTEL_FAM6_ATOM_SILVERMONT_MID/INTEL_FAM6_ATOM_AIRMONT_MID are not
listed in probe_nhm_msrs(). This means that most of the turbostat
features are not available on these two platforms.
Furthermore, checking for these two models in has_slv_msrs() is
dead code, because has_slv_msrs() is only called by code guarded by
probe_nhm_msrs().
For these two reasons, remove the pseudo check for
INTEL_FAM6_ATOM_SILVERMONT_MID and INTEL_FAM6_ATOM_AIRMONT_MID.
Support will be added back when these two platforms can be accessed.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Remove redundant duplicates in intel_model_duplicates().
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
The kernel already has
#define INTEL_FAM6_NEHALEM_G 0x1F /* Auburndale / Havendale */
Use the standard macro for the CPU model instead of the raw value.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
/sys/class/graphics/fb0/device/drm/card0/ and /sys/class/drm/card0/
point to the same device node.
But in some cases, one exists and the other one does not.
Prefer to use /sys/class/drm/card0/, and fall back to
/sys/class/graphics/fb0/device/drm/card0/.
This recovers the "GFXMHz" and "GFXAMHz" columns on some platforms like
a SPR server.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
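The prefer-then-fall-back lookup described above can be sketched with a small helper. This is only an illustration: pick_first_existing() is a hypothetical name, and the sketch merely checks whether each path exists.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the preferred path if it exists, else the
 * fallback if it exists, else NULL.
 */
const char *pick_first_existing(const char *preferred, const char *fallback)
{
	if (access(preferred, F_OK) == 0)
		return preferred;
	if (access(fallback, F_OK) == 0)
		return fallback;
	return NULL;
}
```

A caller would pass "/sys/class/drm/card0/" as the preferred path and "/sys/class/graphics/fb0/device/drm/card0/" as the fallback, and skip the graphics columns when NULL is returned.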
|
|
All Models that duplicate INTEL_FAM6_CANNONLAKE_L support TCC Offset.
Enable this feature on all these models.
Delete obsolete model_orig.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
Currently the C-state Pre-wake setting is not printed because the
probe is never invoked. Invoke the probe function accordingly.
Fixes: aeb01e6d71ff ("tools/power turbostat: Print the C-state Pre-wake settings")
Signed-off-by: Chen Yu <[email protected]>
Reviewed-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
MSR_KNL_CORE_C6_RESIDENCY should be evaluated only if
1. this is a KNL platform
AND
2. C6 residency is needed, or C1 residency needs to be calculated
Fix the broken logic introduced by commit 1e9042b9c8d4 ("tools/power
turbostat: Fix CPU%C1 display value").
Fixes: 1e9042b9c8d4 ("tools/power turbostat: Fix CPU%C1 display value")
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
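The intended condition can be written as a single predicate. This is a sketch of the logic stated above only: the function and parameter names are hypothetical and do not correspond to identifiers in turbostat.

```c
/*
 * Hypothetical predicate: should MSR_KNL_CORE_C6_RESIDENCY be read?
 * is_knl:  non-zero on a KNL platform
 * want_c6: non-zero if C6 residency is requested
 * want_c1: non-zero if C1 residency must be calculated
 */
int need_knl_c6_residency(int is_knl, int want_c6, int want_c1)
{
	/* Both conditions must hold; the second is itself an OR */
	return is_knl && (want_c6 || want_c1);
}
```

Note the parentheses: without them, the AND would bind tighter than the OR and the MSR could be read on non-KNL platforms whenever C1 residency is calculated.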
|
|
On some platforms, turbostat fails at launch time, as shown below:
turbostat version 2023.03.17 - Len Brown <[email protected]>
...
cpu40: MSR_IA32_PACKAGE_THERM_STATUS: 0x884c0000 (24 C)
cpu40: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00000003 (100 C, 100 C)
turbostat: snapshot_sysfs_counter(/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz): No data available
This is because a new uncore sysfs interface is used on these platforms,
as introduced by commit 9b8dea80e3cb ("platform/x86/intel-uncore-freq:
Support for cluster level controls").
With the new uncore sysfs interface,
/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz
is still available, but reading it fails.
How to support the fabric cluster level uncore sysfs is not settled yet.
As a short-term fix, clear the BIC_UNCORE_MHZ bit when the new sysfs
interface is detected.
Signed-off-by: Zhang Rui <[email protected]>
Reviewed-by: Len Brown <[email protected]>
|
|
When exiting abnormally in process mode, install custom SIGINT and SIGTERM
signal handlers to kill the forked child processes.
Before:
# perf bench sched messaging -l 1000000 -g 1 &
[1] 8519
# # Running 'sched/messaging' benchmark:
# pgrep sched-messaging | wc -l
41
# kill -15 8519
[1]+ Terminated perf bench sched messaging -l 1000000 -g 1
# pgrep sched-messaging | wc -l
40
After:
# perf bench sched messaging -l 1000000 -g 1 &
[1] 8472
# # Running 'sched/messaging' benchmark:
# pgrep sched-messaging | wc -l
41
# kill -15 8472
[1]+ Exit 1 perf bench sched messaging -l 1000000 -g 1
# pgrep sched-messaging | wc -l
0
Signed-off-by: Yang Jihong <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[ namhyung: fix a whitespace ]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
process mode
To save the pids of child processes when creating workers:
1. The messaging worker is changed to a `union` type to store either the
thread id or the process pid.
2. Save the child process pid in create_process_worker().
3. Rename `pth_tab` to `work_tab`.
Test result:
# perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 6.744 [sec]
# perf bench sched messaging -t
# Running 'sched/messaging' benchmark:
# 20 sender and receiver threads per group
# 10 groups == 400 threads run
Total time: 5.788 [sec]
Signed-off-by: Yang Jihong <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
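A worker handle that holds either a thread id or a process pid can be sketched with a union, since a given worker is only ever one of the two. The type and helper names below are hypothetical and chosen for illustration.

```c
#include <pthread.h>
#include <sys/types.h>

/* Hypothetical worker handle: exactly one member is valid per run mode */
union example_worker {
	pthread_t thread;	/* valid in thread mode */
	pid_t pid;		/* valid in process mode */
};

/* Record the pid of a forked child in its table slot */
void record_process_worker(union example_worker *slot, pid_t pid)
{
	slot->pid = pid;
}
```

The run mode is global to the benchmark, so no per-entry tag is needed to tell which member is in use; the caller must read only the member that matches the mode.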
|
|
Refactor the create_worker() helper:
1. Modify the return value and use pthread pointer as a parameter to
facilitate value assignment in create_worker().
2. The thread worker creation and process worker creation are abstracted
into independent helpers.
No functional change.
Test result:
# perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 6.332 [sec]
# perf bench sched messaging -t
# Running 'sched/messaging' benchmark:
# 20 sender and receiver threads per group
# 10 groups == 400 threads run
Total time: 5.545 [sec]
Signed-off-by: Yang Jihong <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
Fix several code style issues in sched-messaging:
1. Use one space around "-" and "+" operators.
2. When a long line is broken, the operator is at the end of the line.
Signed-off-by: Yang Jihong <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
Running shellcheck v0.6 on some of the shell scripts throws
the warning below. Example:
In tests/shell/coresight/asm_pure_loop.sh line 14:
DATA="$DATD/perf-$TEST-$DATV.data"
^---^ SC2153: Possible misspelling: DATD may not be assigned, but DATA is.
Here, DATD is exported from "lib/coresight.sh" and this
warning can be ignored. Use "shellcheck disable=" to ignore
this check.
Signed-off-by: Athira Rajeev <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Reviewed-by: Kajol Jain <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
Running shellcheck on stat+shadow_stat.sh generates the warning
below:
In tests/shell/stat+csv_summary.sh line 26:
while read _num _event _run _pct
^--^ SC2034: _num appears unused. Verify use (or export if used externally).
^----^ SC2034: _event appears unused. Verify use (or export if used externally).
^--^ SC2034: _run appears unused. Verify use (or export if used externally).
^--^ SC2034: _pct appears unused. Verify use (or export if used externally).
These variables are intentionally unused since they are
needed to parse through the output. A previous commit used "_"
as a prefix for these throwaway variables, but this
still triggers the warning with shellcheck v0.6. Fix this
by using only "_" instead of a prefix plus variable name.
Signed-off-by: Athira Rajeev <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Reviewed-by: Kajol Jain <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
Running shellcheck on some of the shell scripts throws
the error below:
In tests/shell/coresight/unroll_loop_thread_10.sh line 8:
. "$(dirname $0)"/../lib/coresight.sh
^-- SC1090: Can't follow non-constant source. Use a directive to specify location.
This happens with shellcheck version "0.6.0". Fix the shellcheck
warning for SC1090 by using the "shellcheck source=" directive to state
the location of the sourced files.
Signed-off-by: Athira Rajeev <[email protected]>
Tested-by: Ian Rogers <[email protected]>
Reviewed-by: Kajol Jain <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
There is a spelling mistake in a pr_debug message. Fix it.
(I didn't see this one in the first spell check scan I ran).
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
Dummy events are created with an attribute where the period and freq
are zero. evsel__config will then see the uninitialized values and
initialize them in evsel__default_freq_period. As frequency mode is
used by default, the dummy event would be set to use frequency
mode. This has no effect on the dummy event itself but does cause
unnecessary timers/interrupts. Avoid this overhead by setting the
period to 1 for dummy events.
evlist__add_aux_dummy calls evlist__add_dummy then sets freq=0 and
period=1. This isn't necessary after this change and so the setting is
removed.
From Stephane:
The dummy event is not counting anything. It is used to collect mmap
records and avoid a race condition during the synthesize mmap phase of
perf record. As such, it should not cause any overhead during active
profiling. Yet, it did. Because of a bug the dummy event was
programmed as a sampling event in frequency mode. Events in that mode
incur more kernel overheads because on timer tick, the kernel has to
look at the number of samples for each event and potentially adjust
the sampling period to achieve the desired frequency. The dummy event
was therefore adding a frequency event to task and ctx contexts that
might otherwise not have any, e.g.,
perf record -a -e cpu/event=0x3c,period=10000000/.
On each timer tick, perf_adjust_freq_unthr_context() is invoked and
if ctx->nr_freq is non-zero, the kernel will loop over ALL the
events of the context looking for frequency mode ones. In doing so, it
locks the context and enables/disables the PMU of each hw event. If all
the other events of the context are in period mode, the kernel will have
to traverse the list for nothing, incurring overhead. The overhead is
multiplied by a very large factor when this happens in a guest kernel.
There is no need for the dummy event to be in frequency mode, it does
not count anything and therefore should not cause extra overhead for
no reason.
Fixes: 5bae0250237f ("perf evlist: Introduce perf_evlist__new_dummy constructor")
Reported-by: Stephane Eranian <[email protected]>
Signed-off-by: Ian Rogers <[email protected]>
Acked-by: Adrian Hunter <[email protected]>
Cc: Yang Jihong <[email protected]>
Cc: Kan Liang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
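At the perf_event_attr level, putting a dummy event in period mode amounts to clearing the freq bit and setting a period of 1. The sketch below uses the real UAPI structure and constants from <linux/perf_event.h>, but init_dummy_attr() is a hypothetical helper and this is not how the perf tool itself builds its dummy evsel.

```c
#include <string.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: fill in an attribute for a software dummy event
 * that uses a fixed period instead of frequency mode.
 */
void init_dummy_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_SOFTWARE;
	attr->config = PERF_COUNT_SW_DUMMY;

	/* Period mode: freq stays 0, so sample_period is a plain period */
	attr->freq = 0;
	attr->sample_period = 1;
}
```

Because freq is 0, the event is not counted as a frequency event and so, per the explanation above, does not by itself trigger the per-tick period adjustment.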
|
|
Remove the repeated word "of" in comments.
Signed-off-by: Charles Han <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|
|
This patch addresses review comments that were given for
705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
but didn't make it into the original patch [1][2].
Changes include: a fix for the backend_memory formula, use of standard
metrics when possible, use of #slots, renaming metrics to avoid spaces in
the names, and cleanup.
[1] https://lore.kernel.org/linux-perf-users/[email protected]/
[2] https://lore.kernel.org/linux-perf-users/[email protected]/
Fixes: 705ed549148f ("perf vendor events arm64: Add AmpereOne metrics")
Signed-off-by: Ilkka Koskinen <[email protected]>
Reviewed-by: Ian Rogers <[email protected]>
Cc: James Clark <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Leo Yan <[email protected]>
Cc: Mike Leach <[email protected]>
Cc: Dave Kleikamp <[email protected]>
Cc: John Garry <[email protected]>
Cc: D Scott Phillips <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Namhyung Kim <[email protected]>
|