| Age | Commit message (Collapse) | Author | Files | Lines |
|
The target is to provide a clear entry point(for power routines).
Also this can help to maintain a clear view about the frameworks
used on different ASICs. Hopefully all these can make power part
more friendly to play with.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As these are already defined in amdgpu_atombios.h. Otherwise, we may
hit "redefined" compile warning.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Feifei Xu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Suppress the warning below:
In file included from drivers/gpu/drm/amd/amdgpu/../powerplay/smu_cmn.c:
>> drivers/gpu/drm/amd/powerplay/smu_cmn.c:485:9: warning: Identical condition 'ret', second condition is always false [identicalConditionAfterEarlyExit]
return ret;
^
drivers/gpu/drm/amd/powerplay/smu_cmn.c:477:6: note: first condition
if (ret)
^
drivers/gpu/drm/amd/powerplay/smu_cmn.c:485:9: note: second condition
return ret;
^
Signed-off-by: Evan Quan <[email protected]>
Reported-by: kernel test robot <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The whole approach wasn't thought through till the end.
We already had a reset lock like this in the past and it caused the same problems like this one.
Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.
This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Support Sienna Cichlid mgpu fan boost enablement.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Support Navi1X mgpu fan boost enablement.
V2: rich the comment and correct the revision id check
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable mgpu fan boost feature on swSMU routines.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
when function arcturus_get_smu_metrics_data() call failed,
it will cause the variable "throttler_status" isn't initialized before use.
warning:
powerplay/arcturus_ppt.c:2268:24: warning: ‘throttler_status’ may be used uninitialized in this function [-Wmaybe-uninitialized]
2268 | if (throttler_status & logging_label[throttler_idx].feature_mask) {
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To fit the latest SMU firmware.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of having one copy in each ASIC.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
To make the setting same as Arcturus/Navi1x/Sienna_Cichlid.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The UVD/VCE PG state is managed by UVD and VCE IP. It's error-prone to
assume the bootup state in SMU based on the dpm status.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Correct the cached smu feature state on pp_features sysfs
setting.
Signed-off-by: Evan Quan <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reproducing bug report here:
After hibernating and resuming, DPM is not enabled. This remains the case
even if you test hibernate using the steps here:
https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html
I debugged the problem, and figured out that in the file hardwaremanager.c,
in the function, phm_enable_dynamic_state_management(), the check
'if (!hwmgr->pp_one_vf && smum_is_dpm_running(hwmgr) && !amdgpu_passthrough(adev) && adev->in_suspend)'
returns true for the hibernate case, and false for the suspend case.
This means that for the hibernate case, the AMDGPU driver doesn't enable DPM
(even though it should) and simply returns from that function.
In the suspend case, it goes ahead and enables DPM, even though it doesn't need to.
I debugged further, and found out that in the case of suspend, for the
CIK/Hawaii GPUs, smum_is_dpm_running(hwmgr) returns false, while in the case of
hibernate, smum_is_dpm_running(hwmgr) returns true.
For CIK, the ci_is_dpm_running() function calls the ci_is_smc_ram_running() function,
which is ultimately used to determine if DPM is currently enabled or not,
and this seems to provide the wrong answer.
I've changed the ci_is_dpm_running() function to instead use the same method that
some other AMD GPU chips do (e.g Fiji), which seems to read the voltage controller.
I've tested on my R9 390 and it seems to work correctly for both suspend and
hibernate use cases, and has been stable so far.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=208839
Signed-off-by: Sandeep Raghuraman <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As VCN related dpm table setup needs VCN be in PG ungate state. Same logics
applies to JPEG.
V2: fix paste typo
V3: code cosmetic
Signed-off-by: Evan Quan <[email protected]>
Tested-by: Matt Coffin <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add lock protections and avoid unnecessary actions
if the PG state is already the same as required.
Signed-off-by: Evan Quan <[email protected]>
Tested-by: Matt Coffin <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Update drive if file for sienna_cichlid.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
As for the gpu metric export, metrics cache makes no sense. It's up to
user to decide how often the metrics should be retrieved.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Vega12 gpu metrics export interface.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Vega20 gpu metrics export interface.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable gpu_metrics support on legacy powerplay routines.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Renoir gpu metrics export interface.
V2: use memcpy to make code more compact
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Sienna Cichlid gpu metrics export interface.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Navi1x gpu metrics export interface.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Although it does not bring any problem for now, the coming gpu
metrics interface needs to handle them differently based on the
asic type.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add Arcturus gpu metrics export interface.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This will be shared around all SMU V11 asics.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A new interface for UMD to retrieve gpu metrics data.
V2: rich the documentation
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
For Arcturus, the softmin/max settings from driver are permitted on the
latest(54.26 later) SMU firmware. Thus enabling them in driver.
Signed-off-by: Evan Quan <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The i2c init/fini functions just register the i2c adapter.
There is no need to call them during hw init/fini. They only
need to be called once per driver init/fini. The previous
behavior broke runtime pm because we unregistered the i2c
adapter during suspend.
Tested-by: Tom St Denis <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable GFXOFF for navy_flounder.
Signed-off-by: Jiansong Chen <[email protected]>
Reviewed-by: Likun Gao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Remove casting the values returned by memory allocation function.
Coccinelle emits WARNING:
./drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c:893:37-46: WARNING: casting value returned by memory allocation function to (PPTable_t *) is useless.
Signed-off-by: Li Heng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's in accordance with pmfw 65.5.0 for navy_flounder.
Signed-off-by: Jiansong Chen <[email protected]>
Reviewed-by: Tao Zhou <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These tables have _COUNT number of elements so the comparisons should be
>= instead of > to prevent reading one element beyond the end of the
array.
Fixes: 8264ee69f0d8 ("drm/amd/powerplay: drop unused code")
Signed-off-by: Dan Carpenter <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
1. allow asic to handle sensor type by itself.
2. if not, use smu common sensor to handle it.
Signed-off-by: Kevin Wang <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Update sienna_cichlid driver if header and related files.
Support new smu metrics for pre & postDS frequency.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Take back patch:drop unnecessary message support check
Because the gpu reset fail problem on renoir can be fixed by:
drm/amd/powerplay: skip invalid msg when smu set mp1 state
It needs to remove SWSMU_CODE_LAYER_L1 in smu_cmn.h to guard a clear
code layer.
Signed-off-by: changfeng <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add support for reporting thermal throttling events through SMI.
Also, add a counter to count the number of throttling interrupts
observed and report the count in the SMI event message.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.
During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.
v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.
v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;
[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249] dump_stack+0x98/0xd5
[ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833] free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831] ksys_ioctl+0x98/0xb0
[ 1230.204004] __x64_sys_ioctl+0x1a/0x20
[ 1230.205174] do_syscall_64+0x5f/0x250
[ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe
2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.
v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset
v5:
1. Fix some style issues.
Reviewed-by: Hawking Zhang <[email protected]>
Suggested-by: Andrey Grodzovsky <[email protected]>
Suggested-by: Christian König <[email protected]>
Suggested-by: Felix Kuehling <[email protected]>
Suggested-by: Lijo Lazar <[email protected]>
Suggested-by: Luben Tukov <[email protected]>
Signed-off-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Set valid_in_vf to false for the message not support in vf mode on
sienna cichlid.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Mapping Mode1Reset message for sienna_cichlid.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Some asic may not support for some message of set mp1 state.
If the return value of smu_send_smc_msg is -EINVAL, that means it failed
to send msg to smc as it can not map an valid message for the ASIC. And
with that case, smu_set_mp1_state should be skipped as those ASIC was in
fact do not support for that.
Signed-off-by: Likun Gao <[email protected]>
Reviewed-by: Evan Quan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It's not necessary to retrieve the power features status when the
asic is booted up the first time. This patch can have the features
enablement status still checked in suspend/resume case and removed
from the first boot up sequence.
Signed-off-by: Kenneth Feng <[email protected]>
Reviewed-by: Kevin Wang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The below 3 messages are not supported on Renoir
SMU_MSG_PrepareMp1ForShutdown
SMU_MSG_PrepareMp1ForUnload
SMU_MSG_PrepareMp1ForReset
It needs to revert patch:
drm/amd/powerplay: drop unnecessary message support check
to avoid set mp1 state fail during gpu reset on renoir.
Signed-off-by: changfeng <[email protected]>
Reviewed-by: Kenneth Feng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Enable SMU i2c bus access for sienna_cichlid asics.
v2: change callback name
Reviewed-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|