aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
AgeCommit message (Collapse)AuthorFilesLines
2020-11-03drm/amdgpu: resolved ASD loading issue on siennaJohn Clements1-0/+1
updated fw header v2 parser to set asd fw memory Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 5.9.x
2020-10-21drm/amd/psp: Fix sysfs: cannot create duplicate filenameAndrey Grodzovsky1-1/+2
psp sysfs not cleaned up on driver unload for sienna_cichlid Fixes: ce87c98db428e7 ("drm/amdgpu: Include sienna_cichlid in USBC PD FW support.") Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] # 5.9.x
2020-10-21drm/amdgpu: add rlc iram and dram firmware supportLikun Gao1-0/+6
Support to load RLC iram and dram ucode when RLC firmware struct use v2.2 Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-25drm/amd: Skip not used microcode loading in SRIOVJingwen Chen1-4/+6
smc, sdma, sos, ta and asd fw is not used in SRIOV. Skip them to accelerate sw_init for navi12. v2: skip above fw in SRIOV for vega10 and sienna_cichlid v3: directly skip psp fw loading in SRIOV Signed-off-by: Jingwen Chen <[email protected]> Reviewed-by: Emily.Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-15drm/amdgpu: Include sienna_cichlid in USBC PD FW support.Andrey Grodzovsky1-1/+1
Create sysfs interface also for sienna_cichlid. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-15drm/amdgpu: Update RAS init handlingJohn Clements1-1/+11
Output RAS init status If RAS init fails, teardown RAS context Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-15drm/amdgpu: Avoid accessing HW when suspending SW stateAndrey Grodzovsky1-0/+6
At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some blocks might be even gated and so best is to avoid touching it. v2: Rename in_dpc to more meaningful name Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-26drm/amdgpu: add asd fw check before loading asdTao Zhou1-2/+1
asd is not ready for some ASICs in early stage, and psp->asd_fw is more generic than ASIC name in the check. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Jiansong Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-24drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)Luben Tuikov1-2/+2
Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-24drm/amdgpu: refine codes to avoid reentering GPU recoveryDennis Li1-2/+2
if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-14drm/amdgpu: revert "fix system hang issue during GPU reset"Christian König1-2/+2
The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-14drm/amdgpu: enable RAP TA loadWenhui Sheng1-0/+183
Enable the RAP TA loading path and add RAP test trigger interface. v2: fix potential mem leak issue Signed-off-by: Wenhui Sheng <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm amdgpu: Skip tmr load for SRIOVLiu ChengZhe1-6/+29
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: Liu ChengZhe <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-27drm/amdgpu: fix system hang issue during GPU resetDennis Li1-2/+2
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <[email protected]> Suggested-by: Andrey Grodzovsky <[email protected]> Suggested-by: Christian König <[email protected]> Suggested-by: Felix Kuehling <[email protected]> Suggested-by: Lijo Lazar <[email protected]> Suggested-by: Luben Tukov <[email protected]> Signed-off-by: Dennis Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-21drm/amdgpu: load asd for sienna cichlidJohn Clements1-1/+0
do not abort psp asd load sequence for sienna cichlid Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-15drm/amdgpu: add psp support for navy_flounderJiansong Chen1-2/+6
Currently skip ASD FW loading and ih reroute per sienna_cichlid. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-15drm/amdgpu: correct ta header v2 ucode init start addressJohn Clements1-1/+3
resolve bug calculating fw start address within binary Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-10drm/amdgpu: fix spelling mistake "Falied" -> "Failed"Colin Ian King1-1/+1
There is a spelling mistake in a DRM_ERROR error message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-08drm/amdgpu: updated ta ucode loadingJohn Clements1-0/+99
add support for loading ucode with ta_firmware_header_v2_0 Reviewed-by: Bhawanpreet Lakha <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-08drm/amdgpu: add TMR destory function for pspHuang Rui1-4/+53
TMR is required to be destoried with GFX_CMD_ID_DESTROY_TMR while the system goes to suspend. Otherwise, PSP may return the failure state (0xFFFF007) on Gfx-2-PSP command GFX_CMD_ID_SETUP_TMR after do multiple times suspend/resume. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-08drm/amdgpu: asd function needs to be unloaded in suspend phaseHuang Rui1-0/+6
Unload ASD function in suspend phase. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu: call release_firmware() without a NULL checkNirmoy Das1-4/+2
The release_firmware() function is NULL tolerant so we do not need to check for NULL param before calling it. Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu: label internally used symbols as staticNirmoy Das1-1/+1
Used sparse(make C=1) to find these loose ends. v2: removed unwanted extra line Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu/sriov: Add clear vf fw supportEmily Deng1-2/+35
Guest VM issue the PSP clear_vf_fw command at 2 points: 1.On VF driver loading, after VF message PSP to setup rings, the next command is “clear_vf_fw” 2.On VF driver unload before VF message to destroy rings Signed-off-by: Emily Deng <[email protected]> Ack-by: Monk.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu/psp: support for loading PSP SPL fwLikun Gao1-0/+8
Add support for loading SPL firmware. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu/psp: initialization PSP SPL fwLikun Gao1-0/+13
Support for psp firmware header version v1_3 initialization and information print. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu: only send one sdma firmware for sienna_cichlidLikun Gao1-0/+9
As all four sdma firmware are same, PSP only receive one SDMA fw. Signed-off-by: Likun Gao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu/psp: convert amdgpu mes ucode typeJack Xiao1-0/+6
Convert to psp defined ucode item, so that psp can recognize them. Signed-off-by: Jack Xiao <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu/psp: add psp support for sienna_cichlidLikun Gao1-0/+1
Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Jack Xiao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-01drm/amdgpu: skip ASD fw load for sienna_cichlidLikun Gao1-1/+1
Skip ASD FW load for sienna_cichlid currently. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-28drm/amdgpu: change memory training to common functionLikun Gao1-2/+40
Change memory training init and finit a common function, as it only have software behavior do not relay on the IP version of PSP. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-21drm/amdgpu: add condition to set MP1 state on gpu resetLikun Gao1-1/+2
Only ras supportted need to set MP1 state to prepare for unload before reloading SMU FW. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-08drm/amdgpu: switch to common rlc_autoload helperHawking Zhang1-1/+1
drop IP specific psp function for rlc autoload since the autoload_supported was introduced to mark ASICs that support rlc_autoload Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-08drm/amdgpu: switch to common ras ta helperHawking Zhang1-0/+27
TRIGGER_ERROR is common ras ta command for all the ASICs that support RAS feature. switch to common helper to avoid duplicate implementation per IP generation Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-08drm/amdgpu: switch to common xgmi ta helpersHawking Zhang1-0/+115
get_hive_id/get_node_id/get_topology_info/set_topology_info are common xgmi command supported by TA for all the ASICs that support xgmi link. They should be implemented as common helper functions to avoid duplicated code per IP generation Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-05-07drm/amdgpu: Fix bug in RAS invokeJohn Clements1-3/+3
Invoke sequence should abort when ras interrupt is detected before reading TA host shared memory Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-30drm/amdgpu: update RAS sequence to parse TA flagsJohn Clements1-1/+28
RAS TA shall notify driver with flags of error specifics Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-23drm/amdgpu: add helper function to init sos ucodeHawking Zhang1-0/+70
driver already had psp_firmware_header struture to deal with different layout of sos ucode. the sos micorcode initialization could be common one. Signed-off-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-23drm/amdgpu: add helper function to init asd ucodeHawking Zhang1-0/+36
asd is unified ucode across asic. it is not necessary to keep its software structure to be ip specific one Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-23drm/amdgpu: retire unused check_fw_loading statusHawking Zhang1-34/+0
The driver can't access UCODE_DATA/ADDR registers on production boards. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-23drm/amdgpu: retire support_vmr_ring interfaceHawking Zhang1-1/+1
vmr ring is dedicated for sriov vf (i.e.guest driver in sriov), which is general communication interface between driver and psp fw accross all ip version. it is not correct to make it as ip specific callback. it is even worse to check specific tOS version per IP version (like psp_v11/v12). Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-22drm/amdgpu: set mp1 state before reloadJohn Clements1-1/+10
Set MP1 state to prepare for unload before reloading SMU FW Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-22drm/amdgpu: update psp fw loading sequenceJohn Clements1-49/+77
Added dedicated function to check if particular fw should be skipped from loading. Added dedicated function for SMU FW loading via PSP Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-09drm/amdgpu/psp: dont warn on missing optional TA'sAlex Deucher1-3/+3
Replace dev_warn() with dev_info() and note that they are optional to avoid confusing users. The RAS TAs only exist on server boards and the HDCP and DTM TAs only exist on client boards. They are optional either way. Acked-by: Nirmoy Das <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-03drm/amd/display: Guard calls to hdcp_ta and dtm_taBhawanpreet Lakha1-0/+2
[Why] The buffer used when calling psp is a shared buffer. If we have multiple calls at the same time we can overwrite the buffer. [How] Add mutex to guard the shared buffer. Signed-off-by: Bhawanpreet Lakha <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-01drm/amdgpu: Ignore the not supported error from pspEmily Deng1-1/+5
As the VCN firmware will not use vf vmr now. And new psp policy won't support set tmr now. For driver compatible issue, ignore the not support error. Signed-off-by: Emily Deng <[email protected]> Reviewed-by: Monk Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-25Revert "drm/amdgpu: add CAP fw loading"Zhigang Luo1-8/+1
This reverts commit 29e2501f8a64fa2fa8f6fe4be53cce5a5a4fe79f. Signed-off-by: Zhigang Luo <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-19drm/amdgpu: add CAP fw loadingZhigang Luo1-1/+8
The CAP fw is for enabling driver compatibility. Currently, it only enabled for vega10 VF. Signed-off-by: Zhigang Luo <[email protected]> Reviewed-by: Shaoyun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-13drm/amdgpu: resolve failed error inject msgJohn Clements1-2/+4
invoking an error injection successfully will cause an at_event intterrupt that will occur before the invoke sequence can complete causing an invalid error Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-06drm/amdgpu: Fix GPU reset error.Andrey Grodzovsky1-11/+17
Problem: During GU reset PSP's sysfs was being wrongly reinitilized during call to amdgpu_device_ip_late_init which was failing with duplicate error. Fix: Move psp_sysfs_init to psp_sw_init to avoid this. Add guards in sysfs file's read and write hook agains premature call if PSP is not finished initialization. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>