aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-04-28drm/amdkfd: Put ASIC revision into HSA capabilityJoseph Greathouse4-1/+16
In order to surface the ASIC revision to user level, we want to put it into the HSA topology. This can be because different ASIC revisions may require user-level software to do different things (e.g. patch code for things that are changed in later hardware revisions). The ASIC revision from the hardware is maximum of 4 bits at this time, so put it into 4 of the open bits in the HSA capability. Then user-level software can use this capability information to know -- for each ASIC -- what revision-based things must be done. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amd/display: remove conversion to bool in dc_link_ddc.cJason Yan1-1/+1
The '>' expression itself is bool, no need to convert it to bool again. This fixes the following coccicheck warning: drivers/gpu/drm/amd/display/dc/core/dc_link_ddc.c:602:28-33: WARNING: conversion to bool not needed here Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amd/display: remove conversion to bool in dcn20_mpc.cJason Yan1-1/+1
The '==' expression itself is bool, no need to convert it to bool again. This fixes the following coccicheck warning: drivers/gpu/drm/amd/display/dc/dcn20/dcn20_mpc.c:455:70-75: WARNING: conversion to bool not needed here Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: remove conversion to bool in amdgpu_device.cJason Yan1-1/+1
The '>' expression itself is bool, no need to convert it to bool again. This fixes the following coccicheck warning: drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3004:68-73: WARNING: conversion to bool not needed here Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: decouple EccErrCnt query and clear operationGuchun Chen1-4/+79
Due to hardware bug that when RSMU UMC index is disabled, clear EccErrCnt at the first UMC instance will clean up all other EccErrCnt registes from other instances at the same time. This will break the correctable error count log in EccErrCnt register once querying it. So decouple both to make error count query workable. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: switch to SMN interface to operate RSMU index modeGuchun Chen1-5/+24
This makes consistent with other regsiters' access in this module. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: drop redundant cg/pg ungate on runpm enterEvan Quan1-3/+0
CG/PG ungate is already performed in ip_suspend_phase1. Otherwise, the CG/PG ungate will be performed twice. That will cause gfxoff disablement is performed twice also on runpm enter while gfxoff enablemnt once on rump exit. That will put gfxoff into disabled state. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: move kfd suspend after ip_suspend_phase1Evan Quan1-2/+2
This sequence change should be safe as what did in ip_suspend_phase1 is to suspend DCE only. And this is a prerequisite for coming redundant cg/pg ungate dropping. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: sw pstate switch should only be for vega20Jonathan Kim1-1/+3
Driver steered p-state switching is designed for Vega20 only. Also simplify early return for temporary disable due to SMU FW bug. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-27drm/amdgpu: Remove unneeded semicolonZheng Bin1-1/+1
Fixes coccicheck warning: drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2534:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Zheng Bin <zhengbin13@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2020-04-24amdgpu/dc: remove redundant assignment to variable 'option'Colin Ian King1-2/+1
The variable option is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu/gmc: Use consistent variable on unlocksColin Ian King2-2/+2
Currently the error returns paths are unlocking lock kiq->ring_lock however it seems this should be dev->gfx.kiq.ring_lock as this is the lock that is being locked and unlocked around the ring operations. This looks like a bug, but it's not. The kiq is just a local variable pointing to the same structure. Make it consistent. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amd/display: remove redundant assignment to variable retColin Ian King1-1/+1
The variable ret is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: protect ring overrunYintian Tao7-12/+61
Wait for the oldest sequence on the ring to be signaled in order to make sure there will be no command overrun. v2: fix coding stype and remove abs operation v3: remove the initialization of variable r Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: extent threshold of waiting FLR_COMPLETEMonk Liu2-2/+2
to 5s to satisfy WHOLE GPU reset which need 3+ seconds to finish Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: for nv12 always need smu ipMonk Liu1-2/+1
because nv12 SRIOV support one vf mode Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: skip sysfs node not belong to one vf modeMonk Liu1-20/+28
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: enable one vf mode for nv12Monk Liu3-15/+52
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: clear the messed up checking logicMonk Liu1-8/+3
for ARCTURUS+ ASICS, we always support SW_SMU for bare-metal and for SRIOV one_vf_mode Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: provide RREG32_SOC15_NO_KIQ, will be used laterMonk Liu1-0/+3
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: sriov is forbidden to call disable DPMMonk Liu1-0/+3
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: skip cg/pg set for SRIOVMonk Liu1-0/+7
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-24drm/amdgpu: ignore TA ucode for SRIOVMonk Liu1-0/+2
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Yintian Tao <yttao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: retire legacy vega10 sos version checkHawking Zhang1-27/+1
retired those early sos version used in vega10 bring up phase Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: switch to helper function to init sos ucodeHawking Zhang2-84/+6
call common helper function to init sos ucode, instead of duplicate codes per ip version Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: add helper function to init sos ucodeHawking Zhang2-0/+72
driver already had psp_firmware_header struture to deal with different layout of sos ucode. the sos micorcode initialization could be common one. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: switch to helper function to init asd ucodeHawking Zhang4-75/+5
call common helper function to initialize asd ucode Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: add helper function to init asd ucodeHawking Zhang2-0/+38
asd is unified ucode across asic. it is not necessary to keep its software structure to be ip specific one Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: retire unused check_fw_loading statusHawking Zhang6-544/+0
The driver can't access UCODE_DATA/ADDR registers on production boards. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: remove unnecessary tOS version checkHawking Zhang3-16/+4
tOS version is available through debugfs interface Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: retire support_vmr_ring interfaceHawking Zhang5-66/+30
vmr ring is dedicated for sriov vf (i.e.guest driver in sriov), which is general communication interface between driver and psp fw accross all ip version. it is not correct to make it as ip specific callback. it is even worse to check specific tOS version per IP version (like psp_v11/v12). Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: shrink critical section in amdgpu_amdkfd_gpuvm_free_memory_of_gpuBernard Zhao1-7/+7
Reduce the mem->lock`s protected code area, no need to protect pr_debug. This also simplifies error handling. Signed-off-by: Bernard Zhao <bernard@vivo.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: Init data to avoid oops while reading pp_num_states.limingyu1-1/+4
For chip like CHIP_OLAND with si enabled(amdgpu.si_support=1), the amdgpu will expose pp_num_states to the /sys directory. In this moment, read the pp_num_states file will excute the amdgpu_get_pp_num_states func. In our case, the data hasn't been initialized, so the kernel will access some ilegal address, trigger the segmentfault and system will reboot soon: uos@uos-PC:~$ cat /sys/devices/pci0000\:00/0000\:00\:00.0/0000\:01\:00 .0/pp_num_states Message from syslogd@uos-PC at Apr 22 09:26:20 ... kernel:[ 82.154129] Internal error: Oops: 96000004 [#1] SMP This patch aims to fix this problem, avoid that reading file triggers the kernel sementfault. Signed-off-by: limingyu <limingyu@uniontech.com> Signed-off-by: zhoubinbin <zhoubinbin@uniontech.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: remove set but not used variable 'priority'YueHaibing1-2/+0
drivers/gpu/drm/amd/amdgpu/amdgpu_job.c: In function amdgpu_job_submit: drivers/gpu/drm/amd/amdgpu/amdgpu_job.c:148:26: warning: variable priority set but not used [-Wunused-but-set-variable] commit 33abcb1f5a17 ("drm/amdgpu: set compute queue priority at mqd_init") left behind this, remove it. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm: amdgpu: fix kernel-doc struct warningRandy Dunlap1-1/+1
Fix a kernel-doc warning of missing struct field desription: ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or member 'vm' not described in 'amdgpu_vm_eviction_lock' Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclaim fs while eviction lock") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Signed-off-by: Alex Sierra <alex.sierra@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: David (ChunMing) Zhou <David1.Zhou@amd.com> Cc: amd-gfx@lists.freedesktop.org Reviewed-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm: amd/display: fix Kconfig help textRandy Dunlap1-6/+2
Fix help text: indent one tab + 2 spaces; end a sentence with a period; and collapse short lines of text to one line. Fixes: 23c61b4599c4 ("drm/amd: Fix Kconfig indentation") Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Krzysztof Kozlowski <krzk@kernel.org> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-23drm/amdgpu: request reg_val_offs each kiq read regYintian Tao7-29/+41
According to the current kiq read register method, there will be race condition when using KIQ to read register if multiple clients want to read at same time just like the expample below: 1. client-A start to read REG-0 throguh KIQ 2. client-A poll the seqno-0 3. client-B start to read REG-1 through KIQ 4. client-B poll the seqno-1 5. the kiq complete these two read operation 6. client-A to read the register at the wb buffer and get REG-1 value Therefore, use amdgpu_device_wb_get() to request reg_val_offs for each kiq read register. v2: fix the error remove v3: fix the print typo v4: remove unused variables Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: change how we update mmRLC_SPM_MC_CNTLChristian König3-8/+28
In pp_one_vf mode avoid the extra overhead and read/write the registers without the KIQ. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Yintian Tao <yintian.tao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: set error query ready after all IPs late initDennis Li2-5/+3
If set error query ready in amdgpu_ras_late_init, which will cause some IP blocks aren't initialized, but their error query is ready. Signed-off-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: code cleanup around gpu resetEvan Quan1-10/+4
Make code more readable. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: optimize the gpu reset for XGMI setup V2Evan Quan1-51/+25
This is basically just some code cosmetic. The current design for XGMI setup gput reset is to operate on current device(adev) first and then on other devices from the hive(by another 'for' loop). But actually we can do some sort to the device list(to put current device 1st position) and handle all the devices in a single 'for' loop. V2: added missing hive->hive_lock protection Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: correct cancel_delayed_work_sync on gpu resetEvan Quan1-0/+2
As for XGMI setup, it should be performed on other devices from the hive also. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: correct fbdev suspend on gpu resetEvan Quan1-1/+1
As for XGMI setup, it needs to be performed on all the devices from the same hive. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: cleanup coding style in amdkfd a bitBernard Zhao1-11/+9
Make the code a bit more readable by using a common error handling pattern. Signed-off-by: Bernard Zhao <bernard@vivo.com> Reviewed-by: Christian König <christian.koenig@amd.com>. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: clean up unused variable about ring lruKevin Wang2-6/+0
clean up unused variable: 1. ring_lru_list 2. ring_lru_list_lock related-commit: drm/amdgpu: remove ring lru handling Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: replace DRM prefix with PCI device info for gfx/mmhubDennis Li2-16/+31
Prefix RAS message printing in gfx/mmhub with PCI device info, which assists the debug in multiple GPU case. Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amd/powerplay: limit smu support to Arcturus for onevfJiansong Chen1-1/+4
Under onevf mode the smu support to other chips is not well verified yet. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: disble vblank when unloading sriov driverJiawei1-1/+2
disble vblank in dce_vitual_crtc_commit(), which is skipped under sriov before Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Jiawei <Jiawei.Gu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amdgpu: Print CU information by default during initializationYong Zhao1-1/+2
This is convenient for multiple teams to obtain the information. Also, add device info by using dev_info(). Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-22drm/amd/powerplay: update smu12_driver_if.h to align with pmfwPrike Liang1-15/+25
Update the smu12_driver_if.h header to follow the pmfw release. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>