aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2022-04-06drm/amdgpu: Sync up header and implementation to use the same parameter namesMa Jun1-2/+2
Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Ma Jun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-06drm/amdgpu: fix incorrect GCR_GENERAL_CNTL addressRuili Ji1-3/+3
gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <[email protected]> Acked-by: Yifan Zhang <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Signed-off-by: Ruili Ji <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-04-06drm/amd/vcn: fix an error msg on vcn 3.0tiancyin1-1/+1
Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <[email protected]> Signed-off-by: tiancyin <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-04-06drm/amdgpu/vcn3: send smu interface typeBoyuan Zhang2-0/+12
For VCN FW to detect ASIC type, in order to use different mailbox registers. V2: simplify codes and fix format issue. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu/gfx10: enable gfx1037 clock counter retrieval functionPrike Liang1-0/+1
Enable gfx1037 clock counter retrieval function for KFDPerfCountersTest.ClockCountersBasicTest. Signed-off-by: Prike Liang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: set noretry for gfx 10.3.7Prike Liang1-0/+1
Disable xnack on the gfx10.3.7 for the KFD test. Signed-off-by: Prike Liang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: set noretry=1 for GFX 10.3.4Felix Kuehling1-2/+3
Retry faults are not supported on GFX 10.3.4. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: set noretry=1 for gc 10.3.6Yifan Zhang1-0/+1
this patch to set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: add more cases to noretry=1Alex Deucher1-0/+3
Port current list from amd-staging-drm-next. Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu/vcn: improve vcn dpg stop procedureTianci Yin1-0/+3
Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Tianci Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdkfd: Fix Incorrect VMIDs passed to HWSTushar Patel1-1/+1
Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu/vcn: Fix the register setting for vcn1Emily Deng1-2/+2
Correct the code error for setting register UVD_GFX10_ADDR_CONFIG. Need to use inst_idx, or it only will set VCN0. Signed-off-by: Emily Deng <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-03-25drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVENLang Yu1-0/+71
It is a hardware issue that VCN can't handle a GTT backing stored TMZ buffer on CHIP_RAVEN series ASIC. Move such a TMZ buffer to VRAM domain before command submission as a workaround. v2: - Use patch_cs_in_place callback. v3: - Bail out early if unsecure IBs. Suggested-by: Christian König <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-03-25drm/amdgpu/gmc: use PCI BARs for APUs in passthroughAlex Deucher5-7/+8
If the GPU is passed through to a guest VM, use the PCI BAR for CPU FB access rather than the physical address of carve out. The physical address is not valid in a guest. v2: Fix HDP handing as suggested by Michel Reviewed-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire()Dan Carpenter1-1/+1
This post-op should be a pre-op so that we do not pass -1 as the bit number to test_bit(). The current code will loop downwards from 63 to -1. After changing to a pre-op, it loops from 63 to 0. Fixes: 71c37505e7ea ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: conduct a proper cleanup of PDB boGuchun Chen1-1/+1
Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to perform a proper cleanup of PDB bo. v2: update subject to be more accurate Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdgpu: prevent memory wipe in suspend/shutdown stageGuchun Chen1-1/+3
On GPUs with RAS enabled, below call trace is observed when suspending or shutting down device. The cause is we have enabled memory wipe flag for BOs on such GPUs by default, and such BOs will go to memory wipe by amdgpu_fill_buffer, however, because ring is off already, it fails to clean up the memory and throw this error message. So add a suspend/shutdown check before wipping memory. [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. v2: fix coding style issue Fixes: fc6ea4bee13071 ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-22drm/amd: Add USBC connector IDAurabindo Pillai1-0/+1
[Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Use drm_mode_copy()Ville Syrjälä1-2/+2
struct drm_display_mode embeds a list head, so overwriting the full struct with another one will corrupt the list (if the destination mode is on a list). Use drm_mode_copy() instead which explicitly preserves the list head of the destination mode. Even if we know the destination mode is not on any list using drm_mode_copy() seems decent as it sets a good example. Bad examples of not using it might eventually get copied into code where preserving the list head actually matters. Obviously one case not covered here is when the mode itself is embedded in a larger structure and the whole structure is copied. But if we are careful when copying into modes embedded in structures I think we can be a little more reassured that bogus list heads haven't been propagated in. @is_mode_copy@ @@ drm_mode_copy(...) { ... } @depends on !is_mode_copy@ struct drm_display_mode *mode; expression E, S; @@ ( - *mode = E + drm_mode_copy(mode, &E) | - memcpy(mode, E, S) + drm_mode_copy(mode, E) ) @depends on !is_mode_copy@ struct drm_display_mode mode; expression E; @@ ( - mode = E + drm_mode_copy(&mode, &E) | - memcpy(&mode, E, S) + drm_mode_copy(&mode, E) ) @@ struct drm_display_mode *mode; @@ - &*mode + mode Cc: Alex Deucher <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Leo Li <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: [email protected] Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Use ternary operator in `vcn_v1_0_start()`Paul Menzel1-7/+2
Remove the boilerplate of declaring a variable and using an if else statement by using the ternary operator. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Paul Menzel <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: fix typos in commentsJulia Lawall1-2/+2
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Add stolen reserved memory for MI25 SRIOV.Yongqiang Sun1-0/+9
MI25 SRIOV guest driver loading failed due to allocated memory overlaps with firmware reserved area. Allocate stolen reserved memory for MI25 SRIOV specifically to avoid the memory overlap. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yongqiang Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations.Yongqiang Sun3-21/+13
Some ASICs need reserved memory for firmware or other components, which is not allowed to be used by driver. amdgpu_gmc_get_reserved_allocation is to handle additional areas. To avoid any missing calling, merged amdgpu_gmc_get_reserved_allocation to amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yongqiang Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu/vcn: fix vcn ring test failure in igt reload testTianci Yin1-0/+2
[why] On Renoir, vcn ring test failed on the second time insmod in the reload test. After investigation, it proves that vcn only can disable dpg under dpg unpause mode (dpg unpause mode is default for dec only, dpg pause mode is for dec/enc). [how] unpause dpg in dpg stopping procedure. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Tianci Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: only allow secure submission on rings which support thatLang Yu13-2/+19
Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment. Suggested-by: Christian König <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: fixed the warnings reported by kernel test robotyipechai3-62/+2
The reported warnings are as follows: 1.warning:no-previous-prototype-for-amdgpu_hdp_ras_fini. 2.warning:no-previous-prototype-for-amdgpu_mmhub_ras_fini. Amdgpu_hdp_ras_fini and amdgpu_mmhub_ras_fini are unused in the code, they are the only functions in amdgpu_hdp.c and amdgpu_mmhub.c. After removing these two functions, both amdgpu_hdp.c and amdgpu_mmhub.c are empty, so these two files can be deleted to fix the warning. Signed-off-by: yipechai <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Move reset domain init before calling RREG32Philip Yang1-9/+9
amdgpu_detect_virtualization reads register, amdgpu_device_rreg access adev->reset_domain->sem if kernel defined CONFIG_LOCKDEP, below is the random boot hang backtrace on Vega10. It may get random NULL pointer access backtrace if amdgpu_sriov_runtime is true too. Move amdgpu_reset_create_reset_domain before calling to RREG32. BUG: kernel NULL pointer dereference, address: #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Workqueue: events work_for_cpu_fn RIP: 0010:down_read_trylock+0x13/0xf0 Call Trace: <TASK> amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu] amdgpu_device_rreg+0x1b/0x170 [amdgpu] amdgpu_detect_virtualization+0x73/0x100 [amdgpu] amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu] ? pci_bus_read_config_word+0x43/0x70 amdgpu_driver_load_kms+0x15/0x120 [amdgpu] amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu] Fixes: d0fb18b535679a ("drm/amdgpu: Move reset sem into reset_domain") Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amd: fix gfx hang on renoir in IGT reload testTianci.Yin1-0/+4
[why] CP hangs in igt reloading test on renoir, more precisely, hangs on the second time insmod. [how] mode2 reset can make it recover, and mode2 reset only effects gfx core, dcn and the screen will not be impacted. Acked-by: Alex Deucher <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: only check for _PR3 on dGPUsAlex Deucher1-2/+4
We don't support runtime pm on APUs. They support more dynamic power savings using clock and powergating. Reviewed-by: Mario Limonciello <[email protected]> Tested-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: drop xmgi23 error query/reset supportHawking Zhang1-22/+0
xgmi_ras is only initialized when host to GPU interface is PCIE. in such case, xgmi23 is disabled and protected by security firmware. Host access will results to security violation Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: fix aldebaran xgmi topology for vfJonathan Kim1-2/+4
VFs must also distinguish whether or not the TA supports full duplex or half duplex link records in order to report the correct xGMI topology. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Shaoyun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: message smu to update bad channel infoStanley.Yang5-2/+42
It should notice SMU to update bad channel info when detected uncorrectable error in UMC block Signed-off-by: Stanley.Yang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15drm/amdgpu: Disable baco dummy modeLijo Lazar1-0/+15
On aldebaran, BACO dummy mode may be enabled during reset. Disable it during resume. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-09drm/amdgpu: fix a wrong ib referenceLang Yu1-5/+2
It should be p->job->ibs[j] instead of p->job->ibs[i] here. Fixes: cdc7893fc93f19 ("drm/amdgpu: use job and ib structures directly in CS parsers") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: initialize the vmid_wait with the stub fenceChristian König2-1/+2
This way we don't need to check for NULL any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: properly embed the IBs into the jobChristian König2-8/+5
We now have standard macros for that. Signed-off-by: Christian König <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: use job and ib structures directly in CS parsersChristian König9-114/+130
Instead of providing the ib index provide the job and ib pointers directly to the patch and parse functions for UVD and VCE. Also move the set/get functions for IB values to the IB declerations. Signed-off-by: Christian König <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: header cleanupChristian König10-100/+132
No function change, just move a bunch of definitions from amdgpu.h into separate header files. Signed-off-by: Christian König <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amd/amdgpu: set disabled vcn to no_schdulerJingwen Chen1-0/+2
[Why] after the reset domain introduced, the sched.ready will be init after hw_init, which will overwrite the setup in vcn hw_init, and lead to vcn ib test fail. [How] set disabled vcn to no_scheduler Fixes: 5fd8518d187ed0 ("drm/amdgpu: Move scheduler init to after XGMI is ready") Signed-off-by: Jingwen Chen <[email protected]> Acked-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: install ctx entities with cmpxchgChristian König1-1/+7
Since we removed the context lock we need to make sure that not two threads are trying to install an entity at the same time. Signed-off-by: Christian König <[email protected]> Fixes: 461fa7b0ac565e ("drm/amdgpu: remove ctx->lock") Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdkfd: implement get_atc_vmid_pasid_mapping_info for gfx10.3Yifan Zhang1-1/+15
This patch implements get_atc_vmid_pasid_mapping_info for gfx10.3 Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu/vcn: Add vcn firmware logRuijing Dong9-1/+163
vcn fwlog is for debugging purpose only, by default, it is disabled. Signed-off-by: Ruijing Dong <[email protected]> Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu/vcn: Update fw shared data structureRuijing Dong5-35/+61
Add fw log in fw shared data structure. Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Ruijing Dong <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: Add DFC CAP support for aldebaranDavid Yu2-1/+2
Add DFC CAP support for aldebaran Initialize cap microcode in psp_init_sriov_microcode, the ta microcode will be initialized in psp_vxx_init_microcode Signed-off-by: David Yu <[email protected]> Reviewed-by: Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: Set correct DMA mask for aldebaranHarish Kasiviswanathan1-3/+4
Aldebaran has 48-bit physical address support Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-04drm/amdgpu: Refactor mode2 reset logic for v13.0.2Lijo Lazar2-20/+54
Use IP version and refactor reset logic to apply to a list of devices. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Le Ma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-02drm/amdgpu: remove redundant null checkWeiguo Li1-6/+0
Remove the redundant null check since the caller ensures that 'ctx' is never NULL. Reviewed-by: Christian König <[email protected]> Signed-off-by: Weiguo Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-02drm/amdgpu/sdma5: drop unused cyan skillfish firmwareAlex Deucher1-7/+1
Leftover from bring up. Not used anymore. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-02drm/amdgpu/gfx10: drop unused cyan skillfish firmwareAlex Deucher1-11/+1
Leftover from bring up. Not used anymore. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-02drm/amdgpu: remove unused gpu_info firmwaresAlex Deucher1-23/+0
These were leftover from bring up and are no longer necessary. The information is available via the IP discovery table. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>