Age  Commit message  Author  Files  Lines
2021-08-26drm/amdgpu: Add support for RAS XGMI err queryJohn Clements1-0/+65
Update XGMI RAS to support error query on aldebaran Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-26drm/amdkfd: Account for SH/SE count when setting up cu masks.Sean Keely2-21/+64
On systems with multiple SH per SE compute_static_thread_mgmt_se# is split into independent masks, one for each SH, in the upper and lower 16 bits. We need to detect this and apply cu masking to each SH. The cu mask bits are assigned first to each SE, then to alternate SHs, then finally to higher CU id. This ensures that the maximum number of SPIs are engaged as early as possible while balancing CU assignment to each SH. v2: Use max SH/SE rather than max SH in cu_per_sh. v3: Fix comment blocks, ensure se_mask is initially zero filled, and correctly assign se.sh.cu positions to unset bits in cu_mask. Signed-off-by: Sean Keely <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
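The assignment order described above can be sketched in plain C. This is a userspace illustration under stated assumptions, not the driver's implementation: the name `map_cu_mask_sketch`, the `MAX_SE` bound, and the simplification that every SH has the same number of CUs are all hypothetical. Bit i of the user-supplied mask goes to the next free slot, with the SE index varying fastest, then the SH, then the CU id; SH 0 uses the lower 16 bits of its SE mask and SH 1 the upper 16 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SE   8   /* hypothetical upper bound for this sketch */
#define SH_SHIFT 16  /* each SH owns one 16-bit half of its SE mask */

/*
 * Distribute the first cu_mask_count bits of cu_mask over per-SE masks.
 * Slots are visited SE first, then SH, then CU id, so consecutive mask
 * bits land on different SEs before a second CU is used anywhere.
 * Assumes (for simplicity) the same cu_per_sh for every SH.
 */
static void map_cu_mask_sketch(const uint32_t *cu_mask,
                               unsigned int cu_mask_count,
                               unsigned int num_se,
                               unsigned int sh_per_se,
                               unsigned int cu_per_sh,
                               uint32_t *se_mask)
{
    unsigned int i = 0, cu, sh, se;

    assert(num_se <= MAX_SE && sh_per_se <= 2 && cu_per_sh <= 16);

    /* se_mask must start out zero filled */
    memset(se_mask, 0, num_se * sizeof(*se_mask));
    if (cu_mask_count == 0)
        return;

    for (cu = 0; cu < cu_per_sh; cu++) {
        for (sh = 0; sh < sh_per_se; sh++) {
            for (se = 0; se < num_se; se++) {
                if (cu_mask[i / 32] & (UINT32_C(1) << (i % 32)))
                    se_mask[se] |= UINT32_C(1) << (cu + sh * SH_SHIFT);
                if (++i == cu_mask_count)
                    return;
            }
        }
    }
}
```

With 2 SEs, 2 SHs per SE and a mask whose lowest four bits are set, each of the four SHs gets exactly its CU 0 enabled, which is the balancing behaviour the message describes.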
2021-08-25drm/amdgpu: rename amdgpu_bo_get_preferred_pin_domainYifan Zhang4-7/+7
amdgpu_bo_get_preferred_pin_domain is also used for page table creation, which does not involve page pinning, and in more cases than display scanout. Rename it and update its documentation accordingly. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: drop redundant cancel_delayed_work_sync callEvan Quan4-6/+0
The _sw_fini() APIs are called right after the _suspend() APIs, and cancel_delayed_work_sync() was already called in the latter. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: add missing cleanups for more ASICs on UVD/VCE suspendEvan Quan6-1/+144
This is a supplement for commit below: "drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspend". Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-25drm/amdgpu: add missing cleanups for Polaris12 UVD/VCE on suspendEvan Quan2-0/+47
Perform proper cleanups on UVD/VCE suspend: powergate enablement, clockgating enablement and dpm disablement. This can fix some hangs observed on suspend while UVD/VCE is still in use (e.g. issuing "pm-suspend" while a video is still playing). Signed-off-by: Evan Quan <[email protected]> Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdkfd: map SVM range with correct access permissionPhilip Yang1-48/+86
When restoring after a retry fault, prefetching a range, or restoring an svm range after eviction, map the range to the GPU with the correct read or write access permission. A range may include multiple VMAs, so update the GPU page table for each VMA, using its offset within the prange and its number of pages, according to that VMA's access permission. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdkfd: check access permisson to restore retry faultPhilip Yang6-8/+39
Check the range access permission when restoring a GPU retry fault. If the faulting address belongs to a VMA that lacks the read or write permission requested by the GPU, the address is not restored and the VM fault event is passed back to user space. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
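The permission check described here can be illustrated with a small C sketch. It is only a model: the `SKETCH_VM_*` constants are hypothetical stand-ins for the kernel's VMA flag bits, the function name is invented, and the assumption that a write fault needs both read and write permission is an interpretation of the message rather than something it states.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_READ / VM_WRITE flag bits. */
#define SKETCH_VM_READ  0x1UL
#define SKETCH_VM_WRITE 0x2UL

/*
 * Return true if a GPU fault on a VMA with the given flags may be
 * restored. Every fault needs read permission; a write fault is assumed
 * to need write permission in addition.
 */
static bool fault_allowed_sketch(unsigned long vma_flags, bool write_fault)
{
    unsigned long requested = SKETCH_VM_READ;

    if (write_fault)
        requested |= SKETCH_VM_WRITE;

    return (vma_flags & requested) == requested;
}
```

A caller that gets `false` back would skip the restore and let the fault be reported to user space, as the message describes.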
2021-08-24drm/amdgpu: Update RAS XGMI Error QueryJohn Clements1-1/+3
Resolve bug querying error on unsupported ASIC Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Add driver infrastructure for MCA RASJohn Clements9-2/+388
Add MCA specific IP blocks targeting RAS features Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/display: Add Logging for HDMI color depth informationPraful Swarnakar1-0/+11
[Why] Recent HDMI2.0 HF1-1 V-Swing testing showed that logging deep color status helps in validation of testcase. [How] Add logging based on various color depths and pixel encoding formats. Signed-off-by: Praful Swarnakar <[email protected]> Reviewed-by: Hersen Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/amdgpu: consolidate PSP TA init shared buf functionsCandice Li1-99/+43
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/amdgpu: add name field back to ras_common_ifCandice Li1-0/+1
Adding name field back to ras_common_if to work around error injection failure with amdgpuras tool. Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: Fix build with missing pm_suspend_target_state module exportBorislav Petkov1-1/+1
Building a randconfig here triggered:

  ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined!

because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND.

The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this:

  config PM_SLEEP
          def_bool y
          depends on SUSPEND || HIBERNATE_CALLBACKS

and that randconfig has:

  # CONFIG_SUSPEND is not set
  CONFIG_HIBERNATE_CALLBACKS=y

leading to the module export missing.

Change the ifdeffery to depend directly on CONFIG_SUSPEND.

Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
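The guard mismatch can be modelled in a standalone C file. This is a sketch only: the `CONFIG_*` macros are defined by hand to mimic the randconfig, and `sketch_suspend_target_state`, `SKETCH_SUSPEND_TO_IDLE` and `s0ix_supported_sketch` are hypothetical names, not kernel symbols. The point is that code referencing a symbol exported only under CONFIG_SUSPEND has to be guarded by CONFIG_SUSPEND itself, not by the broader CONFIG_PM_SLEEP.

```c
#include <stdbool.h>

/*
 * Stand-ins for the randconfig in the commit message: PM_SLEEP ends up
 * enabled through HIBERNATE_CALLBACKS while SUSPEND itself is off.
 */
#define CONFIG_HIBERNATE_CALLBACKS 1
#define CONFIG_PM_SLEEP 1
/* CONFIG_SUSPEND is deliberately left undefined. */

#ifdef CONFIG_SUSPEND
/* Hypothetical stand-in for a symbol that only exists with CONFIG_SUSPEND. */
extern int sketch_suspend_target_state;
#define SKETCH_SUSPEND_TO_IDLE 1
#endif

static bool s0ix_supported_sketch(void)
{
    /*
     * Guarding on CONFIG_PM_SLEEP here would compile the reference below
     * even though nothing provides the symbol, which is the failure mode
     * the commit describes.
     */
#ifdef CONFIG_SUSPEND
    return sketch_suspend_target_state == SKETCH_SUSPEND_TO_IDLE;
#else
    return false;
#endif
}
```

With the configuration above the function compiles to a plain `return false`, so no reference to the missing symbol is emitted.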
2021-08-24drm/radeon: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-3/+3
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested.

    @@
    @@
    -    PCI_DMA_BIDIRECTIONAL
    +    DMA_BIDIRECTIONAL

    @@
    @@
    -    PCI_DMA_TODEVICE
    +    DMA_TO_DEVICE

    @@
    @@
    -    PCI_DMA_FROMDEVICE
    +    DMA_FROM_DEVICE

    @@
    @@
    -    PCI_DMA_NONE
    +    DMA_NONE

    @@
    expression e1, e2, e3;
    @@
    -    pci_alloc_consistent(e1, e2, e3)
    +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

    @@
    expression e1, e2, e3;
    @@
    -    pci_zalloc_consistent(e1, e2, e3)
    +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_free_consistent(e1, e2, e3, e4)
    +    dma_free_coherent(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_map_single(e1, e2, e3, e4)
    +    dma_map_single(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_single(e1, e2, e3, e4)
    +    dma_unmap_single(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4, e5;
    @@
    -    pci_map_page(e1, e2, e3, e4, e5)
    +    dma_map_page(&e1->dev, e2, e3, e4, e5)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_page(e1, e2, e3, e4)
    +    dma_unmap_page(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_map_sg(e1, e2, e3, e4)
    +    dma_map_sg(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_sg(e1, e2, e3, e4)
    +    dma_unmap_sg(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
    +    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_single_for_device(e1, e2, e3, e4)
    +    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
    +    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
    +    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2;
    @@
    -    pci_dma_mapping_error(e1, e2)
    +    dma_mapping_error(&e1->dev, e2)

    @@
    expression e1, e2;
    @@
    -    pci_set_dma_mask(e1, e2)
    +    dma_set_mask(&e1->dev, e2)

    @@
    expression e1, e2;
    @@
    -    pci_set_consistent_dma_mask(e1, e2)
    +    dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu: switch from 'pci_' to 'dma_' APIChristophe JAILLET1-3/+3
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested.

    @@
    @@
    -    PCI_DMA_BIDIRECTIONAL
    +    DMA_BIDIRECTIONAL

    @@
    @@
    -    PCI_DMA_TODEVICE
    +    DMA_TO_DEVICE

    @@
    @@
    -    PCI_DMA_FROMDEVICE
    +    DMA_FROM_DEVICE

    @@
    @@
    -    PCI_DMA_NONE
    +    DMA_NONE

    @@
    expression e1, e2, e3;
    @@
    -    pci_alloc_consistent(e1, e2, e3)
    +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

    @@
    expression e1, e2, e3;
    @@
    -    pci_zalloc_consistent(e1, e2, e3)
    +    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_free_consistent(e1, e2, e3, e4)
    +    dma_free_coherent(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_map_single(e1, e2, e3, e4)
    +    dma_map_single(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_single(e1, e2, e3, e4)
    +    dma_unmap_single(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4, e5;
    @@
    -    pci_map_page(e1, e2, e3, e4, e5)
    +    dma_map_page(&e1->dev, e2, e3, e4, e5)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_page(e1, e2, e3, e4)
    +    dma_unmap_page(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_map_sg(e1, e2, e3, e4)
    +    dma_map_sg(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_unmap_sg(e1, e2, e3, e4)
    +    dma_unmap_sg(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
    +    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_single_for_device(e1, e2, e3, e4)
    +    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
    +    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2, e3, e4;
    @@
    -    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
    +    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

    @@
    expression e1, e2;
    @@
    -    pci_dma_mapping_error(e1, e2)
    +    dma_mapping_error(&e1->dev, e2)

    @@
    expression e1, e2;
    @@
    -    pci_set_dma_mask(e1, e2)
    +    dma_set_mask(&e1->dev, e2)

    @@
    expression e1, e2;
    @@
    -    pci_set_consistent_dma_mask(e1, e2)
    +    dma_set_coherent_mask(&e1->dev, e2)

Reviewed-by: Christian König <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdkfd: CWSR with sw scheduler on Aldebaran and ArcturusMukul Joshi4-2/+6
Program trap handler settings to enable CWSR with software scheduler on Aldebaran and Arcturus. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Amber Lin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amdgpu/OLAND: clip the ref divider max valueShashank Sharma3-9/+16
This patch limits the ref_div_max value to 100 during the calculation of the PLL feedback reference divider. With the current value (128), the resulting fb_ref_div value generates unstable output at particular frequencies. The Radeon driver limits this value to 100.

On Oland, when we try to set up mode 2048x1280@60 (a bit weird, I know), it demands a clock of 221270 kHz. It's been observed that the PLL calculations using values 128 and 100 are vastly different, and look like this:

+--------------------------------------+--------+--------+
| Parameter                            | AMDGPU | Radeon |
+--------------------------------------+--------+--------+
| Clock feedback divider max cap value | 128    | 100    |
| ref_div_max                          | 42     | 20     |
| ref_div                              | 42     | 20     |
| fb_div                               | 10326  | 8195   |
| fb_div                               | 1024   | 163    |
| fb_dev_p frac fb_de^_p               | 4      | 9      |
+--------------------------------------+--------+--------+

With the ref_div_max value clipped at 100, the AMDGPU driver can also drive video mode 2048x1280@60 (221 MHz) and produce proper output without any blanking or distortion on the screen.

PS: This value was changed from 128 to 100 in the Radeon driver also, here: https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806

V1: Got acks from: Acked-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
V2: - Restricting the changes only for OLAND, just to avoid any regression for other cards. - Changed unsigned -> unsigned int to make checkpatch quiet.
V3: Apply the change on SI family (not only oland) (Christian)

Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: Eddy Qin <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Shashank Sharma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-24drm/amd/display: refactor riommu invalidation waEric Yang4-20/+1
[Why] A cleaner solution, only done once on boot. [How] Remove previous workaround and configure an extra vmid one time on boot Reviewed-by: Kazlauskas Nicholas <[email protected]> Acked-by: Solomon Chiu <[email protected]> Signed-off-by: Eric Yang <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-20drm/amdgpu: Cancel delayed work when GFXOFF is disabledMichel Dänzer2-17/+30
schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: [email protected] Reviewed-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> # v3 Acked-by: Christian König <[email protected]> # v3 Signed-off-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
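The counting rule in this fix can be modelled in userspace C. This is a simplified sketch, not the driver code: the struct and function names are hypothetical, the two counters merely stand in for calls to cancel_delayed_work_sync() and schedule_delayed_work(), and the mutex that protects the real counter is left out.

```c
#include <stdbool.h>

/* Hypothetical model of the GFXOFF request bookkeeping. */
struct gfx_off_sketch {
    unsigned int req_count;  /* outstanding "disable GFXOFF" requests */
    unsigned int cancels;    /* stand-in for cancel_delayed_work_sync() calls */
    unsigned int schedules;  /* stand-in for schedule_delayed_work() calls */
};

/*
 * enable == false: a caller wants GFXOFF disabled.
 * enable == true:  a caller no longer needs GFXOFF disabled.
 */
static void gfx_off_ctrl_sketch(struct gfx_off_sketch *s, bool enable)
{
    if (enable) {
        if (s->req_count == 0)
            return;              /* unbalanced enable: nothing to re-arm */
        if (--s->req_count == 0)
            s->schedules++;      /* 1 -> 0: arm the delayed re-enable */
    } else {
        if (s->req_count++ == 0)
            s->cancels++;        /* 0 -> 1: cancel any pending re-enable */
    }
}
```

Because the pending work is cancelled on every 0 to 1 transition and armed only on the 1 to 0 transition, a disable/enable pair that arrives inside the delay window restarts the delay instead of letting an earlier timer fire.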
2021-08-20drm/amdgpu: use the preferred pin domain after the checkChristian König1-5/+5
For some reason we run into a use case where a BO is already pinned into GTT, but should be pinned into VRAM|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-20drm/amd/pm: a quick fix for "divided by zero" errorEvan Quan2-9/+20
Considering Arcturus is a dedicated ASIC for computing, it will be more proper to drop the support for fan speed reading and setting. That's on the TODO list. Signed-off-by: Evan Quan <[email protected]> Reported-by: Rui Teng <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm: amdgpu: remove obsolete reference to config CHASHLukas Bulwahn1-1/+0
Commit 04ed8459f334 ("drm/amdgpu: remove chash") removes the chash architecture and its corresponding config CHASH. There is still a reference to CHASH in the config DRM_AMDGPU in ./drivers/gpu/drm/Kconfig. Remove this obsolete reference to config CHASH. Signed-off-by: Lukas Bulwahn <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"Colin Ian King1-1/+1
There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd/amdgpu:flush ttm delayed work before cancel_syncYuBiao Wang1-1/+3
[Why] In some cases when we unload the driver, a warning call trace shows up in vram_mgr_fini, claiming that the LRU is not empty; it is caused by TTM BOs still sitting in the delayed-delete queue. [How] Flush the delayed work to make sure the delayed deletion is done. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd: consolidate TA shared memory structuresCandice Li6-196/+168
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: increase max xgmi physical node for aldebaranHawking Zhang1-3/+2
aldebaran supports up to 16 xgmi physical nodes. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarilyEvan Quan1-1/+8
We have an S3 issue on that SKU with BACO enabled. BACO support will be brought back once the issue is root-caused. Signed-off-by: Evan Quan <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amd/display: Use DCN30 watermark calc for DCN301Zhan Liu1-95/+1
[why] dcn301_calculate_wm_and_dl() causes flickering when external monitor is connected. This issue has been fixed before by commit 0e4c0ae59d7e ("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however part of the fix was gone after commit 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next"). [how] Use dcn30_calculate_wm_and_dlg() instead as in the original fix. Fixes: 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next") Signed-off-by: Nikola Cornij <[email protected]> Reviewed-by: Zhan Liu <[email protected]> Tested-by: Zhan Liu <[email protected]> Tested-by: Oliver Logush <[email protected]> Signed-off-by: Zhan Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: correct MMSCH 1.0 versionZhigang Luo1-3/+1
MMSCH 1.0 doesn't have a major/minor version, only a version. Signed-off-by: Zhigang Luo <[email protected]> Reviewed-by: Shaoyun.liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-18drm/amdgpu: get extended xgmi topology dataJonathan Kim4-14/+145
The TA has a limit to the amount of data that can be retrieved from GET_TOPOLOGY. For setups that exceed this limit, the xGMI topology needs to be re-initialized and data needs to be re-fetched from the extended link records by setting a flag in the shared command buffer. The number of hops and the number of links must be accumulated by the driver. Other data points are all fetched from the first request. Because the TA has already exceeded its link record limit, it cannot hold bidirectional information. Otherwise the driver would have to do more than two fetches so the driver has to reflect the topology information in the opposite direction. v2: squashed with internal reviewed fix Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/radeon: Add break to switch statement in radeonfb_create_pinned_object()Nathan Chancellor1-0/+1
Clang + -Wimplicit-fallthrough warns: drivers/gpu/drm/radeon/radeon_fb.c:170:2: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough] default: ^ drivers/gpu/drm/radeon/radeon_fb.c:170:2: note: insert 'break;' to avoid fall-through default: ^ break; 1 warning generated. Clang's version of this warning is a little bit more pedantic than GCC's. Add the missing break to satisfy it to match what has been done all over the kernel tree. Reviewed-by: Christian König <[email protected]> Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
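The pattern behind this warning can be shown with a short C sketch. The function and the `SKETCH_SWAP_*` constants are hypothetical and only loosely modelled on a bytes-per-pixel switch; they are not the actual contents of radeon_fb.c. Clang flags any non-empty case that runs into the next label, even when that label is a `default:` that only breaks.

```c
/* Hypothetical flag values, for illustration only. */
#define SKETCH_SWAP_16BIT 0x1u
#define SKETCH_SWAP_32BIT 0x2u

static unsigned int swap_flags_sketch(int bits_per_pixel)
{
    unsigned int flags = 0;

    switch (bits_per_pixel) {
    case 32:
        flags |= SKETCH_SWAP_32BIT;
        break;
    case 16:
        flags |= SKETCH_SWAP_16BIT;
        break;  /* without this, control falls through into default: */
    default:
        break;
    }

    return flags;
}
```

The behaviour is the same with or without the added `break`, since `default:` does nothing; the change only makes the intent explicit so that builds with -Wimplicit-fallthrough stay quiet.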
2021-08-16drm/amd/display: 3.2.149Aric Cyr1-1/+1
This version brings along following fixes: - Ensure DCN save init registers after VM setup - Fix multi-display support for idle opt workqueue - Use vblank control events for PSR enable/disable - Create default dc_sink when fail reading EDID under MST Acked-by: Wayne Lin <[email protected]> Signed-off-by: Aric Cyr <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: [FW Promotion] Release 0.0.79Anthony Koo1-2/+12
Acked-by: Wayne Lin <[email protected]> Signed-off-by: Anthony Koo <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: Guard vblank wq flush with DCN guardsNicholas Kazlauskas1-0/+4
[Why] Compilation of the workqueue fails if not building with the DCN config option set. [How] Guard calls to the flush with the DCN config option to fix the build. Reviewed-by: Roman Li <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: Ensure DCN save after VM setupJake Wang8-0/+30
[Why] DM initializes VM context after DMCUB initialization. This results in loss of DCN_VM_CONTEXT registers after z10. [How] Notify DMCUB when VM setup is complete, and have DMCUB save init registers. v2: squash in CONFIG_DRM_AMD_DC_DCN3_1 fix Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Jake Wang <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: Use vblank control events for PSR enable/disableNicholas Kazlauskas3-8/+43
[Why] PSR can disable the HUBP along with the OTG when PSR is active. We'll hit a pageflip timeout when the OTG is disabled because we're no longer updating the CRTC vblank counter and the pflip high IRQ will not fire on the flip. To keep the page flip timeout from occurring, we should modify the enter/exit conditions to match DRM requirements. [How] Use our deferred handlers for DRM vblank control to notify DMCU(B) when it can enable or disable PSR based on whether vblank is disabled or enabled respectively. We'll need to pass along the stream with the notification now because we want to access the CRTC state while the CRTC is locked to get the stream state prior to the commit. Retain a reference to the stream so it remains safe to continue to access and release that reference once we're done with it. Enable/disable logic follows what we were previously doing in update_planes. The workqueue has to be flushed before programming streams or planes to ensure that we exit out of idle optimizations and PSR before these events occur if necessary. Keeping the skip count logic the same, to avoid FBCON PSR enablement, requires copying the allow condition onto the DM IRQ parameters - a field that we can actually access from the worker. Reviewed-by: Roman Li <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: Fix multi-display support for idle opt workqueueNicholas Kazlauskas2-47/+36
[Why] The current implementation for idle optimization support only has a single work item that gets reshuffled into the system workqueue whenever we receive an enable or disable event. We can have mismatched events if the work hasn't been processed or if we're getting control events from multiple displays at once. This fixes this issue and also makes the implementation usable for PSR control - which will be addressed in another patch. [How] We need to be able to flush remaining work out on demand for driver stop and psr disable so create a driver specific workqueue instead of using the system one. The workqueue will be single threaded to guarantee the ordering of enable/disable events. Refactor the queue to allocate the control work and deallocate it after processing it. Pass the acrtc directly to make it easier to handle psr enable/disable in a later patch. Rename things to indicate that it's not just MALL specific. Reviewed-by: Roman Li <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/display: Create dc_sink when EDID failWayne Lin1-0/+23
[Why] While reading a remote EDID via a Startech 1-to-4 hub, occasionally we don't get a response in time and fail to light up the corresponding monitor. Ideally, we can still add generic modes for userspace to choose from to try to light up the monitor, which is done in drm_helper_probe_single_connector_modes(). So the main problem here is that we fail .mode_valid since we don't create a remote dc_sink for this case. [How] Also add a default dc_sink if we can't get the EDID. Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Wayne Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: correct the address of Arcturus fan related registersEvan Quan1-5/+133
These registers have different addresses from those on other SMU V11 ASICs. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: drop unnecessary manual mode checkEvan Quan1-12/+4
The fan control is already guarded to be in manual mode before the fan speed RPM/PWM is set, so the extra check is redundant. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: drop the unnecessary intermediate percent-based transitionEvan Quan23-140/+112
Currently, the fan speed PWM readout is converted to a percentage and then back to a PWM value. The intermediate percent-based conversion is unnecessary and makes the final output less accurate. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
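The accuracy loss from the percent-based detour is easy to see with integer arithmetic. The following C sketch is an illustration under assumptions: a 0..255 PWM scale, truncating integer division, and hypothetical function names; the driver's actual register scaling may differ.

```c
#include <stdint.h>

#define SKETCH_PWM_MAX 255u  /* assumed PWM scale for this illustration */

/* Convert a raw duty value to PWM by way of a whole-number percentage. */
static uint32_t pwm_via_percent_sketch(uint32_t duty, uint32_t duty_max)
{
    uint32_t percent = duty * 100u / duty_max;  /* truncates to 0..100 */

    return percent * SKETCH_PWM_MAX / 100u;
}

/* Convert a raw duty value to PWM in a single step. */
static uint32_t pwm_direct_sketch(uint32_t duty, uint32_t duty_max)
{
    return duty * SKETCH_PWM_MAX / duty_max;
}
```

For a duty of 128 out of 255, the detour truncates to 50 percent and yields 127, while the direct conversion returns 128. Only about a hundred distinct outputs are reachable through the percentage, so the detour cannot represent every PWM step.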
2021-08-16drm/amd/pm: correct the fan speed RPM retrievingEvan Quan8-5/+106
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to retrieve the fan speed RPM. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: correct the fan speed PWM retrievingEvan Quan8-81/+62
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to retrieve the fan speed PWM. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: record the RPM and PWM based fan speed settingsEvan Quan3-6/+31
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs, so both the RPM and PWM settings need to be saved. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: correct the fan speed RPM settingEvan Quan7-4/+51
The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to perform the fan speed RPM setting. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/amdgpu: remove unnecessary RAS context fieldCandice Li10-14/+6
Delete ras_if->name in the RAS ctx structure and remove related lines. Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failureYifan Zhang1-0/+8
KFDSVMRangeTest.SetGetAttributesTest randomly fails in stress test.

Note: Google Test filter = KFDSVMRangeTest.*
[==========] Running 18 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 18 tests from KFDSVMRangeTest
[ RUN ] KFDSVMRangeTest.BasicSystemMemTest
[ OK ] KFDSVMRangeTest.BasicSystemMemTest (30 ms)
[ RUN ] KFDSVMRangeTest.SetGetAttributesTest
[ ] Get default atrributes
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure
Value of: expectedDefaultResults[i]
Actual: 4294967295
Expected: outputAttributes[i].value
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure
Value of: expectedDefaultResults[i]
Actual: 4294967295
Expected: outputAttributes[i].value
Which is: 0
/home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:152: Failure
Value of: expectedDefaultResults[i]
Actual: 4
Expected: outputAttributes[i].type
Which is: 2
[ ] Setting/Getting atrributes
[ FAILED ]

The root cause is that the svm work queue has not finished when svm_range_get_attr is called, so stale data in the svm interval tree makes svm_range_get_attr return wrong results. Flush the work queue before iterating over the svm interval tree.

Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16drm/amd/pm: change the workload type for some cardsKenneth Feng1-1/+14
Change the workload type for some cards, as needed. Signed-off-by: Kenneth Feng <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-16Revert "drm/amd/pm: fix workload mismatch on vega10"Kenneth Feng1-1/+1
This reverts commit 0979d43259e13846d86ba17e451e17fec185d240. Revert this because it does not apply to all the cards. Signed-off-by: Kenneth Feng <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>