aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2024-09-10drm/amdkfd: Fix resource leak in criu restore queueJesse Zhang1-0/+1
To avoid memory leaks, release q_extra_data when exiting the restore queue. v2: Correct the proto (Alex) Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Tim Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-10drm/amd/display: Do not reset planes based on crtc zpos_changedLeo Li1-1/+1
[Why] drm_normalize_zpos will set the crtc_state->zpos_changed to 1 if any of it's assigned planes changes zpos, or is removed/added from it. To have amdgpu_dm request a plane reset on this is too broad. For example, if only the cursor plane was moved from one crtc to another, the crtc's zpos_changed will be set to true. But that does not mean that the underlying primary plane requires a reset. [How] Narrow it down so that only the plane that has a change in zpos will require a reset. As a future TODO, we can further optimize this by only requiring a reset on z-order change. Z-order is different from z-pos, since a zpos change doesn't necessarily mean the z-ordering changed, and DC should only require a reset if the z-ordering changed. For example, the following zpos update does not change z-ordering: Plane A: zpos 2 -> 3 Plane B: zpos 1 -> 2 => Plane A is still on top of plane B: no reset needed Whereas this one does change z-ordering: Plane A: zpos 2 -> 1 Plane B: zpos 1 -> 2 => Plane A changed from on top, to below plane B: reset needed Fixes: 38e0c3df6dbd ("drm/amd/display: Move PRIMARY plane zpos higher") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3569 Signed-off-by: Leo Li <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: drop redundant W=1 warnings from MakefileJani Nikula1-17/+1
Since commit a61ddb4393ad ("drm: enable (most) W=1 warnings by default across the subsystem"), most of the extra warnings in the driver Makefile are redundant. Remove them. Note that -Wmissing-declarations and -Wmissing-prototypes are always enabled by default in scripts/Makefile.extrawarn. Reviewed-by: Hamza Mahfooz <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: revert "use CPU for page table update if SDMA is unavailable"Christian König1-6/+0
That is clearly not something we should do upstream. The SDMA is mandatory for the driver to work correctly. We could do this for emulation and bringup, but in those cases the engineer should probably enabled CPU based updates manually. This reverts commit 62eefd10ac1c7e976bda47ff311bd87cee40ab8d. Signed-off-by: Christian König <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu/mes11: Indent an if statmentDan Carpenter1-1/+1
Indent the "break" statement one more tab. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdkfd: Document and define SVM events message macroPhilip Yang1-23/+22
Document how to use SMI system management interface to enable and receive SVM events. Document SVM event triggers. Define SVM events message string format macro that could be used by user mode for sscanf to parse the event. Add it to uAPI header file to make it obvious that is changing uAPI in future. No functional changes. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdkfd: Select reset method for poison handlingHawking Zhang1-8/+32
Driver mode-2 is only supported by relative new smc firmware. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdkfd: fix missed queue reset on queue destroyJonathan Kim1-2/+3
If a queue is being destroyed but causes a HWS hang on removal, the KFD may issue an unnecessary gpu reset if the destroyed queue can be fixed by a queue reset. This is because the queue has been removed from the KFD's queue list prior to the preemption action on destroy so the reset call will fail to match the HQD PQ reset information against the KFD's queue record to do the actual reset. To fix this, deactivate the queue prior to preemption since it's being destroyed anyways and remove the queue from the KFD's queue list after preemption. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: Surface svm_default_granularity, a RW module parameterRamesh Errabolu4-7/+39
Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page faults. Param value is set in terms of log(numPages(buffer)), e.g. 9 for a 2 MIB buffer Signed-off-by: Ramesh Errabolu <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: fix queue reset issue by mmioJesse Zhang1-0/+1
Initialize the queue type before resetting the queue using mmio. Signed-off-by: Jesse Zhang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Add kdoc entry for 'program_isharp_1dlut' in ↵Srinivasan Shanmugam1-0/+1
'dpp401_dscl_program_isharp' Added a descriptor for the 'program_isharp_1dlut' parameter, which is a flag used to determine whether to program the isharp 1D LUT. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:963: warning: Function parameter or struct member 'program_isharp_1dlut' not described in 'dpp401_dscl_program_isharp' Cc: Tom Chung <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Roman Li <[email protected]> Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Hamza Mahfooz <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Tom Chung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: Replace 'amdgpu_job_submit_direct' with 'drm_sched_entity' in ↵Srinivasan Shanmugam1-16/+19
cleaner shader This commit replaces the use of amdgpu_job_submit_direct which submits the job to the ring directly, with drm_sched_entity in the cleaner shader job submission process. The change allows the GPU scheduler to manage the cleaner shader job. - The job is then submitted to the GPU using the drm_sched_entity_push_job function, which allows the GPU scheduler to manage the job. This change improves the reliability of the cleaner shader job submission process by leveraging the capabilities of the GPU scheduler. Fixes: d361ad5d2fc0 ("drm/amdgpu: Add sysfs interface for running cleaner shader") Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu/: Add missing kdoc entry in amdgpu_vm_handle_fault functionSrinivasan Shanmugam1-0/+1
This commit adds a description for the 'ts' parameter in the amdgpu_vm_handle_fault function's comment block. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:2781: warning: Function parameter or struct member 'ts' not described in 'amdgpu_vm_handle_fault' Cc: Xiaogang.Chen <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Signed-off-by: Srinivasan Shanmugam <[email protected]> Reviewed-by: Xiaogang Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: fix dccg root clock optimization related hangQili Lu4-5/+15
[Why] enable dpp rcg before we disable dppclk in hw_init cause system hang/reboot [How] we remove dccg rcg related code from init into a separate function and call it after we init pipe Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Qili Lu <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Refactor dccg35_get_other_enabled_symclk_feNicholas Susanto1-40/+25
[Why] Function used to check the number of FEs connected to the current BE. This was then used to determine if the symclk could be disabled, if all FEs were disconnected. However, the function would skip over the primary FE and return 0 when the primary FE was still connected. This caused black screens on driver disable with an MST daisy chain hooked up. [How] Refactor the function to correctly return the number of FEs connected to the input BE. Also, rename it for clarity. Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Nicholas Susanto <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: Normalize reg offsets on JPEG v4.0.3Lijo Lazar1-21/+15
On VFs and SOCs with GC 9.4.4, VCN RRMT is disabled. Only local register offsets should be used on JPEG v4.0.3 as they cannot handle remote access to other AIDs. Since only local offsets are used, the special write to MCM_ADDR register is no longer needed. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Sathishkumar S <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Avoid race between dcn35_set_drr() and dc_state_destruct()Tobias Jakobi1-8/+12
dc_state_destruct() nulls the resource context of the DC state. The pipe context passed to dcn35_set_drr() is a member of this resource context. If dc_state_destruct() is called parallel to the IRQ processing (which calls dcn35_set_drr() at some point), we can end up using already nulled function callback fields of struct stream_resource. The logic in dcn35_set_drr() already tries to avoid this, by checking tg against NULL. But if the nulling happens exactly after the NULL check and before the next access, then we get a race. Avoid this by copying tg first to a local variable, and then use this variable for all the operations. This should work, as long as nobody frees the resource pool where the timing generators live. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142 Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2") Signed-off-by: Tobias Jakobi <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Avoid race between dcn10_set_drr() and dc_state_destruct()Tobias Jakobi1-8/+12
dc_state_destruct() nulls the resource context of the DC state. The pipe context passed to dcn10_set_drr() is a member of this resource context. If dc_state_destruct() is called parallel to the IRQ processing (which calls dcn10_set_drr() at some point), we can end up using already nulled function callback fields of struct stream_resource. The logic in dcn10_set_drr() already tries to avoid this, by checking tg against NULL. But if the nulling happens exactly after the NULL check and before the next access, then we get a race. Avoid this by copying tg first to a local variable, and then use this variable for all the operations. This should work, as long as nobody frees the resource pool where the timing generators live. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3142 Fixes: 06ad7e164256 ("drm/amd/display: Destroy DC context while keeping DML and DML2") Signed-off-by: Tobias Jakobi <[email protected]> Tested-by: Raoul van Rüschen <[email protected]> Tested-by: Christopher Snowhill <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Tested-by: Sefa Eyeoglu <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: use clamp() in amdgpu_vm_adjust_size()Li Zetao1-1/+1
When it needs to get a value within a certain interval, using clamp() makes the code easier to understand than min(max()). Reviewed-by: Christian König <[email protected]> Signed-off-by: Li Zetao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd: use clamp() in amdgpu_pll_get_fb_ref_div()Li Zetao1-1/+1
When it needs to get a value within a certain interval, using clamp() makes the code easier to understand than min(max()). Reviewed-by: Christian König <[email protected]> Signed-off-by: Li Zetao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: enable gfxoff quirk on HP 705G4Peng Liu1-0/+2
Enabling gfxoff quirk results in perfectly usable graphical user interface on HP 705G4 DM with R5 2400G. Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout. Signed-off-by: Peng Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: add raven1 gfxoff quirkPeng Liu1-0/+2
Fix screen corruption with openkylin. Link: https://bbs.openkylin.top/t/topic/171497 Signed-off-by: Peng Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Fix spelling mistake "recompte" -> "recompute"Colin Ian King1-1/+1
There is a spelling mistake in a DRM_DEBUG_DRIVER message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdkfd: Add cache line size infoDavid Belanger1-1/+7
Populate cache line size info in topology based on information from IP discovery table. Signed-off-by: David Belanger <[email protected]> Reviewed-by: Sreekant Somasekharan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amd/display: Add missing kdoc entry for 'bs_coeffs_updated' in ↵Srinivasan Shanmugam1-0/+1
dpp401_dscl_program_isharp This commit addresses a missing kdoc for the 'bs_coeffs_updated' parameter in the 'dpp401_dscl_program_isharp' function. The 'bs_coeffs_updated' is a flag indicating whether the Blur and Scale Coefficients have been updated. The 'dpp401_dscl_program_isharp' function is responsible for programming the isharp, which includes setting the isharp filter, noise gain, and blur and scale coefficients. If the 'bs_coeffs_updated' flag is set to true, the function updates the blur and scale coefficients. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../display/dc/dpp/dcn401/dcn401_dpp_dscl.c:961: warning: Function parameter or struct member 'bs_coeffs_updated' not described in 'dpp401_dscl_program_isharp' Cc: Tom Chung <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: Roman Li <[email protected]> Cc: Alex Hung <[email protected]> Cc: Aurabindo Pillai <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Hamza Mahfooz <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Suggested-by: Tom Chung <[email protected]> Reviewed-by: Tom Chung <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: fix invalid fence handling in amdgpu_vm_tlb_flushLang Yu1-2/+4
CPU based update doesn't produce a fence, handle such cases properly. Fixes: d8a3f0a0348d ("drm/amdgpu: implement TLB flush fence") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-06drm/amdgpu: re-work VM syncingChristian König5-55/+65
Rework how VM operations synchronize to submissions. Provide an amdgpu_sync container to the backends instead of an reservation object and fill in the amdgpu_sync object in the higher layers of the code. No intended functional change, just prepares for upcomming changes. Signed-off-by: Christian König <[email protected]> Reviewed-by: Friedrich Vock <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-05Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"Alex Deucher1-2/+4
This reverts commit 8f614469de248a4bc55fb07e55d5f4c340c75b11. This breaks some manual setting of the profile mode in certain cases. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600 Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 7a199557643e993d4e7357860624b8aa5d8f4340) Cc: [email protected]
2024-09-05Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"Alex Deucher1-2/+4
This reverts commit bbb05f8a9cd87f5046d05a0c596fddfb714ee457. This breaks some manual setting of the profile mode in certain cases. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600 Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amd/display: Block timing sync for different signals in PMODillon Varone1-1/+2
PMO assumes that like timings can be synchronized, but DC only allows this if the signal types match. Reviewed-by: Austin Zheng <[email protected]> Signed-off-by: Dillon Varone <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 29d3d6af43135de7bec677f334292ca8dab53d67) Cc: [email protected]
2024-09-02drm/amd/display: Lock DC and exit IPS when changing backlightLeo Li1-1/+12
Backlight updates require aux and/or register access. Therefore, driver needs to disallow IPS beforehand. So, acquire the dc lock before calling into dc to update backlight - we should be doing this regardless of IPS. Then, while the lock is held, disallow IPS before calling into dc, then allow IPS afterwards (if it was previously allowed). Reviewed-by: Aurabindo Pillai <[email protected]> Reviewed-by: Roman Li <[email protected]> Signed-off-by: Leo Li <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 988fe2862635c1b1b40e41c85c24db44ab337c13) Cc: [email protected] # 6.10+
2024-09-02drm/amdgpu: always allocate cleared VRAM for GEM allocationsAlex Deucher1-0/+3
This adds allocation latency, but aligns better with user expectations. The latency should improve with the drm buddy clearing patches that Arun has been working on. In addition this fixes the high CPU spikes seen when doing wipe on release. v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christian) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3528 Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Acked-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> (v1) Signed-off-by: Alex Deucher <[email protected]> Cc: Arunpravin Paneer Selvam <[email protected]> Cc: Christian König <[email protected]> (cherry picked from commit 6c0a7c3c693ac84f8b50269a9088af8f37446863) Cc: [email protected] # 6.10.x
2024-09-02drm/amdgpu/mes: add mes mapping legacy queue switchJack Xiao4-20/+43
For mes11 old firmware has issue to map legacy queue, add a flag to switch mes to map legacy queue. Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support") Reported-by: Andrew Worsley <[email protected]> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 52491d97aadcde543986d596ed55f70bf2142851)
2024-09-02drm/amd/display: Determine IPS mode by ASIC and PMFW versionsLeo Li1-1/+25
[Why] DCN IPS interoperates with other system idle power features, such as Zstates. On DCN35, there is a known issue where system Z8 + DCN IPS2 causes a hard hang. We observe this on systems where the SBIOS allows Z8. Though there is a SBIOS fix, there's no guarantee that users will get it any time soon, or even install it. A workaround is needed to prevent this from rearing its head in the wild. [How] For DCN35, check the pmfw version to determine whether the SBIOS has the fix. If not, set IPS1+RCG as the deepest possible state in all cases except for s0ix and display off (DPMS). Otherwise, enable all IPS Signed-off-by: Leo Li <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 28d43d0895896f84c038d906d244e0a95eb243ec) Cc: [email protected]
2024-09-02drm/amdgpu/gfx10: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: use proper rlc safe mode helpersAlex Deucher1-2/+2
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: use proper rlc safe mode helpersAlex Deucher1-4/+4
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: use proper rlc safe mode helpersAlex Deucher1-2/+2
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes11: implement mmio queue reset for gfx11Jiadong Zhu1-0/+80
Implement queue reset for graphic and compute queue. v2: use amdgpu_gfx_rlc funcs to enter/exit safe mode. v3: use gfx_v11_0_request_gfx_index_mutex() v4: fix mutex handling Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes: implement amdgpu_mes_reset_hw_queue_mmioJiadong Zhu1-0/+20
The reset_queue api could be used from kfd or kgd. v2: add use_mmio parameter for mes_reset_legacy_queue. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes: modify mes api for mmio queue resetJiadong Zhu4-4/+17
Add me/pipe/queue parameters for queue reset input. v2: fix build (Alex) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: fallback to driver reset compute queue directlyAlex Deucher1-14/+79
Since the MES FW resets kernel compute queue always failed, this may caused by the KIQ failed to process unmap KCQ. So, before MES FW work properly that will fallback to driver executes dequeue and resets SPI directly. Besides, rework the ring reset function and make the busy ring type reset in each function respectively. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: add ring reset callbacksAlex Deucher1-0/+18
Add ring reset callbacks for gfx and compute. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: rework reset sequenceAlex Deucher1-7/+19
To match other GFX IPs. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: wait for reset done before remapJiadong Zhu1-11/+30
There is a racing condition that cp firmware modifies MQD in reset sequence after driver updates it for remapping. We have to wait till CP_HQD_ACTIVE becoming false then remap the queue. v2: fix KIQ locking (Alex) v3: fix KIQ locking harder (Jessie) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>