aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2024-06-14drm/amdgpu: refine gmc firmware loadingYang Wang3-19/+8
refine gmc firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine vpe firmware loadingYang Wang1-4/+2
refine vpe firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine vcn firmware loadingYang Wang1-9/+5
refine vcn firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: move aca/mca init functions into ras_init() stageYang Wang4-37/+69
adjust the function position to better match aca/mca fini code in ras_fini(). Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: fix overflowed constant warning in mmhub_set_clockgating()Bob Zhou4-4/+4
To fix potential overflowed constant warning, modify the variables to u32 for getting the return value of RREG32_SOC15(). Signed-off-by: Bob Zhou <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: Indicate CU havest info to CPHarish Kasiviswanathan1-2/+13
To achieve full occupancy CP hardware needs to know if CUs in SE are symmetrically or asymmetrically harvested v2: Reset is_symmetric_cus for each loop Signed-off-by: Harish Kasiviswanathan <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: Skip coredump during resets for debugLijo Lazar1-0/+1
Skip scheduling coredump when gpu reset is intentionally triggered through debugfs. Signed-off-by: Lijo Lazar <[email protected]> Reviewed-by: Asad Kamal <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine sdma firmware loadingYang Wang4-19/+22
refine sdma firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine psp firmware loadingYang Wang1-19/+7
refine psp firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine mes firmware loadingYang Wang1-4/+2
v1: refine mes firmware loading v2: use dev_info instead of DRM_INFO Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: Add missing locking for MES API callsMukul Joshi1-0/+12
Add missing locking at a few places when calling MES APIs to ensure exclusive access to MES queue. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Kent Russell <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amd/display: Set default brightness according to ACPIMario Limonciello2-0/+5
Currently, amdgpu will always set up the brightness at 100% when it loads. However this is jarring when the BIOS has it previously programmed to a much lower value. The ACPI ATIF method includes two members for "ac_level" and "dc_level". These represent the default values that should be used if the system is brought up in AC and DC respectively. Use these values to set up the default brightness when the backlight device is registered. v2: squash in ACPI fix Reviewed-by: Leo Li <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: drop some kernel messages in VCN codeDavid (Ming Qiang) Wu14-104/+17
Similar to commit 813e7d4cd05e where some kernel log messages are dropped. With this commit, more log messages in older version of VCN/JPEG code are dropped. Acked-by: Leo Liu <[email protected]> Signed-off-by: David (Ming Qiang) Wu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: refine gpu_info firmware loadingYang Wang1-5/+4
refine gpu_info firmware loading Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: enhance amdgpu_ucode_request() function flexibilityYang Wang2-9/+24
v1: Adding formatting string feature to improve function flexibility. v2: modify macro name to ADMGPU_UCODE_MAX_NAME. Signed-off-by: Yang Wang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: fix the overflowed constant warning for RREG32_SOC15()Bob Zhou1-3/+4
To fix potential overflowed constant warning reported by Coverity, modify the variables to uint32_t. Signed-off-by: Bob Zhou <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: add lock in amdgpu_gart_invalidate_tlbYunxiang Li1-1/+5
We need to take the reset domain lock before flush hdp. We can't put the lock inside amdgpu_device_flush_hdp itself because it is used during reset where we already take the write side lock. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: fix locking scope when flushing tlbYunxiang Li1-32/+34
Which method is used to flush tlb does not depend on whether a reset is in progress or not. We should skip flush altogether if the GPU will get reset. So put both path under reset_domain read lock. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> CC: [email protected]
2024-06-14drm/amdgpu: call flush_gpu_tlb directly in gfxhub enableYunxiang Li3-5/+7
Here since we are in reset and takes the reset_domain write side lock already. We can't use the flush tlb helper which tries to take the read side. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: use helper in amdgpu_gart_unbindYunxiang Li1-4/+1
When amdgpu_gart_invalidate_tlb helper is introduced this part was left out of the conversion. Avoid the code duplication here. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: remove tlb flush in amdgpu_gtt_mgr_recoverYunxiang Li1-2/+0
At this point the gart is not set up, there's no point to invalidate tlb here and it could even be harmful. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: fix sriov host flr handlerYunxiang Li6-52/+50
We send back the ready to reset message before we stop anything. This is wrong. Move it to when we are actually ready for the FLR to happen. In the current state since we take tens of seconds to stop everything, it is very likely that host would give up waiting and reset the GPU before we send ready, so it would be the same as before. But this gets rid of the hack with reset_domain locking and also let us tell how slow ready to reset actually is from the host. The ready to reset speed can be improved later. Signed-off-by: Yunxiang Li <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Emily Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: add skip_hw_access checks for sriovYunxiang Li1-0/+9
Accessing registers via host is missing the check for skip_hw_access and the lockdep check that comes with it. Signed-off-by: Yunxiang Li <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: add reset source in various casesEric Huang3-0/+3
To fullfill the reset event description. Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: fix NULL pointer in amdgpu_reset_get_descEric Huang1-4/+2
amdgpu_job_ring may return NULL, which causes kernel NULL pointer error, using another way to print ring name instead of ring->name. Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Eric Huang <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: remove dead code in atom_get_src_intJesse Zhang1-4/+4
Since the range of align is 0~7, the expression is: align = (attr >> 3) & 7. In the case of ATOM_ARG_IMM, the code cannot reach the default case. So there is no need for "break". Signed-off-by: Jesse Zhang <[email protected]> Suggested-by: Tim Huang <[email protected]> Reviewed-by: Tim Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: add sdma 7.0 support for copy dcc bufferFrank Min4-4/+27
1. Add dcc buffer flag for copy buffer 2. Add sdma 7.0 support copy dcc buffer Signed-off-by: Likun Gao <[email protected]> Signed-off-by: Frank Min <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: support for DCC featureLikun Gao2-0/+9
Deal with AMDGPU_GEM_CREATE_GFX12_DCC to set DCC bit when needed. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14drm/amdgpu: add additional VM bitsAlex Deucher1-0/+1
Add additional VM PTE bits. Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-14Merge drm/drm-fixes into drm-misc-fixesMaxime Ripard5-18/+27
Roll -rc3 and current drm/fixes in. This will also unstuck our for-next branch. Signed-off-by: Maxime Ripard <[email protected]>
2024-06-11Merge tag 'amd-drm-next-6.11-2024-06-07' of ↵Dave Airlie121-1498/+14182
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.11-2024-06-07: amdgpu: - DCN 4.0.x support - DCN 3.5 updates - GC 12.0 support - DP MST fixes - Cursor fixes - MES11 updates - MMHUB 4.1 support - DML2 Updates - DCN 3.1.5 fixes - IPS fixes - Various code cleanups - GMC 12.0 support - SDMA 7.0 support - SMU 13 updates - SR-IOV fixes - VCN 5.x fixes - MES12 support - SMU 14.x updates - Devcoredump improvements - Fixes for HDP flush on platforms with >4k pages - GC 9.4.3 fixes - RAS ACA updates - Silence UBSAN flex array warnings - MMHUB 3.3 updates amdkfd: - Contiguous VRAM allocations - GC 12.0 support - SDMA 7.0 support - SR-IOV fixes radeon: - Backlight workaround for iMac - Silence UBSAN flex array warnings UAPI: - GFX12 modifier and DCC support Proposed Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/29510 - KFD GFX ALU exceptions Proposed ROCdebugger changes: https://github.com/ROCm/ROCdbgapi/commit/08c760622b6601abf906f75abbc5e21d9fd425df https://github.com/ROCm/ROCgdb/commit/944fe1c1414a68700414e86e32273b6bfa62ba6f - KFD Contiguous VRAM allocation flag Proposed ROCr/HIP changes: https://github.com/ROCm/ROCT-Thunk-Interface/commit/f7b4a269914a3ab4f1e2453c2879adb97b5cc9e5 https://github.com/ROCm/ROCR-Runtime/pull/214/commits/26e8530d05a775872cb06dde6693db72be0c454a https://github.com/ROCm/clr/commit/1d48f2a1ab38b632919c4b7274899b3faf4279ff Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-10drm/amdgpu: Fix the BO release clear memory warningArunpravin Paneer Selvam2-2/+1
This happens when the amdgpu_bo_release_notify running before amdgpu_ttm_set_buffer_funcs_status set the buffer funcs to enabled. check the buffer funcs enablement before calling the fill buffer memory. v2:(Christian) - Apply it only for GEM buffers and since GEM buffers are only allocated/freed while the driver is loaded we never run into the issue to clear with buffer funcs disabled. v3:(Mario) - drop the stable tag as this will presumably go into a -fixes PR for 6.10 Log snip: *ERROR* Trying to clear memory with ring turned off. RIP: 0010:amdgpu_bo_release_notify+0x201/0x220 [amdgpu] Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Christian König <[email protected]> Tested-by: Mikhail Gavrilov <[email protected]> Tested-by: Richard Gong <[email protected]> Suggested-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-06-05drm/amdgpu: add RAS is_rma flagTao Zhou4-11/+12
Set the flag to true if bad page number reaches threshold. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: fix failure mapping legacy queue when FLRLin.Cao1-0/+2
Flag "mes.ring.shced.ready" will be set as true after mes hw init and set as false when mes hw fini to avoid duplicate initialization. But hw fini will not be called when function level reset, which will cause mes hw init be skipped during FLR, which will leads to mapping legacy queue fail. Set this flag as false when post reset will fix this issue. Signed-off-by: Lin.Cao <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdkfd: add reset cause in gpu pre-reset smi eventEric Huang3-6/+14
reset cause is requested by customer as additional info for gpu reset smi event. v2: integerate reset sources suggested by Lijo Lazar Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: Set PTE_IS_PTE bit for gfx12Frank Min1-0/+1
Set PTE_IS_PTE bit while PRT is enabled on gfx12. Signed-off-by: Frank Min <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: add reset sources in gpu reset contextEric Huang2-0/+47
reset source or reset cause is very useful info for reset context, it will be used by events API. Suggested-by: Lijo Lazar <[email protected]> Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10Shane Xiao3-21/+21
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10, clear the bits before setting the new one. Suggested-by: Alex Deucher <[email protected]> Signed-off-by: longlyao <[email protected]> Signed-off-by: Shane Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05Revert "drm/amdgpu/gfx11: enable gfx pipe1 hardware support"Alex Deucher1-3/+3
This reverts commit 6670142d25f3cc3166f2a6c8454acd310bf2776a. Pierre-Eric reported problems with this on his navi33. Revert for now until we understand what is going wrong. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-06-05drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_NV10Shane Xiao3-20/+21
This patch changes the implementation of AMDGPU_PTE_MTYPE_NV10, clear the bits before setting the new one. Suggested-by: Alex Deucher <[email protected]> Signed-off-by: longlyao <[email protected]> Signed-off-by: Shane Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: disable lane0 L1TLB and enable lane1 L1TLBYifan Zhang1-0/+13
This patch to disable lane0 L1TLB and enable lane1 L1TLB. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: init SAW registers for mmhub v3.3Yifan Zhang1-0/+40
This patch to configure mmhub3.3 SAW registers Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Mario Limonciello <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: fix comments and error message for ipdumpSunil Khatri3-8/+8
Fix comments and error messages to rightly represent the information. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: rename ip_dump_cp_queues to compute queuesSunil Khatri4-22/+22
Rename the variable ip_dump_cp_queues to ip_dump_compute_queue as it represent compute queues. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: add cp queue registers for gfx9 ipdumpSunil Khatri1-2/+108
Add gfx9 support of CP queue registers for all queues to be used by devcoredump. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Sunil Khatri <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: add print support for gfx9 ipdumpSunil Khatri1-1/+17
Add support of gfx9 ipdump print so devcoredump could trigger it to dump the captured registers in devcoredump. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: add gfx9 register support in ipdumpSunil Khatri1-1/+123
Add general registers of gfx9 in ipdump for devcoredump support. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: Fix type mismatch in amdgpu_gfx_kiq_init_ringSrinivasan Shanmugam1-1/+2
This commit fixes a type mismatch in the amdgpu_gfx_kiq_init_ring function triggered by the snprintf function expecting unsigned char arguments due to the '%hhu' format specifier, but receiving int and u32 arguments. The issue occurred because the arguments xcc_id, ring->me, ring->pipe, and ring->queue were of type int and u32, not unsigned char. This led to a type mismatch when these arguments were passed to snprintf. To resolve this, the snprintf line was modified to cast these arguments to unsigned char. This ensures that the arguments are of the correct type for the '%hhu' format specifier and resolves the warning. Fixes the below: >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:4: warning: format >> specifies type 'unsigned char' but the argument has type 'int' >> [-Wformat] xcc_id, ring->me, ring->pipe, ring->queue); ^~~~~~ >> drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:12: warning: format >> specifies type 'unsigned char' but the argument has type 'u32' (aka >> 'unsigned int') [-Wformat] xcc_id, ring->me, ring->pipe, ring->queue); ^~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:22: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] xcc_id, ring->me, ring->pipe, ring->queue); ^~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c:333:34: warning: format specifies type 'unsigned char' but the argument has type 'u32' (aka 'unsigned int') [-Wformat] xcc_id, ring->me, ring->pipe, ring->queue); ^~~~~~~~~~~ 4 warnings generated. Fixes: 0ea554455542 ("drm/amdgpu: Fix snprintf usage in amdgpu_gfx_kiq_init_ring") Reported-by: kernel test robot <[email protected]> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/ Cc: Lijo Lazar <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Signed-off-by: Srinivasan Shanmugam <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgu: fix Unintentional integer overflow for mall sizeJesse Zhang1-1/+1
Potentially overflowing expression mall_size_per_umc * adev->gmc.num_umc with type unsigned int (32 bits, unsigned) is evaluated using 32-bit arithmetic,and then used in a context that expects an expression of type u64 (64 bits, unsigned). Signed-off-by: Jesse Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-06-05drm/amdgpu: Update programming for boot error reportingHawking Zhang2-57/+46
AMDGPU_RAS_GPU_ERR_BOOT_STATUS field is no longer valid. The polling sequence is also simplifed according to the latest firmware change. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>