aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm
AgeCommit message (Collapse)AuthorFilesLines
2024-11-17Merge tag 'amd-drm-fixes-6.12-2024-11-16' of ↵Dave Airlie12-84/+36
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-11-16: amdgpu: - Revert a swsmu patch to fix a regression Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241116145320.2507156-1-alexander.deucher@amd.com
2024-11-16Revert "drm/amd/pm: correct the workload setting"Alex Deucher12-84/+36
This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-16Merge tag 'drm-xe-fixes-2024-11-14' of ↵Dave Airlie4-30/+39
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix unlock on exec ioctl error path (Matthew Brost) - Fix hibernation on LNL due to ggtt getting lost (Matthew Brost / Matthew Auld) - Fix missing runtime PM in OA release (Ashutosh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/5ntcf2ssmmvo5dsf2mdcee4guwwmpbm3xrlufgt2pdfmznzjo3@62ygo3bxkock
2024-11-15Merge tag 'amd-drm-fixes-6.12-2024-11-14' of ↵Dave Airlie17-96/+117
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-11-14: amdgpu: - PSR fix - Panel replay fixes - DML fix - vblank power fix - Fix video caps - SMU 14.0 fix - GPUVM fix - MES 12 fix - APU carve out fix - DC vbios fix - NBIO fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241114143401.448210-1-alexander.deucher@amd.com
2024-11-15Merge tag 'drm-misc-fixes-2024-11-14' of ↵Dave Airlie7-38/+71
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - tc358768: Fix DSI command tx nouveau: - Fix GSP AUX error handling - dp: Handle retires for AUX CH transfers with GSP - fw: Sync DMA after setup panthor: - Fix partial BO mappings to GPU rockchip: - vop: Avoid null-ptr deref in plane-state check vmwgfx: - Avoid null-ptr deref in surface creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241114142256.GA86810@2a02-2454-fd5e-fd00-4ce-489-4b34-bd1a.dyn6.pyur.net
2024-11-14drm/bridge: tc358768: Fix DSI command txFrancesco Dolcini1-2/+19
Wait for the command transmission to be completed in the DSI transfer function polling for the dc_start bit to go back to idle state after the transmission is started. This is documented in the datasheet and failures to do so lead to commands corruption. Fixes: ff1ca6397b1d ("drm/bridge: Add tc358768 driver") Cc: stable@vger.kernel.org Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240926141246.48282-1-francesco@dolcini.it Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240926141246.48282-1-francesco@dolcini.it
2024-11-14drm/vmwgfx: avoid null_ptr_deref in vmw_framebuffer_surface_create_handleChen Ridong1-0/+2
The 'vmw_user_object_buffer' function may return NULL with incorrect inputs. To avoid possible null pointer dereference, add a check whether the 'bo' is NULL in the vmw_framebuffer_surface_create_handle. Fixes: d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers") Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029083429.1185479-1-chenridong@huaweicloud.com
2024-11-14nouveau/dp: handle retries for AUX CH transfers with GSP.Dave Airlie1-23/+34
eb284f4b3781 drm/nouveau/dp: Honor GSP link training retry timeouts tried to fix a problem with panel retires, however it appears the auxch also needs the same treatment, so add the same retry wrapper around it. This fixes some eDP panels after a suspend/resume cycle. Fixes: eb284f4b3781 ("drm/nouveau/dp: Honor GSP link training retry timeouts") Cc: stable@vger.kernel.org Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-2-airlied@gmail.com
2024-11-14nouveau: handle EBUSY and EAGAIN for GSP aux errors.Dave Airlie2-4/+4
The upper layer transfer functions expect EBUSY as a return for when retries should be done. Fix the AUX error translation, but also check for both errors in a few places. Fixes: eb284f4b3781 ("drm/nouveau/dp: Honor GSP link training retry timeouts") Cc: stable@vger.kernel.org Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-1-airlied@gmail.com
2024-11-14nouveau: fw: sync dma after setup is called.Dave Airlie1-5/+6
When this code moved to non-coherent allocator the sync was put too early for some firmwares which called the setup function, move the sync down after the setup function. Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Reviewed-by: Lyude Paul <lyude@redhat.com> Fixes: 9b340aeb26d5 ("nouveau/firmware: use dma non-coherent allocator") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241114004603.3095485-1-airlied@gmail.com
2024-11-13drm/xe/oa: Fix "Missing outer runtime PM protection" warningAshutosh Dixit1-0/+2
Fix the following drm_WARN: [953.586396] xe 0000:00:02.0: [drm] Missing outer runtime PM protection ... <4> [953.587090] ? xe_pm_runtime_get_noresume+0x8d/0xa0 [xe] <4> [953.587208] guc_exec_queue_add_msg+0x28/0x130 [xe] <4> [953.587319] guc_exec_queue_fini+0x3a/0x40 [xe] <4> [953.587425] xe_exec_queue_destroy+0xb3/0xf0 [xe] <4> [953.587515] xe_oa_release+0x9c/0xc0 [xe] Suggested-by: John Harrison <john.c.harrison@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241109032003.3093811-1-ashutosh.dixit@intel.com (cherry picked from commit b107c63d2953907908fd0cafb0e543b3c3167b75) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13drm/xe: handle flat ccs during hibernation on igpuMatthew Auld1-1/+12
Starting from LNL, CCS has moved over to flat CCS model where there is now dedicated memory reserved for storing compression state. On platforms like LNL this reserved memory lives inside graphics stolen memory, which is not treated like normal RAM and is therefore skipped by the core kernel when creating the hibernation image. Currently if something was compressed and we enter hibernation all the corresponding CCS state is lost on such HW, resulting in corrupted memory. To fix this evict user buffers from TT -> SYSTEM to ensure we take a snapshot of the raw CCS state when entering hibernation, where upon resuming we can restore the raw CCS state back when next validating the buffer. This has been confirmed to fix display corruption on LNL when coming back from hibernation. Fixes: cbdc52c11c9b ("drm/xe/xe2: Support flat ccs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3409 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241112162827.116523-2-matthew.auld@intel.com (cherry picked from commit c8b3c6db941299d7cc31bd9befed3518fdebaf68) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13drm/xe: improve hibernation on igpuMatthew Auld2-27/+16
The GGTT looks to be stored inside stolen memory on igpu which is not treated as normal RAM. The core kernel skips this memory range when creating the hibernation image, therefore when coming back from hibernation the GGTT programming is lost. This seems to cause issues with broken resume where GuC FW fails to load: [drm] *ERROR* GT0: load failed: status = 0x400000A0, time = 10ms, freq = 1250MHz (req 1300MHz), done = -1 [drm] *ERROR* GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01 [drm] *ERROR* GT0: firmware signature verification failed [drm] *ERROR* CRITICAL: Xe has declared device 0000:00:02.0 as wedged. Current GGTT users are kernel internal and tracked as pinned, so it should be possible to hook into the existing save/restore logic that we use for dgpu, where the actual evict is skipped but on restore we importantly restore the GGTT programming. This has been confirmed to fix hibernation on at least ADL and MTL, though likely all igpu platforms are affected. This also means we have a hole in our testing, where the existing s4 tests only really test the driver hooks, and don't go as far as actually rebooting and restoring from the hibernation image and in turn powering down RAM (and therefore losing the contents of stolen). v2 (Brost) - Remove extra newline and drop unnecessary parentheses. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3275 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241101170156.213490-2-matthew.auld@intel.com (cherry picked from commit f2a6b8e396666d97ada8e8759dfb6a69d8df6380) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13drm/xe: Restore system memory GGTT mappingsMatthew Brost2-4/+11
GGTT mappings reside on the device and this state is lost during suspend / d3cold thus this state must be restored resume regardless if the BO is in system memory or VRAM. v2: - Unnecessary parentheses around bo->placements[0] (Checkpatch) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031182257.2949579-1-matthew.brost@intel.com (cherry picked from commit a19d1db9a3fa89fabd7c83544b84f393ee9b851f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13drm/xe: Ensure all locks released in exec IOCTLMatthew Brost1-2/+2
In couple of places the wrong error handling goto was used to release locks. Fix these to ensure all locks dropped on exec IOCTL errors. Cc: Francois Dugast <francois.dugast@intel.com> Fixes: d16ef1a18e39 ("drm/xe/exec: Switch hw engine group execution mode upon job submission") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106224944.30130-1-matthew.brost@intel.com (cherry picked from commit 9e7aacd8402b88394e6a83cb242901fde77a1773) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-13drm/panthor: Fix handling of partial GPU mapping of BOsAkash Goel1-0/+2
This commit fixes the bug in the handling of partial mapping of the buffer objects to the GPU, which caused kernel warnings. Panthor didn't correctly handle the case where the partial mapping spanned multiple scatterlists and the mapping offset didn't point to the 1st page of starting scatterlist. The offset variable was not cleared after reaching the starting scatterlist. Following warning messages were seen. WARNING: CPU: 1 PID: 650 at drivers/iommu/io-pgtable-arm.c:659 __arm_lpae_unmap+0x254/0x5a0 <snip> pc : __arm_lpae_unmap+0x254/0x5a0 lr : __arm_lpae_unmap+0x2cc/0x5a0 <snip> Call trace: __arm_lpae_unmap+0x254/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 arm_lpae_unmap_pages+0x80/0xa0 panthor_vm_unmap_pages+0xac/0x1c8 [panthor] panthor_gpuva_sm_step_unmap+0x4c/0xc8 [panthor] op_unmap_cb.isra.23.constprop.30+0x54/0x80 __drm_gpuvm_sm_unmap+0x184/0x1c8 drm_gpuvm_sm_unmap+0x40/0x60 panthor_vm_exec_op+0xa8/0x120 [panthor] panthor_vm_bind_exec_sync_op+0xc4/0xe8 [panthor] panthor_ioctl_vm_bind+0x10c/0x170 [panthor] drm_ioctl_kernel+0xbc/0x138 drm_ioctl+0x210/0x4b0 __arm64_sys_ioctl+0xb0/0xf8 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.1+0x98/0xf8 do_el0_svc+0x24/0x38 el0_svc+0x34/0xc8 el0t_64_sync_handler+0xa0/0xc8 el0t_64_sync+0x174/0x178 <snip> panthor : [drm] drm_WARN_ON(unmapped_sz != pgsize * pgcount) WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/panthor/panthor_mmu.c:922 panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> pc : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] lr : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> panthor : [drm] *ERROR* failed to unmap range ffffa388f000-ffffa3890000 (requested range ffffa388c000-ffffa3890000) Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111134720.780403-1-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
2024-11-12drm/amd: Fix initialization mistake for NBIO 7.7.0Vijendar Mukunda1-0/+6
There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME events while in the D0 state. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 447a54a0f79c9a409ceaa17804bdd2e0206397b9) Cc: stable@vger.kernel.org
2024-11-12Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"Alex Deucher1-3/+1
This reverts commit 694c79769cb384bca8b1ec1d1e84156e726bd106. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 3c2296b1eec55b50c64509ba15406142d4a958dc) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12drm/amd/display: Fix failure to read vram info due to static BP_RESULTHamish Claxton1-1/+1
The static declaration causes the check to fail. Remove it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Fixes: 00c391102abc ("drm/amd/display: Add misc DC changes for DCN401") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 91314e7dfd83345b8b820b782b2511c9c32866cd) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12drm/amdgpu: enable GTT fallback handling for dGPUs onlyChristian König1-1/+2
That is just a waste of time on APUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704 Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e8fc090d322346e5ce4c4cfe03a8100e31f61c3c) Cc: stable@vger.kernel.org
2024-11-12drm/i915: Grab intel_display from the encoder to avoid potential oopsiesVille Syrjälä1-2/+2
Grab the intel_display from 'encoder' rather than 'state' in the encoder hooks to avoid the massive footgun that is intel_sanitize_encoder(), which passes NULL as the 'state' argument to encoder .disable() and .post_disable(). TODO: figure out how to actually fix intel_sanitize_encoder()... Fixes: ab0b0eb5c85c ("drm/i915/tv: convert to struct intel_display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241107161123.16269-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit dc3806d9eb66d0105f8d55d462d4ef681d9eac59) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-11-12drm/i915/gsc: ARL-H and ARL-U need a newer GSC FW.Daniele Ceraolo Spurio4-26/+60
All MTL and ARL SKUs share the same GSC FW, but the newer platforms are only supported in newer blobs. In particular, ARL-S is supported starting from 102.0.10.1878 (which is already the minimum required version for ARL in the code), while ARL-H and ARL-U are supported from 102.1.15.1926. Therefore, the driver needs to check which specific ARL subplatform its running on when verifying that the GSC FW is new enough for it. Fixes: 2955ae8186c8 ("drm/i915: ARL requires a newer GSC firmware") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241028233132.149745-1-daniele.ceraolospurio@intel.com (cherry picked from commit 3c1d5ced18db8a67251c8436cf9bdc061f972bdb) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2024-11-11drm/amdgpu/mes12: correct kiq unmap latencyJack Xiao1-1/+1
Correct kiq unmap queue timeout value. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cfe98204a06329b6b7fce1b828b7d620473181ff) Cc: stable@vger.kernel.org # 6.11.x
2024-11-11drm/amdgpu: fix check in gmc_v9_0_get_vm_pte()Christian König1-5/+8
The coherency flags can only be determined when the BO is locked and that in turn is only guaranteed when the mapping is validated. Fix the check, move the resource check into the function and add an assert that the BO is locked. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1b4ca8546f5b5c482717bedb8e031227b1541539) Cc: stable@vger.kernel.org
2024-11-11drm/amd/pm: print pp_dpm_mclk in ascending order on SMU v14.0.0Tim Huang1-2/+3
Currently, the pp_dpm_mclk values are reported in descending order on SMU IP v14.0.0/1/4. Adjust to ascending order for consistency with other clock interfaces. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d4be16ccfd5bf822176740a51ff2306679a2247e) Cc: stable@vger.kernel.org
2024-11-11drm/amdgpu: Fix video caps for H264 and HEVC encode maximum sizeDavid Rosca5-19/+19
H264 supports 4096x4096 starting from Polaris. HEVC also supports 4096x4096, with VCN 3 and newer 8192x4352 is supported. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 69e9a9e65b1ea542d07e3fdd4222b46e9f5a3a29) Cc: stable@vger.kernel.org
2024-11-11drm/amd/display: Adjust VSDB parser for replay featureRodrigo Siqueira1-1/+1
At some point, the IEEE ID identification for the replay check in the AMD EDID was added. However, this check causes the following out-of-bounds issues when using KASAN: [ 27.804016] BUG: KASAN: slab-out-of-bounds in amdgpu_dm_update_freesync_caps+0xefa/0x17a0 [amdgpu] [ 27.804788] Read of size 1 at addr ffff8881647fdb00 by task systemd-udevd/383 ... [ 27.821207] Memory state around the buggy address: [ 27.821215] ffff8881647fda00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821224] ffff8881647fda80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821234] >ffff8881647fdb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 27.821243] ^ [ 27.821250] ffff8881647fdb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 27.821259] ffff8881647fdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821268] ================================================================== This is caused because the ID extraction happens outside of the range of the edid lenght. This commit addresses this issue by considering the amd_vsdb_block size. Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b7e381b1ccd5e778e3d9c44c669ad38439a861d8) Cc: stable@vger.kernel.org
2024-11-11drm/amd/display: Require minimum VBlank size for stutter optimizationDillon Varone1-2/+9
If the nominal VBlank is too small, optimizing for stutter can cause the prefetch bandwidth to increase drasticaly, resulting in higher clock and power requirements. Only optimize if it is >3x the stutter latency. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 003215f962cdf2265f126a3f4c9ad20917f87fca) Cc: stable@vger.kernel.org
2024-11-11drm/amd/display: Handle dml allocation failure to avoid crashRyan Seto1-0/+3
[Why] In the case where a dml allocation fails for any reason, the current state's dml contexts would no longer be valid. Then subsequent calls dc_state_copy_internal would shallow copy invalid memory and if the new state was released, a double free would occur. [How] Reset dml pointers in new_state to NULL and avoid invalid pointer Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Ryan Seto <ryanseto@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bcafdc61529a48f6f06355d78eb41b3aeda5296c) Cc: stable@vger.kernel.org
2024-11-11drm/amd/display: Fix Panel Replay not update screen correctlyTom Chung2-57/+59
[Why] In certain use case such as KDE login screen, there will be no atomic commit while do the frame update. If the Panel Replay enabled, it will cause the screen not updated and looks like system hang. [How] Delay few atomic commits before enabled the Panel Replay just like PSR. Fixes: be64336307a6c ("drm/amd/display: Re-enable panel replay feature") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3686 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3682 Tested-By: Corey Hickey <bugfood-c@fatooh.org> Tested-By: James Courtier-Dutton <james.dutton@gmail.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ca628f0eddd73adfccfcc06b2a55d915bca4a342) Cc: stable@vger.kernel.org # 6.11+
2024-11-11drm/amd/display: Change some variable name of psrTom Chung4-14/+14
Panel Replay feature may also use the same variable with PSR. Change the variable name and make it not specify for PSR. Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c7fafb7a46b38a11a19342d153f505749bf56f3e) Cc: stable@vger.kernel.org # 6.11+
2024-11-11Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann29-150/+165
Backmerging to get fixes from v6.12-rc7. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-11-09drm/rockchip: vop: Fix a dereferenced before check warningAndy Yan1-4/+4
The 'state' can't be NULL, we should check crtc_state. Fix warning: drivers/gpu/drm/rockchip/rockchip_drm_vop.c:1096 vop_plane_atomic_async_check() warn: variable dereferenced before check 'state' (see line 1077) Fixes: 5ddb0bd4ddc3 ("drm/atomic: Pass the full state to planes async atomic check and update") Signed-off-by: Andy Yan <andy.yan@rock-chips.com> Signed-off-by: Heiko Stuebner <heiko@sntech.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241021072818.61621-1-andyshrk@163.com
2024-11-09Merge tag 'drm-xe-fixes-2024-11-08' of ↵Dave Airlie12-41/+54
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix ccs_mode setting for Xe2 and later (Balasubramani) - Synchronize ccs_mode setting with client creation (Balasubramani) - Apply scheduling WA for LNL in additional places as needed (Nirmoy) - Fix leak and lock handling in error paths of xe_exec ioctl (Matthew Brost) - Fix GGTT allocation leak leading to eventual crash in SR-IOV (Michal Wajdeczko) - Move run_ticks update out of job handling to avoid synchronization with reader (Lucas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/4ffcebtluaaaohquxfyf5babpihmtscxwad3jjmt5nggwh2xpm@ztw67ucywttg
2024-11-09Merge tag 'drm-misc-fixes-2024-11-08' of ↵Dave Airlie9-5/+92
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: imagination: - Track PVR context per file - Break ref-counting cycle panel-orientation-quirks: - Fix matching Lenovo Yoga Tab 3 X90F panthor: - Lock VM array - Be strict about I/O mapping flags Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241108085058.GA37468@linux.fritz.box
2024-11-07drm/panthor: Be stricter about IO mapping flagsJann Horn1-0/+4
The current panthor_device_mmap_io() implementation has two issues: 1. For mapping DRM_PANTHOR_USER_FLUSH_ID_MMIO_OFFSET, panthor_device_mmap_io() bails if VM_WRITE is set, but does not clear VM_MAYWRITE. That means userspace can use mprotect() to make the mapping writable later on. This is a classic Linux driver gotcha. I don't think this actually has any impact in practice: When the GPU is powered, writes to the FLUSH_ID seem to be ignored; and when the GPU is not powered, the dummy_latest_flush page provided by the driver is deliberately designed to not do any flushes, so the only thing writing to the dummy_latest_flush could achieve would be to make *more* flushes happen. 2. panthor_device_mmap_io() does not block MAP_PRIVATE mappings (which are mappings without the VM_SHARED flag). MAP_PRIVATE in combination with VM_MAYWRITE indicates that the VMA has copy-on-write semantics, which for VM_PFNMAP are semi-supported but fairly cursed. In particular, in such a mapping, the driver can only install PTEs during mmap() by calling remap_pfn_range() (because remap_pfn_range() wants to **store the physical address of the mapped physical memory into the vm_pgoff of the VMA**); installing PTEs later on with a fault handler (as panthor does) is not supported in private mappings, and so if you try to fault in such a mapping, vmf_insert_pfn_prot() splats when it hits a BUG() check. Fix it by clearing the VM_MAYWRITE flag (userspace writing to the FLUSH_ID doesn't make sense) and requiring VM_SHARED (copy-on-write semantics for the FLUSH_ID don't make sense). Reproducers for both scenarios are in the notes of my patch on the mailing list; I tested that these bugs exist on a Rock 5B machine. Note that I only compile-tested the patch, I haven't tested it; I don't have a working kernel build setup for the test machine yet. Please test it before applying it. Cc: stable@vger.kernel.org Fixes: 5fe909cae118 ("drm/panthor: Add the device logical block") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105-panthor-flush-page-fixes-v1-1-829aaf37db93@google.com
2024-11-07drm/panthor: Lock XArray when getting entries for the VMLiviu Dudau1-0/+2
Similar to commit cac075706f29 ("drm/panthor: Fix race when converting group handle to group object") we need to use the XArray's internal locking when retrieving a vm pointer from there. v2: Removed part of the patch that was trying to protect fetching the heap pointer from XArray, as that operation is protected by the @pool->lock. Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Reported-by: Jann Horn <jannh@google.com> Cc: stable@vger.kernel.org Signed-off-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106185806.389089-1-liviu.dudau@arm.com
2024-11-07drm: panel-orientation-quirks: Make Lenovo Yoga Tab 3 X90F DMI match less strictHans de Goede1-1/+0
There are 2G and 4G RAM versions of the Lenovo Yoga Tab 3 X90F and it turns out that the 2G version has a DMI product name of "CHERRYVIEW D1 PLATFORM" where as the 4G version has "CHERRYVIEW C0 PLATFORM". The sys-vendor + product-version check are unique enough that the product-name check is not necessary. Drop the product-name check so that the existing DMI match for the 4G RAM version also matches the 2G RAM version. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240825132131.6643-1-hdegoede@redhat.com
2024-11-05drm/xe: Stop accumulating LRC timestamp on job_freeLucas De Marchi2-2/+6
The exec queue timestamp is only really useful when it's being queried through the fdinfo. There's no need to update it so often, on every job_free. Tracing a simple app like vkcube running shows an update rate of ~ 120Hz. In case of discrete, the BO is on vram, creating a lot of pcie transactions. The update on job_free() is used to cover a gap: if exec queue is created and destroyed rapidly, before a new query, the timestamp still needs to be accumulated and accounted for in the xef. Initial implementation in commit 6109f24f87d7 ("drm/xe: Add helper to accumulate exec queue runtime") couldn't do it on the exec_queue_fini since the xef could be gone at that point. However since commit ce8c161cbad4 ("drm/xe: Add ref counting for xe_file") the xef is refcounted and the exec queue always holds a reference, making this safe now. Improve the fix in commit 2149ded63079 ("drm/xe: Fix use after free when client stats are captured") by reducing the frequency in which the update is needed. Fixes: 2149ded63079 ("drm/xe: Fix use after free when client stats are captured") Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104143815.2112272-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 83db047d9425d9a649f01573797558eff0f632e1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-05drm/xe/pf: Fix potential GGTT allocation leakMichal Wajdeczko1-1/+3
In unlikely event that we fail during sending the new VF GGTT configuration to the GuC, we will free only the GGTT node data struct but will miss to release the actual GGTT allocation. This will later lead to list corruption, GGTT space leak and finally risking crash when unloading the driver: [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-EIO) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] list_add corruption. next->prev should be prev (ffff88813cfcd628), but was 0000000000000000. (next=ffff88813cfe2028). [ ] RIP: 0010:__list_add_valid_or_report+0x6b/0xb0 [ ] Call Trace: [ ] drm_mm_insert_node_in_range+0x2c0/0x4e0 [ ] xe_ggtt_node_insert+0x46/0x70 [xe] [ ] pf_provision_vf_ggtt+0x7f5/0xa70 [xe] [ ] xe_gt_sriov_pf_config_set_ggtt+0x5e/0x770 [xe] [ ] ggtt_set+0x4b/0x70 [xe] [ ] simple_attr_write_xsigned.constprop.0.isra.0+0xb0/0x110 [ ] ... [drm] GT0: PF: Failed to provision VF1 with 1073741824 (1.00 GiB) GGTT (-ENOSPC) [ ] ... [drm] GT0: PF: VF1 provisioning remains at 0 (0 B) GGTT [ ] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP NOPTI [ ] RIP: 0010:drm_mm_remove_node+0x1b7/0x390 [ ] Call Trace: [ ] <TASK> [ ] ? die_addr+0x2e/0x80 [ ] ? exc_general_protection+0x1a1/0x3e0 [ ] ? asm_exc_general_protection+0x22/0x30 [ ] ? drm_mm_remove_node+0x1b7/0x390 [ ] ggtt_node_remove+0xa5/0xf0 [xe] [ ] xe_ggtt_node_remove+0x35/0x70 [xe] [ ] xe_ttm_bo_destroy+0x123/0x220 [xe] [ ] intel_user_framebuffer_destroy+0x44/0x70 [xe] [ ] intel_plane_destroy_state+0x3b/0xc0 [xe] [ ] drm_atomic_state_default_clear+0x1cd/0x2f0 [ ] intel_atomic_state_clear+0x9/0x20 [xe] [ ] __drm_atomic_state_free+0x1d/0xb0 Fix that by using pf_release_ggtt() on the error path, which now works regardless if the node has GGTT allocation or not. Fixes: 34e804220f69 ("drm/xe: Make xe_ggtt_node struct independent") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104144901.1903-1-michal.wajdeczko@intel.com (cherry picked from commit 43b1dd2b550f0861ce80fbfffd5881b1b26272b1) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-05drm/xe: Drop VM dma-resv lock on xe_sync_in_fence_get failure in exec IOCTLMatthew Brost1-0/+1
Upon failure all locks need to be dropped before returning to the user. Fixes: 58480c1c912f ("drm/xe: Skip VMAs pin when requesting signal to the last XE_EXEC") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-3-matthew.brost@intel.com (cherry picked from commit 7d1a4258e602ffdce529f56686925034c1b3b095) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-05drm/xe: Fix possible exec queue leak in exec IOCTLMatthew Brost1-4/+8
In a couple of places after an exec queue is looked up the exec IOCTL returns on input errors without dropping the exec queue ref. Fix this ensuring the exec queue ref is dropped on input error. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable@vger.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241105043524.4062774-2-matthew.brost@intel.com (cherry picked from commit 07064a200b40ac2195cb6b7b779897d9377e5e6f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-11-05drm/amdgpu: add missing size check in amdgpu_debugfs_gprwave_read()Alex Deucher1-1/+1
Avoid a possible buffer overflow if size is larger than 4K. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f5d873f5825b40d886d03bd2aede91d4cf002434) Cc: stable@vger.kernel.org
2024-11-05drm/amdgpu: Adjust debugfs eviction and IB access permissionsAlex Deucher1-3/+3
Users should not be able to run these. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ba9395430f611cfc101b1c2687732baafa239d5) Cc: stable@vger.kernel.org
2024-11-05drm/amdgpu: Adjust debugfs register access permissionsAlex Deucher1-1/+1
Regular users shouldn't have read access. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c0cfd2e652553d607b910be47d0cc5a7f3a78641) Cc: stable@vger.kernel.org
2024-11-05drm/amdgpu: Fix DPX valid mode check on GC 9.4.3Lijo Lazar1-1/+1
For DPX mode, the number of memory partitions supported should be less than or equal to 2. Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 990c4f580742de7bb78fa57420ffd182fc3ab4cd) Cc: stable@vger.kernel.org
2024-11-05Merge drm/drm-fixes into drm-misc-fixesThomas Zimmermann19-71/+258
Backmerging to get the latest fixes from v6.12-rc6. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-11-04drm/amd/pm: correct the workload settingKenneth Feng12-36/+84
Correct the workload setting in order not to mix the setting with the end user. Update the workload mask accordingly. v2: changes as below: 1. the end user can not erase the workload from driver except default workload. 2. always shows the real highest priority workoad to the end user. 3. the real workload mask is combined with driver workload mask and end user workload mask. v3: apply this to the other ASICs as well. v4: simplify the code v5: refine the code based on the review comments. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8cc438be5d49b8326b2fcade0bdb7e6a97df9e0b) Cc: stable@vger.kernel.org # 6.11.x
2024-11-04drm/amd/pm: always pick the pptable from IFWIKenneth Feng1-64/+1
always pick the pptable from IFWI on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 136ce12bd5907388cb4e9aa63ee5c9c8c441640b) Cc: stable@vger.kernel.org # 6.11.x
2024-11-04drm/amdgpu: prevent NULL pointer dereference if ATIF is not supportedAntonio Quartulli1-2/+2
acpi_evaluate_object() may return AE_NOT_FOUND (failure), which would result in dereferencing buffer.pointer (obj) while being NULL. Although this case may be unrealistic for the current code, it is still better to protect against possible bugs. Bail out also when status is AE_NOT_FOUND. This fixes 1 FORWARD_NULL issue reported by Coverity Report: CID 1600951: Null pointer dereferences (FORWARD_NULL) Signed-off-by: Antonio Quartulli <antonio@mandelbit.com> Fixes: c9b7c809b89f ("drm/amd: Guard against bad data for ATIF ACPI method") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20241031152848.4716-1-antonio@mandelbit.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 91c9e221fe2553edf2db71627d8453f083de87a1) Cc: stable@vger.kernel.org