aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-03-20drm/amdgpu: retire gfx ras query_utcl2_poison_statusTao Zhou6-21/+30
Replace it with related interface in gfxhub functions. v2: replace node id with xcc id. get node id for query_utcl2_poison_status Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: add ring buffer information in devcoredumpSunil Khatri1-0/+21
Add relevant ringbuffer information such as rptr, wptr,rb mask, ring name, ring size and also the rings content for each ring on a gpu reset. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Fix function banner for amdgpu_dm_psr_disable_all()Roman Li1-1/+1
[WHY] Incorrect function name in function banner. [HOW] Correct name and brief description. Reviewed-by: Hersen Wu <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Roman Li <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: add vm fault information to devcoredumpSunil Khatri1-0/+12
Add page fault information to the devcoredump. Output of devcoredump: **** AMDGPU Device Coredump **** version: 1 kernel: 6.7.0-amd-staging-drm-next module: amdgpu time: 29.725011811 process_name: soft_recovery_p PID: 1720 Ring timed out details IP Type: 0 Ring Name: gfx_0.0.0 [gfxhub] Page fault observed Faulty page starting at address: 0x0000000000000000 Protection fault status register: 0x301031 VRAM is lost due to GPU reset! Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: add utcl2 poison query for gfxhubTao Zhou3-0/+34
Implement it for gfxhub 1.0 and 1.2. v2: input logical xcc id for poison query interface. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: add new bit definitions for GC 9.0 PROTECTION_FAULT_STATUSTao Zhou1-0/+4
Add UCE and FED bit definitions. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Add DML2 folder to include pathAurabindo Pillai1-0/+1
Add DML2 compilation rule in the Makefile. Reviewed-by: Chaitanya Dhere <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: add recent pagefault info in vm_managerSunil Khatri2-0/+10
Currently page fault information is stored per vm and which could be freed or stale during reset. Add it pagefault information in the vm_manager which is a global space for vm's and remains valid across. Signed-off-by: Sunil Khatri <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Add some forward declarationsAurabindo Pillai1-0/+2
[WHAT] Add DML2 pipe and config struct forward declaration as a preparation for DML2. Reviewed-by: Chaitanya Dhere <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Revert Add left edge pixel + ODM pipe splitGabe Teeger5-57/+0
This reverts commit e2fdd5c5257d ("drm/amd/display: Add left edge pixel for YCbCr422/420 + ODM pipe split") Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: George Shen <[email protected]> Reviewed-by: Charlene Liu <[email protected]> Reviewed-by: Jun Lei <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Gabe Teeger <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Add left edge pixel for YCbCr422/420 + ODM pipe splitGeorge Shen5-0/+57
[WHY] Currently 3-tap chroma subsampling is used for YCbCr422/420. When ODM pipesplit is used, pixels on the left edge of ODM slices need one extra pixel from the right edge of the previous slice to calculate the correct chroma value. Without this change, the chroma value is slightly different than expected. This is usually imperceptible visually, but it impacts test pattern CRCs for compliance test automation. [HOW] Update logic to use the register for adding extra left edge pixel for YCbCr422/420 ODM cases. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: Alvin Lee <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: George Shen <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: revert Exit idle optimizations before HDCP executionMartin Leung2-18/+0
why and how: causes black screen on PNP on DCN 3.5 This reverts commit f30a3bea92bd ("drm/amd/display: Exit idle optimizations before HDCP execution") Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Martin Leung <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Exit idle optimizations before HDCP executionNicholas Kazlauskas2-0/+18
[WHY] PSP can access DCN registers during command submission and we need to ensure that DCN is not in PG before doing so. [HOW] Add a callback to DM to lock and notify DC for idle optimization exit. It can't be DC directly because of a potential race condition with the link protection thread and the rest of DM operation. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Reviewed-by: Charlene Liu <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu/pm: Don't use OD table on ArcturusMa Jun1-28/+5
OD is not supported on Arcturus, so the OD table should not be used. Signed-off-by: Ma Jun <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: drop setting buffer funcs in sdma442Le Ma1-22/+1
To fix the entity rq NULL issue. This setting has been moved to upper level. Fixes: b70438004a14 ("drm/amdgpu: move buffer funcs setting up a level") Signed-off-by: Le Ma <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Fix noise issue on HDMI AV muteLeo Ma1-1/+11
[Why] When mode switching is triggered there is momentary noise visible on some HDMI TV or displays. [How] Wait for 2 frames to make sure we have enough time to send out AV mute and sink receives a full frame. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Wenjing Liu <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Leo Ma <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Revert Remove pixle rate limit for subvpWenjing Liu1-0/+1
This reverts commit 340383c734f8 ("drm/amd/display: Remove pixle rate limit for subvp") [why] The original commit causes a regression when subvp is applied on ODM required 8k60hz timing. The display shows black screen on boot. The issue can be recovered with hotplug. It also causes MPO to fail. We will temprarily revert this commit and investigate the root cause further. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Chaitanya Dhere <[email protected]> Reviewed-by: Martin Leung <[email protected]> Acked-by: Wayne Lin <[email protected]> Signed-off-by: Wenjing Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20Revert "drm/amdgpu/vpe: don't emit cond exec command under collaborate mode"Lang Yu1-3/+0
Ready now. Remove this workaround. This reverts commit d40f6213b52c161fd4634933acbc32103a283363. Signed-off-by: Lang Yu <[email protected]> Tested-by: Alan Liu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20Revert "drm/amd/amdgpu: Fix potential ioremap() memory leaks in ↵Ma Jun1-10/+6
amdgpu_device_init()" This patch causes the following iounmap erorr and calltrace iounmap: bad address 00000000d0b3631f The original patch was unjustified because amdgpu_device_fini_sw() will always cleanup the rmmio mapping. This reverts commit eb4f139888f636614dab3bcce97ff61cefc4b3a7. Signed-off-by: Ma Jun <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Add a dc_state NULL check in dc_state_releaseAllen Pan1-1/+2
[How] Check wheather state is NULL before releasing it. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Charlene Liu <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Allen Pan <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Return the correct HDCP error codeRodrigo Siqueira1-0/+3
[WHY & HOW] If the display is null when creating an HDCP session, return a proper error code. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Acked-by: Alex Hung <[email protected]> Signed-off-by: Rodrigo Siqueira <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Implement wait_for_odm_update_pending_completeWenjing Liu11-4/+90
[WHY] Odm update is doubled buffered. We need to wait for ODM update to be completed before optimizing bandwidth or programming new udpates. [HOW] implement wait_for_odm_update_pending_complete function to wait for: 1. odm configuration update is no longer pending in timing generator. 2. no pending dpg pattern update for each active OPP. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Wenjing Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Lock all enabled otg pipes even with no planesWenjing Liu3-1/+26
[WHY] On DCN32 we support dynamic ODM even when OTG is blanked. When ODM configuration is dynamically changed and the OTG is on blank pattern, we will need to reprogram OPP's test pattern based on new ODM configuration. Therefore we need to lock the OTG pipe to avoid temporary corruption when we are reprogramming OPP blank patterns. [HOW] Add a new interdependent update lock implementation to lock all enabled OTG pipes even when there is no plane on the OTG for DCN32. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Wenjing Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Amend coasting vtotal for replay low hzChunTao Tso7-10/+18
[WHY] The original coasting vtotal is 2 bytes, and it need to be amended to 4 bytes because low hz case. [HOW] Amend coasting vtotal from 2 bytes to 4 bytes. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: ChunTao Tso <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Fix idle check for shared firmware stateNicholas Kazlauskas1-9/+3
[WHY] We still had an instance of get_idle_state checking the PMFW scratch register instead of the actual idle allow signal. [HOW] Replace it with the SW state check for whether we had allowed idle through notify_idle. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Duncan Ma <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Nicholas Kazlauskas <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Update odm when ODM combine is changed on an otg master ↵Wenjing Liu2-20/+28
pipe with no plane [WHY] When committing an update with ODM combine change when the plane is removing or already removed, we fail to detect odm change in pipe update flags. This has caused mismatch between new dc state and the actual hardware state, because we missed odm programming. [HOW] - Detect odm change even for otg master pipe without a plane. - Update odm config before calling program pipes for pipe with planes. The commit also updates blank pattern programming when odm is changed without plane. This is because number of OPP is changed when ODM combine is changed. Blank pattern is per OPP so we will need to reprogram OPP based on the new pipe topology. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Dillon Varone <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Wenjing Liu <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Init DPPCLK from SMU on dcn32Dillon Varone5-8/+41
[WHY & HOW] DPPCLK ranges should be obtained from the SMU when available. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Chaitanya Dhere <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Dillon Varone <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Add monitor patch for specific eDPRyan Lin1-2/+4
[WHY] Some eDP panels' ext caps don't write initial values. The value of dpcd_addr (0x317) can be random and the backlight control interface will be incorrect. [HOW] Add new panel patches to remove sink ext caps. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] # 6.5.x Cc: Tsung-hua Lin <[email protected]> Cc: Chris Chi <[email protected]> Reviewed-by: Wayne Lin <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Ryan Lin <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Allow dirty rects to be sent to dmub when abm is activeJosip Pavic1-0/+3
[WHY] It's beneficial for ABM to know when new frame data are available. [HOW] Add new condition to allow dirty rects to be sent to DMUB when ABM is active. ABM will use this as a signal that a new frame has arrived. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Anthony Koo <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Josip Pavic <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Override min required DCFCLK in dml1_validateSohaib Nadeem3-0/+10
[WHY]: Increasing min DCFCLK addresses underflow issues that occur when phantom pipe is turned on for some Sub-Viewport configs [HOW]: dcn32_override_min_req_dcfclk is added to override DCFCLK value in dml1_validate when subviewport is being used. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Alvin Lee <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Sohaib Nadeem <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: Bypass display ta if display hw is not availableHawking Zhang1-0/+18
Do not load/invoke display TA if display hardware is not available. Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: correct the KGQ fallback messagePrike Liang1-1/+1
Fix the KGQ fallback function name, as this will help differentiate the failure in the KCQ enablement. Signed-off-by: Prike Liang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu/pm: Check the validity of overdiver power limitMa Jun5-26/+37
Check the validity of overdriver power limit before using it. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <[email protected]> Suggested-by: Lazar Lijo <[email protected]> Suggested-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-03-20drm/amdgpu/pm: Fix NULL pointer dereference when get power limitMa Jun5-31/+41
Because powerplay_table initialization is skipped under sriov case, We check and set default lower and upper OD value if powerplay_table is NULL. Fixes: 7968e9748fbb ("drm/amdgpu/pm: Fix the power1_min_cap value") Signed-off-by: Ma Jun <[email protected]> Reported-by: Yin Zhenguo <[email protected]> Suggested-by: Lazar Lijo <[email protected]> Suggested-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2024-03-20drm/amdgpu: Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOVZhenGuo Yin2-2/+9
[Why] RLCG interface returns "out-of-range" error under SRIOV VF when accessing PF-only registers. [How] Skip access PF-only registers on gfx10/gfxhub2_1 under SRIOV. Acked-by: Alex Deucher <[email protected]> Signed-off-by: ZhenGuo Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: Init zone device and drm client after mode-1 reset on reloadAhmad Rehman2-2/+5
In passthrough environment, when amdgpu is reloaded after unload, mode-1 is triggered after initializing the necessary IPs, That init does not include KFD, and KFD init waits until the reset is completed. KFD init is called in the reset handler, but in this case, the zone device and drm client is not initialized, causing app to create kernel panic. v2: Removing the init KFD condition from amdgpu_amdkfd_drm_client_create. As the previous version has the potential of creating DRM client twice. v3: v2 patch results in SDMA engine hung as DRM open causes VM clear to SDMA before SDMA init. Adding the condition to in drm client creation, on top of v1, to guard against drm client creation call multiple times. Signed-off-by: Ahmad Rehman <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: amdgpu_ttm_gart_bind set gtt bound flagPhilip Yang1-0/+1
Otherwise after the GTT bo is released, the GTT and gart space is freed but amdgpu_ttm_backend_unbind will not clear the gart page table entry and leave valid mapping entry pointing to the stale system page. Then if GPU access the gart address mistakely, it will read undefined value instead page fault, harder to debug and reproduce the real issue. Cc: [email protected] Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu/vcn: enable vcn1 fw load for VCN 4_0_6Saleemkhan Jamadar10-38/+52
v1 - update the fw header for each vcn instance (Veera) VCN1 has different FW binary in VCN v4_0_6. Add changes to load the VCN1 fw binary Signed-off-by: Saleemkhan Jamadar <[email protected]> Reviewed-by: Veerabadhran Gopalakrishnan <[email protected]> Reviewed-by: Leo Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Enable DML2 debug flagsAurabindo Pillai1-0/+3
[WHY & HOW] Enable DML2 related debug config options in DM for testing purposes. Reviewed-by: Chaitanya Dhere <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Aurabindo Pillai <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Change default size for dummy plane in DML2Swapnil Patel1-3/+15
[WHY & HOW] Currently, to map dc states into dml_display_cfg, We create a dummy plane if the stream doesn't have any planes attached to it. This dummy plane uses max addersable width height. This results in certain mode validations failing when they shouldn't. Cc: Mario Limonciello <[email protected]> Cc: Alex Deucher <[email protected]> Cc: [email protected] Reviewed-by: Chaitanya Dhere <[email protected]> Acked-by: Alex Hung <[email protected]> Signed-off-by: Swapnil Patel <[email protected]> Tested-by: Daniel Wheeler <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: Reset IH OVERFLOW_EN bit for IH 7.0Friedrich Vock1-0/+6
IH 7.0 support landed shortly after the original patch for resetting the bit on all other generations, but without that patch applied. Fixes: 12443fc53e7d ("drm/amdgpu: Add ih v7_0 ip block support") Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Friedrich Vock <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: fix mmhub client id out-of-bounds accessLang Yu1-4/+3
Properly handle cid 0x140. Fixes: aba2be41470a ("drm/amdgpu: add mmhub 3.3.0 support") Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Yifan Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: fix use-after-free bugVitaly Prosyak1-4/+16
The bug can be triggered by sending a single amdgpu_gem_userptr_ioctl to the AMDGPU DRM driver on any ASICs with an invalid address and size. The bug was reported by Joonkyo Jung <[email protected]>. For example the following code: static void Syzkaller1(int fd) { struct drm_amdgpu_gem_userptr arg; int ret; arg.addr = 0xffffffffffff0000; arg.size = 0x80000000; /*2 Gb*/ arg.flags = 0x7; ret = drmIoctl(fd, 0xc1186451/*amdgpu_gem_userptr_ioctl*/, &arg); } Due to the address and size are not valid there is a failure in amdgpu_hmm_register->mmu_interval_notifier_insert->__mmu_interval_notifier_insert-> check_shl_overflow, but we even the amdgpu_hmm_register failure we still call amdgpu_hmm_unregister into amdgpu_gem_object_free which causes access to a bad address. The following stack is below when the issue is reproduced when Kazan is enabled: [ +0.000014] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000009] RIP: 0010:mmu_interval_notifier_remove+0x327/0x340 [ +0.000017] Code: ff ff 49 89 44 24 08 48 b8 00 01 00 00 00 00 ad de 4c 89 f7 49 89 47 40 48 83 c0 22 49 89 47 48 e8 ce d1 2d 01 e9 32 ff ff ff <0f> 0b e9 16 ff ff ff 4c 89 ef e8 fa 14 b3 ff e9 36 ff ff ff e8 80 [ +0.000014] RSP: 0018:ffffc90002657988 EFLAGS: 00010246 [ +0.000013] RAX: 0000000000000000 RBX: 1ffff920004caf35 RCX: ffffffff8160565b [ +0.000011] RDX: dffffc0000000000 RSI: 0000000000000004 RDI: ffff8881a9f78260 [ +0.000010] RBP: ffffc90002657a70 R08: 0000000000000001 R09: fffff520004caf25 [ +0.000010] R10: 0000000000000003 R11: ffffffff8161d1d6 R12: ffff88810e988c00 [ +0.000010] R13: ffff888126fb5a00 R14: ffff88810e988c0c R15: ffff8881a9f78260 [ +0.000011] FS: 00007ff9ec848540(0000) GS:ffff8883cc880000(0000) knlGS:0000000000000000 [ +0.000012] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000010] CR2: 000055b3f7e14328 CR3: 00000001b5770000 CR4: 0000000000350ef0 [ +0.000010] Call Trace: [ +0.000006] <TASK> [ +0.000007] ? show_regs+0x6a/0x80 [ +0.000018] ? __warn+0xa5/0x1b0 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000018] ? report_bug+0x24a/0x290 [ +0.000022] ? handle_bug+0x46/0x90 [ +0.000015] ? exc_invalid_op+0x19/0x50 [ +0.000016] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000017] ? kasan_save_stack+0x26/0x50 [ +0.000017] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x327/0x340 [ +0.000019] ? mmu_interval_notifier_remove+0x23b/0x340 [ +0.000020] ? __pfx_mmu_interval_notifier_remove+0x10/0x10 [ +0.000017] ? kasan_save_alloc_info+0x1e/0x30 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __kasan_kmalloc+0xb1/0xc0 [ +0.000018] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? __kasan_check_read+0x11/0x20 [ +0.000020] amdgpu_hmm_unregister+0x34/0x50 [amdgpu] [ +0.004695] amdgpu_gem_object_free+0x66/0xa0 [amdgpu] [ +0.004534] ? __pfx_amdgpu_gem_object_free+0x10/0x10 [amdgpu] [ +0.004291] ? do_syscall_64+0x5f/0xe0 [ +0.000023] ? srso_return_thunk+0x5/0x5f [ +0.000017] drm_gem_object_free+0x3b/0x50 [drm] [ +0.000489] amdgpu_gem_userptr_ioctl+0x306/0x500 [amdgpu] [ +0.004295] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004270] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __this_cpu_preempt_check+0x13/0x20 [ +0.000015] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? sysvec_apic_timer_interrupt+0x57/0xc0 [ +0.000020] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ +0.000022] ? drm_ioctl_kernel+0x17b/0x1f0 [drm] [ +0.000496] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004272] ? drm_ioctl_kernel+0x190/0x1f0 [drm] [ +0.000492] drm_ioctl_kernel+0x140/0x1f0 [drm] [ +0.000497] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004297] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm] [ +0.000489] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000016] drm_ioctl+0x3da/0x730 [drm] [ +0.000475] ? __pfx_amdgpu_gem_userptr_ioctl+0x10/0x10 [amdgpu] [ +0.004293] ? __pfx_drm_ioctl+0x10/0x10 [drm] [ +0.000506] ? __pfx_rpm_resume+0x10/0x10 [ +0.000016] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? __kasan_check_write+0x14/0x20 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? _raw_spin_lock_irqsave+0x99/0x100 [ +0.000015] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ +0.000014] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? srso_return_thunk+0x5/0x5f [ +0.000011] ? preempt_count_sub+0x18/0xc0 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000010] ? _raw_spin_unlock_irqrestore+0x27/0x50 [ +0.000019] amdgpu_drm_ioctl+0x7e/0xe0 [amdgpu] [ +0.004272] __x64_sys_ioctl+0xcd/0x110 [ +0.000020] do_syscall_64+0x5f/0xe0 [ +0.000021] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ +0.000015] RIP: 0033:0x7ff9ed31a94f [ +0.000012] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00 [ +0.000013] RSP: 002b:00007fff25f66790 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [ +0.000016] RAX: ffffffffffffffda RBX: 000055b3f7e133e0 RCX: 00007ff9ed31a94f [ +0.000012] RDX: 000055b3f7e133e0 RSI: 00000000c1186451 RDI: 0000000000000003 [ +0.000010] RBP: 00000000c1186451 R08: 0000000000000000 R09: 0000000000000000 [ +0.000009] R10: 0000000000000008 R11: 0000000000000246 R12: 00007fff25f66ca8 [ +0.000009] R13: 0000000000000003 R14: 000055b3f7021ba8 R15: 00007ff9ed7af040 [ +0.000024] </TASK> [ +0.000007] ---[ end trace 0000000000000000 ]--- v2: Consolidate any error handling into amdgpu_hmm_register which applied to kfd_bo also. (Christian) v3: Improve syntax and comment (Christian) Cc: Christian Koenig <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Joonkyo Jung <[email protected]> Cc: Dokyung Song <[email protected]> Cc: <[email protected]> Cc: <[email protected]> Signed-off-by: Vitaly Prosyak <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amdgpu: Handle duplicate BOs during process restoreMukul Joshi1-4/+10
In certain situations, some apps can import a BO multiple times (through IPC for example). To restore such processes successfully, we need to tell drm to ignore duplicate BOs. While at it, also add additional logging to prevent silent failures when process restore fails. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-20drm/amd/display: Use freesync when `DRM_EDID_FEATURE_CONTINUOUS_FREQ` foundMario Limonciello1-8/+14
The monitor shipped with the Framework 16 supports VRR [1], but it's not being advertised. This is because the detailed timing block doesn't contain `EDID_DETAIL_MONITOR_RANGE` which amdgpu looks for to find min and max frequencies. This check however is superfluous for this case because update_display_info() calls drm_get_monitor_range() to get these ranges already. So if the `DRM_EDID_FEATURE_CONTINUOUS_FREQ` EDID feature is found then turn on freesync without extra checks. v2: squash in fix from Harry Closes: https://www.reddit.com/r/framework/comments/1b4y2i5/no_variable_refresh_rate_on_the_framework_16_on/ Closes: https://www.reddit.com/r/framework/comments/1b6vzcy/framework_16_variable_refresh_rate/ Closes: https://community.frame.work/t/resolved-no-vrr-freesync-with-amd-version/42338 Link: https://gist.github.com/superm1/e8fbacfa4d0f53150231d3a3e0a13faf Signed-off-by: Mario Limonciello <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-03-11Merge tag 'amd-drm-next-6.9-2024-03-08-1' of ↵Dave Airlie74-791/+122906
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.9-2024-03-08-1: amdgpu: - DCN 3.5.1 support - Fixes for IOMMUv2 removal - UAF fix - Misc small fixes and cleanups - SR-IOV fixes - MCBP cleanup - devcoredump update - NBIF 6.3.1 support - VPE 6.1.1 support amdkfd: - Misc fixes and cleanups - GFX10.1 trap fixes Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-08Merge tag 'drm-msm-next-2024-03-07' of ↵Dave Airlie23-291/+736
https://gitlab.freedesktop.org/drm/msm into drm-next Late updates for v6.9, the main part is CDM (YUV over DP) which was waiting for drm-misc-next-2024-02-29. DPU: - Add support for YUV420 over DP - Patchset to ease debugging of vblank timeouts - Small cleanup Signed-off-by: Dave Airlie <[email protected]> From: Rob Clark <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvedk6OCOZ-NNtGf_pNiGuK9uvWj1MCDZLX9Jo2nHS=Zg@mail.gmail.com
2024-03-08Merge tag 'drm-etnaviv-next-2024-03-07' of ↵Dave Airlie9-44/+163
https://git.pengutronix.de/git/lst/linux into drm-next - various code cleanups - enhancements for NPU and MRT support Signed-off-by: Dave Airlie <[email protected]> From: Lucas Stach <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-08Merge tag 'drm-xe-next-fixes-2024-03-04' of ↵Dave Airlie7-33/+13
ssh://gitlab.freedesktop.org/drm/xe/kernel into drm-next Driver Changes: - Fix kunit link failure with built-in xe - Fix one more 32-bit build failure with ARM compiler - Fix initialization order of topology struct - Cleanup unused fields in struct xe_vm - Fix xe_vm leak when handling page fault on a VM not in fault mode - Drop use of "grouped target" feature in Makefile since that's only available in make >= 4.3 Signed-off-by: Dave Airlie <[email protected]> From: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/kaypobelrl7u7rtnu6hg5czs3vptbhs4rp24vnwuo2ajoxysto@l5u7377hz4es
2024-03-08Merge tag 'drm-misc-next-fixes-2024-03-07' of ↵Dave Airlie5-30/+5
https://anongit.freedesktop.org/git/drm/drm-misc into drm-next Short summary of fixes pull: - i915: Fix applying placement flags - fbdev: Fix build on PowerMacs after header cleanup Signed-off-by: Dave Airlie <[email protected]> From: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]