aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)AuthorFilesLines
2020-08-04drm/amdgpu: introduce a new parameter to configure how many KCQ we want(v5)Monk Liu7-73/+76
what: the MQD's save and restore of KCQ (kernel compute queue) cost lots of clocks during world switch which impacts a lot to multi-VF performance how: introduce a paramter to control the number of KCQ to avoid performance drop if there is no kernel compute queue needed notes: this paramter only affects gfx 8/9/10 v2: refine namings v3: choose queues for each ring to that try best to cross pipes evenly. v4: fix indentation some cleanupsin the gfx_compute_queue_acquire() v5: further fix on indentations more cleanupsin gfx_compute_queue_acquire() TODO: in the future we will let hypervisor driver to set this paramter automatically thus no need for user to configure it through modprobe in virtual machine Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: update eeprom once specifying one bigger threshold(v3)Guchun Chen1-2/+28
During driver's probe, when it hits bad gpu tag in eeprom i2c init calling(the tag was set when reported bad page reaches bad page threshold in last driver's working loop), there are some strategys to deal with the cases: 1. when the module parameter amdgpu_bad_page_threshold = 0, that means page retirement feature is disabled, so just resetting the eeprom is fine. 2. When amdgpu_bad_page_threshold is not 0, and moreover, user sets one bigger valid data in order to make current boot up succeeds, correct eeprom header tag and do not break booting. 3. For other cases, driver's probe will be broken. v2: Just update eeprom header tag instead of resetting the whole table header when user sets one bigger threshold data. v3: Use dev_info/dev_err to print PCI device information, which helps in mGPU case. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: disable page reservation when amdgpu_bad_page_threshold = 0Guchun Chen2-4/+6
When amdgpu_bad_page_threshold = 0, bad page reservation stuffs are skipped in either UMC ECC irq or page retirement calling of sync flood isr. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: decouple sysfs creating of bad page nodeGuchun Chen1-21/+42
Bad page information should not be exposed by sysfs when bad page retirement is disabled, so decouple it from ras sysfs group creating, and add one guard before creating. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: add one definition for RAS's sysfs/debugfs name(v2)Guchun Chen1-5/+7
Add one definition for the RAS module's FS name. It's used in both debugfs and sysfs cases. v2: Use static variable instead of macro definition. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: restore ras flags when user resets eeprom(v2)Guchun Chen1-3/+10
RAS flags needs to be cleaned as well when user requires one clean eeprom. v2: RAS flags shall be restored after eeprom reset succeeds. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: break GPU recovery once it's in bad state(v4)Guchun Chen5-3/+79
When GPU executes recovery and retriving bad GPU tag from external eerpom device, the recovery will be broken and error message is printed as well for user's awareness. v2: Refine warning message in threshold reaching case, and fix spelling typo. v3: Fix explicit calling of bad gpu. v4: Rename function names. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: schedule ras recovery when reaching bad page threshold(v2)Guchun Chen1-1/+36
Once the bad page saved to eeprom reaches the configured threshold, ras recovery will be issued to notify user. v2: Fix spelling typo. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: skip bad page reservation once issuing from eeprom writeGuchun Chen2-5/+11
Once the ras recovery is issued from eeprom write itself, bad page reservation should be ignored, otherwise, recursive calling of writting to eeprom would happen. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: break driver init process when it's bad GPU(v5)Guchun Chen4-7/+36
When retrieving bad gpu tag from eeprom, GPU init should fail as the GPU needs to be retired for further check. v2: Fix spelling typo, correct the condition to detect bad gpu tag and refine error message. v3: Refine function argument name. v4: Fix missing check of returning value of i2c initialization error case. v5: Use dev_err to print PCI information in dmesg instead of DRM_ERROR. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: add bad gpu tag definitionGuchun Chen1-0/+3
This tag will be hired for bad gpu detection in eeprom's access. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: validate bad page threshold in ras(v3)Guchun Chen4-0/+58
Bad page threshold value should be valid in the range between -1 and max records length of eeprom. It could determine when saved bad pages exceed threshold value, and proceed corresponding actions. v2: When using the default typical value, it should be min value between typical value and eeprom max records length. v3: drop the case of setting bad_page_cnt_threshold to be 0xFFFFFFFF, as it confuses user. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdgpu: add bad page count threshold in module parameter(v3)Guchun Chen2-0/+12
bad_page_threshold could be configured to enable/disable the associated bad page retirement feature in RAS. When it's -1, ras will use typical bad page failure value to handle bad page retirement. When it's 0, disable bad page retirement, and no bad page will be recorded and saved. For other valid value, driver will use this manual value as the threshold value of totoal bad pages. v2: correct documentation of this parameter. v3: remove confused statement in documentation. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-04drm/amdkfd: Replace bitmask with event idx in SMI event msgMukul Joshi1-11/+13
Event bitmask is a 64-bit mask with only 1 bit set. Sending this event bitmask in KFD SMI event message is both wasteful of memory and potentially limiting to only 64 events. Instead send event index in SMI event message. Please note this change does not break the ABI for the two event types defined so far. The new index is identical to the mask used before. Signed-off-by: Mukul Joshi <[email protected]> Suggested-by: Felix Kuehling <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-31drm/ttm: remove the init_mem_type callbackChristian König1-7/+0
It is a very strange concept to call a function which just calls back the caller for the functions parameters. Signed-off-by: Christian König <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/382085/
2020-07-31drm/amdgpu: stop implementing init_mem_typeChristian König1-53/+43
Instead just initialize the memory type parameters before calling ttm_bo_init_mm. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/382084/
2020-07-31drm/ttm: remove TTM_MEMTYPE_FLAG_FIXED v2Christian König1-3/+1
Instead use a boolean field in the memory manager structure. Also invert the meaning of the field since the use of a TT structure is the special case here. v2: cleanup zero init. Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/382079/
2020-07-31drm/ttm: initialize the system domain with defaults v2Christian König1-3/+0
Instead of repeating that in each driver. v2: keep the caching limitation for VMWGFX for now. Signed-off-by: Christian König <[email protected]> Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Thomas Zimmermann <[email protected]> Link: https://patchwork.freedesktop.org/patch/382078/
2020-07-30Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers"Alex Deucher1-3/+6
This regressed some working configurations so revert it. Will fix this properly for 5.9 and backport then. This reverts commit 38e0c89a19fd13f28d2b4721035160a3e66e270b. Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2020-07-30drm/amdgpu: enable GFXOFF for navy_flounderJiansong Chen2-0/+2
Enable GFXOFF for navy_flounder. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Likun Gao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm amdgpu: Skip tmr load for SRIOVLiu ChengZhe1-6/+29
1. For Navi12, CHIP_SIENNA_CICHLID, skip tmr load operation; 2. Check pointer before release firmware. v2: use CHIP_SIENNA_CICHLID instead v3: remove local "bool ret"; fix grammer issue v4: use my name instead of "root" v5: fix grammer issue and indent issue Signed-off-by: Liu ChengZhe <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu: fix PSP autoload twice in FLRLiu ChengZhe1-1/+3
Assigning false to block->status.hw overwrites PSP's previous hardware status, which causes the PSP to Resume operation after hardware init. Remove this assignment and let the PSP execute Resume operation when it is told to. v2: Remove the braces. v3: Modify the description. Signed-off-by: Liu ChengZhe <[email protected]> Reviewed-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu/dc: Stop dma_resv_lock inversion in commit_tailDaniel Vetter1-13/+6
Trying to grab dma_resv_lock while in commit_tail before we've done all the code that leads to the eventual signalling of the vblank event (which can be a dma_fence) is deadlock-y. Don't do that. Here the solution is easy because just grabbing locks to read something races anyway. We don't need to bother, READ_ONCE is equivalent. And avoids the locking issue. v2: Also take into account tmz_surface boolean, plus just delete the old code. Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Chris Wilson <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/powerplay: Remove unneeded cast from memory allocationLi Heng1-1/+1
Remove casting the values returned by memory allocation function. Coccinelle emits WARNING: ./drivers/gpu/drm/amd/powerplay/hwmgr/vega20_processpptables.c:893:37-46: WARNING: casting value returned by memory allocation function to (PPTable_t *) is useless. Signed-off-by: Li Heng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()Peilin Ye1-1/+2
Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: [email protected] Fixes: c193fa91b918 ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Peilin Ye <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Clear dm_state for fast updatesMazin Rezk1-9/+27
This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens: 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue. 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue. 3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1. 4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context. Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates. This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383 Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates") Reported-by: Duncan <[email protected]> Signed-off-by: Mazin Rezk <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu: update GC golden setting for navy_flounderJiansong Chen1-2/+2
Update GC golden setting for navy_flounder. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/powerplay: update driver if version for navy_flounderJiansong Chen1-1/+1
It's in accordance with pmfw 65.5.0 for navy_flounder. Signed-off-by: Jiansong Chen <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Reviewed-by: Kenneth Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu: enable umc 8.7 functions in gmc v10John Clements2-2/+50
add support for umc 8.7 initialization add umc 8.7 source to makefile Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amdgpu: skip crit temperature values on APU (v2)Huang Rui1-0/+6
It doesn't expose PPTable descriptor on APU platform. So max/min temperature values cannot be got from APU platform. v2: Stoney needs to skip crit temperature as well. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Kevin Wang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Fix DP Compliance tests 4.3.2.1 and 4.3.2.2Aric Cyr1-7/+6
[Why] Test expects that we also read HPD_IRQ_VECTOR when checking for symbol loss as well lane status. [How] Read bytes 0x200-0x205 instead of just 0x202-0x205 Signed-off-by: Aric Cyr <[email protected]> Reviewed-by: Jun Lei <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: 3.2.96Aric Cyr1-1/+1
Signed-off-by: Aric Cyr <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: DSC Slice width debugfs write entryEryk Brol4-0/+104
[Why] We need to be able to specify slice width for DSC on aconnector [How] Getting slice width parameter from debugfs entry, if it is a valid the value is set in connector's dsc preffered settings structure. Which then overwrites dsc_cfg structure's parameters if DSC is decided to be enabled. Works for both SST and MST. Signed-off-by: Eryk Brol <[email protected]> Signed-off-by: Mikita Lipski <[email protected]> Reviewed-by: Mikita Lipski <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Use hw lock mgrWyatt Wood1-1/+1
[Why] Feature requires synchronization of dig, pipe, and cursor locking between driver and fw. [How] Set flag to force psr to use hw lock mgr. Signed-off-by: Wyatt Wood <[email protected]> Reviewed-by: Anthony Koo <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: dchubbub p-state warning during surface planes switchhersen wu1-2/+67
[Why] ramp_up_dispclk_with_dpp is to change dispclk, dppclk and dprefclk according to bandwidth requirement. call stack: rv1_update_clocks --> update_clocks --> dcn10_prepare_bandwidth / dcn10_optimize_bandwidth --> prepare_bandwidth / optimize_bandwidth. before change dcn hw, prepare_bandwidth will be called first to allow enough clock, watermark for change, after end of dcn hw change, optimize_bandwidth is executed to lower clock to save power for new dcn hw settings. below is sequence of commit_planes_for_stream: step 1: prepare_bandwidth - raise clock to have enough bandwidth step 2: lock_doublebuffer_enable step 3: pipe_control_lock(true) - make dchubp register change will not take effect right way step 4: apply_ctx_for_surface - program dchubp step 5: pipe_control_lock(false) - dchubp register change take effect step 6: optimize_bandwidth --> dc_post_update_surfaces_to_stream for full_date, optimize clock to save power at end of step 1, dcn clocks (dprefclk, dispclk, dppclk) may be changed for new dchubp configuration. but real dcn hub dchubps are still running with old configuration until end of step 5. this need clocks settings at step 1 should not less than that before step 1. this is checked by two conditions: 1. if (should_set_clock(safe_to_lower , new_clocks->dispclk_khz, clk_mgr_base->clks.dispclk_khz) || new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz) 2. request_dpp_div = new_clocks->dispclk_khz > new_clocks->dppclk_khz the second condition is based on new dchubp configuration. dppclk for new dchubp may be different from dppclk before step 1. for example, before step 1, dchubps are as below: pipe 0: recout=(0,40,1920,980) viewport=(0,0,1920,979) pipe 1: recout=(0,0,1920,1080) viewport=(0,0,1920,1080) for dppclk for pipe0 need dppclk = dispclk new dchubp pipe split configuration: pipe 0: recout=(0,0,960,1080) viewport=(0,0,960,1080) pipe 1: recout=(960,0,960,1080) viewport=(960,0,960,1080) dppclk only needs dppclk = dispclk /2. dispclk, dppclk are not lock by otg master lock. they take effect after step 1. during this transition, dispclk are the same, but dppclk is changed to half of previous clock for old dchubp configuration between step 1 and step 6. This may cause p-state warning intermittently. [How] for new_clocks->dispclk_khz == clk_mgr_base->clks.dispclk_khz, we need make sure dppclk are not changed to less between step 1 and 6. for new_clocks->dispclk_khz > clk_mgr_base->clks.dispclk_khz, new display clock is raised, but we do not know ratio of new_clocks->dispclk_khz and clk_mgr_base->clks.dispclk_khz, new_clocks->dispclk_khz /2 does not guarantee equal or higher than old dppclk. we could ignore power saving different between dppclk = displck and dppclk = dispclk / 2 between step 1 and step 6. as long as safe_to_lower = false, set dpclk = dispclk to simplify condition check. CC: Stable <[email protected]> Signed-off-by: Hersen Wu <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: [FW Promotion] Release 0.0.26Anthony Koo1-2/+2
Signed-off-by: Anthony Koo <[email protected]> Reviewed-by: Aric Cyr <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: DSC Clock enable debugfs write entryEryk Brol4-3/+107
[Why] Need a mechanism to force enable DSC on any connector [How] Debugfs entry overwrites newly added connector's dsc preffered settings structure and sets dsc_clock_en flag on it. During the attomic commit, depending if connector is SST or MST, we will enable DSC manually by overwriting stream's DSC flag. Signed-off-by: Eryk Brol <[email protected]> Signed-off-by: Mikita Lipski <[email protected]> Reviewed-by: Mikita Lipski <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Allow asic specific FSFT timing optimizationReza Amini8-13/+57
[Why] Each asic can optimize best based on its capabilities [How] Optimizing timing for a new pixel clock Signed-off-by: Reza Amini <[email protected]> Reviewed-by: Anthony Koo <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Disable idle optimizations before programming DCNJun Lei1-2/+16
[Why] Programming DCN is explicitly forbidden during idle optimzations allowed state. Existing implemenation relies on OS/DM, which is not robust. Instead DC should sequence this. Note that DC will not re-enter idle optimized state on its own, it is only responsible for catching out of sequence calls. It is still DM responsibility to sequence appropriate for optimized power, but this change removes the requirement for DM to cover the .1% case. [How] - elevate updates during idle optimized state to full updates - disable idle power optimizations prior to programming Signed-off-by: Jun Lei <[email protected]> Reviewed-by: Jun Lei <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30drm/amd/display: Fix dmesg warning from setting abm levelStylon Wang1-0/+23
[Why] Setting abm level does not correctly update CRTC state. As a result no surface update is added to dc stream state and triggers warning. [How] Correctly update CRTC state when setting abm level property. CC: Stable <[email protected]> Signed-off-by: Stylon Wang <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Acked-by: Eryk Brol <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-30Revert "drm/amdgpu: Fix NULL dereference in dpm sysfs handlers"Alex Deucher1-3/+6
This regressed some working configurations so revert it. Will fix this properly for 5.9 and backport then. This reverts commit 38e0c89a19fd13f28d2b4721035160a3e66e270b. Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2020-07-30drm/amd/display: Clear dm_state for fast updatesMazin Rezk1-9/+27
This patch fixes a race condition that causes a use-after-free during amdgpu_dm_atomic_commit_tail. This can occur when 2 non-blocking commits are requested and the second one finishes before the first. Essentially, this bug occurs when the following sequence of events happens: 1. Non-blocking commit #1 is requested w/ a new dm_state #1 and is deferred to the workqueue. 2. Non-blocking commit #2 is requested w/ a new dm_state #2 and is deferred to the workqueue. 3. Commit #2 starts before commit #1, dm_state #1 is used in the commit_tail and commit #2 completes, freeing dm_state #1. 4. Commit #1 starts after commit #2 completes, uses the freed dm_state 1 and dereferences a freelist pointer while setting the context. Since this bug has only been spotted with fast commits, this patch fixes the bug by clearing the dm_state instead of using the old dc_state for fast updates. In addition, since dm_state is only used for its dc_state and amdgpu_dm_atomic_commit_tail will retain the dc_state if none is found, removing the dm_state should not have any consequences in fast updates. This use-after-free bug has existed for a while now, but only caused a noticeable issue starting from 5.7-rc1 due to 3202fa62f ("slub: relocate freelist pointer to middle of object") moving the freelist pointer from dm_state->base (which was unused) to dm_state->context (which is dereferenced). Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207383 Fixes: bd200d190f45 ("drm/amd/display: Don't replace the dc_state for fast updates") Reported-by: Duncan <[email protected]> Signed-off-by: Mazin Rezk <[email protected]> Reviewed-by: Nicholas Kazlauskas <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2020-07-30drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl()Peilin Ye1-1/+2
Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: [email protected] Fixes: c193fa91b918 ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Peilin Ye <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-29dma-buf: Use sequence counter with associated wound/wait mutexAhmed S. Darwish1-2/+0
A sequence counter write side critical section must be protected by some form of locking to serialize writers. If the serialization primitive is not disabling preemption implicitly, preemption has to be explicitly disabled before entering the sequence counter write side critical section. The dma-buf reservation subsystem uses plain sequence counters to manage updates to reservations. Writer serialization is accomplished through a wound/wait mutex. Acquiring a wound/wait mutex does not disable preemption, so this needs to be done manually before and after the write side critical section. Use the newly-added seqcount_ww_mutex_t instead: - It associates the ww_mutex with the sequence count, which enables lockdep to validate that the write side critical section is properly serialized. - It removes the need to explicitly add preempt_disable/enable() around the write side critical section because the write_begin/end() functions for this new data type automatically do this. If lockdep is disabled this ww_mutex lock association is compiled out and has neither storage size nor runtime overhead. Signed-off-by: Ahmed S. Darwish <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Daniel Vetter <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-07-29drm/ttm: make ttm_tt unbind function return void.Dave Airlie1-3/+2
The return value just led to BUG_ON, I think if a driver wants to BUG_ON here it can do it itself. (don't BUG_ON). Reviewed-by: Christian König <[email protected]> Signed-off-by: Dave Airlie <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-07-28drm/amdgpu/si: initial support for GPU resetAlex Deucher1-2/+90
Ported from radeon. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-28drm/amd/display: enable SI support in the Kconfig (v2)Mauro Rossi1-0/+8
[Why] All DCE6 specific code changes are guarded by CONFIG_DRM_AMD_DC_SI Kconfig option [How] (v1) CONFIG_DRM_AMD_DC_SI configuration option is added, default setting is disabled (v2) Hainan is not supported, description updated accordingly Tested with HD7750 (Cape Verde) and HD7950 (Tahiti) Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Mauro Rossi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-28drm/amdgpu: enable DC support for SI parts (v2)Mauro Rossi2-0/+16
[Why] amdgpu_device.c requires changes for SI chipsets support si.c require changes for Display Manager IP block enabling [How] amdgpu_device.c: add SI families in amdgpu_device_asic_has_dc_support() si.c: changes in si_set_ip_blocks() for Display Manager IP blocks enablement (v1) NOTE: As per Kaveri and older amdgpu.dc=1 kernel cmdline is required (v2) fix for bc011f9350 ("drm/amdgpu: Change SI/CI gfx/sdma/smu init sequence") remove CHIP_HAINAN support since it does not have physical DCE6 module Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Mauro Rossi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-07-28drm/amdgpu: Change type of module param `ppfeaturemask` to hexintPaul Menzel1-2/+2
The newly added hexint helper is more convenient for bitmasks. Before: $ more /sys/module/amdgpu/parameters/ppfeaturemask 4294950911 After: $ more /sys/module/amdgpu/parameters/ppfeaturemask 0xffffbfff Cc: [email protected] Cc: [email protected] Signed-off-by: Paul Menzel <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/374724/
2020-07-27drm/amd/display: create plane rotation property for Bonaire and laterMauro Rossi1-2/+3
[Why] DCE6 chipsets do not support HW rotation [How] rotation property is created for Bonaire and later Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Mauro Rossi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>