path: root/drivers/gpu/drm/amd/amdgpu
Age | Commit message | Author | Files | Lines
2021-08-16drm/amd/amdgpu embed hw_fence into amdgpu_jobJack Zhang9-37/+119
Why: Previously the hw fence was allocated separately from the job, which caused historical lifetime issues and corner cases. Ideally the fence manages the lifetime of both the job and the fence, which also simplifies the design of the gpu-scheduler. How: Embed hw_fence into amdgpu_job. 1. Normal job submission is covered by this method. 2. For ib_test, and for submissions without a parent job, keep the legacy way of creating a hw fence separately. v2: use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is embedded in a job. v3: remove redundant variable ring in amdgpu_job. v4: add tdr sequence support for this feature. Add a job_run_counter to indicate whether this job is a resubmit job. v5: add missing handling in amdgpu_fence_enable_signaling. Signed-off-by: Jingwen Chen <[email protected]> Signed-off-by: Jack Zhang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Monk Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
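The embedding idea can be sketched in userspace C. This is only an illustration under assumptions: the toy_* types, the toy flag bit and the helper functions are hypothetical stand-ins, not the amdgpu or dma_fence definitions. It shows a fence stored inside its job, a flag bit marking it as embedded, and a container_of-style lookup that yields NULL for a legacy standalone fence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real driver types are far richer. */
#define TOY_FENCE_FLAG_EMBED_IN_JOB_BIT 0u

struct toy_fence {
    unsigned long flags;
    unsigned int seqno;
};

struct toy_job {
    struct toy_fence hw_fence;     /* embedded: shares the job's allocation */
    unsigned int job_run_counter;  /* a value above 0 would mean a resubmit */
};

/* Same idea as the kernel's container_of(): member pointer back to parent. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Normal submission: mark the job's fence as embedded. */
static void toy_job_init(struct toy_job *job, unsigned int seqno)
{
    assert(job != NULL);
    job->hw_fence.flags = 1ul << TOY_FENCE_FLAG_EMBED_IN_JOB_BIT;
    job->hw_fence.seqno = seqno;
    job->job_run_counter = 0;
}

/* Legacy path (for example an IB test): a fence with no parent job. */
static void toy_fence_init_standalone(struct toy_fence *fence, unsigned int seqno)
{
    assert(fence != NULL);
    fence->flags = 0;
    fence->seqno = seqno;
}

/* Return the owning job, or NULL when the fence was created on its own. */
static struct toy_job *toy_fence_to_job(struct toy_fence *fence)
{
    assert(fence != NULL);
    if (!(fence->flags & (1ul << TOY_FENCE_FLAG_EMBED_IN_JOB_BIT)))
        return NULL;
    return toy_container_of(fence, struct toy_job, hw_fence);
}
```

Checking the flag before the pointer arithmetic matters: applying the container_of step to a standalone fence would produce a pointer to memory that is not a job at all.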
2021-08-16Merge tag 'drm-misc-next-2021-08-12' of ↵Dave Airlie3-9/+15
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for v5.15:
UAPI Changes:
Cross-subsystem Changes:
- Add lockdep_assert(once) helpers.
Core Changes:
- Add lockdep assert to drm_is_current_master_locked.
- Fix typos in dma-buf documentation.
- Mark drm irq midlayer as legacy only.
- Fix GPF in udmabuf_create.
- Rename member to correct value in drm_edid.h
Driver Changes:
- Build fix to make nouveau build with NOUVEAU_BACKLIGHT.
- Add MI101AIT-ICP1, LTTD800480070-L6WWH-RT panels.
- Assorted fixes to bridge/it66121, anx7625.
- Add custom crtc_state to simple helpers, and use it to convert pll handling in mgag200 to atomic.
- Convert drivers to use offset-adjusted framebuffer bo mappings.
- Assorted small fixes and fix for a use-after-free in vmwgfx.
- Convert remaining callers of non-legacy drivers to use linux irqs directly.
- Small cleanup in ingenic.
- Small fixes to virtio and ti-sn65dsi86.
Signed-off-by: Dave Airlie <[email protected]> From: Maarten Lankhorst <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-08-12gpu: Bulk conversion to generic_handle_domain_irq()Marc Zyngier1-1/+1
Wherever possible, replace constructs that match either generic_handle_irq(irq_find_mapping()) or generic_handle_irq(irq_linear_revmap()) to a single call to generic_handle_domain_irq(). Signed-off-by: Marc Zyngier <[email protected]>
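The shape of this conversion can be modelled in plain C. The sketch below is a toy under assumptions: toy_irq_domain, toy_find_mapping(), toy_handle_irq() and toy_handle_domain_irq() are hypothetical userspace stand-ins and not the kernel IRQ API. It shows a two-step "look up the mapping, then dispatch" pattern folded into one helper that takes the domain and the hardware IRQ number.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_DOMAIN_SIZE 4

/* Toy domain: hwirq index -> Linux irq number; 0 means "unmapped". */
struct toy_irq_domain {
    unsigned int map[TOY_DOMAIN_SIZE];
};

static unsigned int toy_handled_irq; /* records the last irq dispatched */

/* Step 1 of the old pattern: translate a hardware irq into a Linux irq. */
static unsigned int toy_find_mapping(const struct toy_irq_domain *d,
                                     unsigned int hwirq)
{
    return hwirq < TOY_DOMAIN_SIZE ? d->map[hwirq] : 0;
}

/* Step 2 of the old pattern: dispatch, rejecting the unmapped value. */
static int toy_handle_irq(unsigned int irq)
{
    if (irq == 0)
        return -EINVAL;
    toy_handled_irq = irq;
    return 0;
}

/* The combined helper: callers pass the domain and hwirq in one call. */
static int toy_handle_domain_irq(const struct toy_irq_domain *d,
                                 unsigned int hwirq)
{
    assert(d != NULL);
    return toy_handle_irq(toy_find_mapping(d, hwirq));
}
```

A call site that used to read toy_handle_irq(toy_find_mapping(d, hwirq)) becomes toy_handle_domain_irq(d, hwirq), with the unmapped case reported through the return value.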
2021-08-11gpu: drm: amd: amdgpu: amdgpu_i2c: fix possible uninitialized-variable ↵Tuo Li1-1/+1
access in amdgpu_i2c_router_select_ddc_port() The variable val is declared without initialization, and its address is passed to amdgpu_i2c_get_byte(). In this function, the value of val is accessed in: DRM_DEBUG("i2c 0x%02x 0x%02x read failed\n", addr, *val); Also, when amdgpu_i2c_get_byte() returns, val may remain uninitialized, but it is accessed in: val &= ~amdgpu_connector->router.ddc_mux_control_pin; To fix this possible uninitialized-variable access, initialize val to 0 in amdgpu_i2c_router_select_ddc_port(). Reported-by: TOTE Robot <[email protected]> Signed-off-by: Tuo Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
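A minimal userspace sketch of the bug class and the fix follows. It assumes a hypothetical reader, toy_i2c_get_byte(), that leaves its output untouched on failure; the names are illustrative and not the driver's. The caller initializes val to 0 so the later mask operation always works on a defined value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader: on failure it returns -1 and does not write *val. */
static int toy_i2c_get_byte(int bus_ok, uint8_t *val)
{
    assert(val != NULL);
    if (!bus_ok)
        return -1;      /* *val is left as the caller set it */
    *val = 0xff;        /* pretend the device answered 0xff */
    return 0;
}

/* Clear the mux-control bits in whatever the read produced. */
static uint8_t toy_select_port(int bus_ok, uint8_t mux_control_pin)
{
    uint8_t val = 0;    /* the fix: defined even when the read fails */

    (void)toy_i2c_get_byte(bus_ok, &val);
    val &= (uint8_t)~mux_control_pin;
    return val;
}
```

Without the initializer, the failing-read path would apply the mask to an indeterminate value, which is the access the commit message describes.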
2021-08-11drm/amdkfd: CWSR with software schedulerMukul Joshi3-1/+94
This patch adds support to program trap handler settings when loading driver with software scheduler (sched_policy=2). Signed-off-by: Mukul Joshi <[email protected]> Suggested-by: Jay Cornwall <[email protected]> Reviewed-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-10drm/amdgpu: Convert to Linux IRQ interfacesThomas Zimmermann3-9/+15
Drop the DRM IRQ midlayer in favor of Linux IRQ interfaces. DRM's IRQ helpers are mostly useful for UMS drivers. Modern KMS drivers don't benefit from using them. DRM IRQ callbacks are now being called directly or inlined. The interrupt number returned by pci_msi_vector() is now stored in struct amdgpu_irq. Calls to pci_msi_vector() can fail and return a negative errno code. Abort initialization in this case. The DRM IRQ midlayer does not handle this correctly. Signed-off-by: Thomas Zimmermann <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
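The error handling described here can be sketched as follows. This is a userspace illustration under assumptions: toy_irq_vector() is a hypothetical stand-in for the vector lookup the commit message mentions, and struct toy_irq is not struct amdgpu_irq. The point is that a negative return value is treated as an errno and stops initialization before anything is stored.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_irq {
    int irq;        /* Linux IRQ number, stored once at install time */
    int installed;
};

/* Hypothetical lookup: a negative value is an errno, otherwise an irq. */
static int toy_irq_vector(int simulate_failure)
{
    return simulate_failure ? -EINVAL : 42;
}

static int toy_irq_install(struct toy_irq *tirq, int simulate_failure)
{
    int r;

    assert(tirq != NULL);
    r = toy_irq_vector(simulate_failure);
    if (r < 0)
        return r;           /* abort initialization, nothing stored */

    tirq->irq = r;          /* keep the number for later free/sync calls */
    tirq->installed = 1;
    return 0;
}
```

Storing the number in the driver's own structure means teardown code can use the same value instead of asking a midlayer for it.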
2021-08-10drm/amdgpu: handle VCN instances when harvesting (v2)Alex Deucher1-3/+9
There may be multiple instances and only one is harvested. v2: fix typo in commit message Fixes: 83a0b8639185 ("drm/amdgpu: add judgement when add ip blocks (v2)") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673 Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-10drm/amdgpu: handle VCN instances when harvesting (v2)Alex Deucher1-3/+9
There may be multiple instances and only one is harvested. v2: fix typo in commit message Fixes: 83a0b8639185 ("drm/amdgpu: add judgement when add ip blocks (v2)") Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1673 Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-09drm/amdgpu: Removed unnecessary if statementSergio Miguéns Iglesias1-3/+0
There was an "if" statement that did nothing so it was removed. Signed-off-by: Sergio Miguéns Iglesias <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-09drm/amdgpu: fix kernel-doc warnings on non-kernel-doc commentsRandy Dunlap1-3/+3
Don't use "begin kernel-doc notation" (/**) for comments that are not kernel-doc. This eliminates warnings reported by the 0day bot. drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:89: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * This shader is used to clear VGPRS and LDS, and also write the input drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:209: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * The below shaders are used to clear SGPRS, and also write the input drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c:301: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * This shader is used to clear the uninitiated sgprs after the above Fixes: 0e0036c7d13b ("drm/amdgpu: fix no full coverage issue for gprs initialization") Signed-off-by: Randy Dunlap <[email protected]> Reported-by: kernel test robot <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: Dennis Li <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]>
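The distinction the warnings point at can be shown with a small sketch. The table and function below are hypothetical and not taken from gfx_v9_4_2.c. An ordinary note uses a single-asterisk opener, while the double-asterisk opener is kept for a comment that follows the kernel-doc layout.

```c
#include <assert.h>

/*
 * Plain block comment: background notes about the data below.
 * It opens with one asterisk, so kernel-doc tooling skips it.
 */
static const unsigned int toy_shader_words[] = { 0x1u, 0x2u, 0x3u };

/**
 * toy_shader_word_count() - number of words in the toy shader table
 *
 * A comment opened with two asterisks is treated as kernel-doc and is
 * expected to follow the "name() - short description" layout.
 *
 * Return: the element count of toy_shader_words.
 */
static unsigned int toy_shader_word_count(void)
{
    return sizeof(toy_shader_words) / sizeof(toy_shader_words[0]);
}
```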
2021-08-09drm/amd/amdgpu: skip locking delayed work if not initialized.YuBiao Wang1-1/+2
When init fails in the early init stage, amdgpu_object has not been initialized, and neither have the ttm delayed queue functions. Signed-off-by: YuBiao Wang <[email protected]> Reviewed-by: Emily.Deng <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-09drm/amdgpu: Extend full access wait time in guestVictor Zhao1-4/+12
- Extend wait time and add retry, currently 6s * 2 times - Change timing algorithm Signed-off-by: Victor Zhao <[email protected]> Signed-off-by: Peng Ju Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-06drm/amdgpu: check for allocation failure in amdgpu_vkms_sw_init()Dan Carpenter1-0/+2
Check whether the kcalloc() fails and return -ENOMEM if it does. Fixes: 84ec374bd58036 ("drm/amdgpu: create amdgpu_vkms (v4)") Reviewed-by: Christian König <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
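The same check can be sketched in userspace C, with calloc() standing in for kcalloc(). The toy_* names and the injectable allocator are assumptions made so that the failure path can be exercised; they are not part of amdgpu_vkms.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_output { int id; };

struct toy_vkms {
    struct toy_output *outputs;
    size_t num_outputs;
};

/* Allocator is injectable so tests can force a failure; pass calloc normally. */
typedef void *(*toy_alloc_fn)(size_t n, size_t size);

static int toy_vkms_sw_init(struct toy_vkms *v, size_t n, toy_alloc_fn alloc)
{
    assert(v != NULL && alloc != NULL);

    v->outputs = alloc(n, sizeof(*v->outputs));
    if (!v->outputs)
        return -ENOMEM;     /* the added check: fail instead of using NULL */

    v->num_outputs = n;
    return 0;
}

/* Test helper that always reports an allocation failure. */
static void *toy_failing_alloc(size_t n, size_t size)
{
    (void)n;
    (void)size;
    return NULL;
}
```

Returning early keeps num_outputs at its previous value, so later code cannot iterate over an array that was never allocated.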
2021-08-06drm/amdgpu: don't enable baco on boco platforms in runpmAlex Deucher1-0/+2
If the platform uses BOCO, don't use BACO in runtime suspend. We could end up executing the BACO path if the platform supports both. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669 Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-06drm/amdgpu: set RAS EEPROM address from VBIOSJohn Clements3-0/+45
update to latest atombios fw table [Backport to 5.14 - Alex] Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1670 Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]>. Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-05drm/amdgpu: drop redundant null-pointer checks in amdgpu_ttm_tt_populate() ↵Tuo Li1-2/+2
and amdgpu_ttm_tt_unpopulate() The variable gtt in the functions amdgpu_ttm_tt_populate() and amdgpu_ttm_tt_unpopulate() is guaranteed to be not NULL in this context. Thus the null-pointer checks are redundant and can be dropped. Reported-by: TOTE Robot <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Tuo Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: don't enable baco on boco platforms in runpmAlex Deucher1-0/+2
If the platform uses BOCO, don't use BACO in runtime suspend. We could end up executing the BACO path if the platform supports both. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1669 Reviewed-by: Evan Quan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: Put MODE register in wave debug infoJoseph Greathouse5-0/+5
Add the MODE register into the per-wave debug information. This register holds state such as FP rounding and denorm modes, which exceptions are enabled, and active clamping modes. Signed-off-by: Joseph Greathouse <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: set RAS EEPROM address from VBIOSJohn Clements3-0/+58
update to latest atombios fw table Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]>. Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amd/amdgpu: Recovery vcn instance iterate.Peng Ju Zhou1-13/+20
The previous logic recorded the number of valid vcn instances in order to use them on SRIOV, which is hard to get right because vcn access is based on the index of the vcn instance. Instead, check whether each vcn instance is enabled before initializing it. Signed-off-by: Peng Ju Zhou <[email protected]> Reviewed-by: Monk Liu <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: added synchronization for psp cmd buf accessJohn Clements1-66/+139
resolved race condition accessing psp cmd submission memory Signed-off-by: John Clements <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: update PSP BL cmd IDsJohn Clements1-3/+3
resolved bug with incorrect PSP BL cmd IDs Signed-off-by: John Clements <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: add DID for beige gobyChengming Gui1-0/+7
Add device ids. Signed-off-by: Chengming Gui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amd/amdgpu: remove redundant host to psp cmd buf allocationsCandice Li1-161/+66
Signed-off-by: Candice Li <[email protected]> Reviewed-by: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: replace dce_virtual with amdgpu_vkms (v3)Ryan Taylor10-290/+228
Move dce_virtual into amdgpu_vkms and update all references to dce_virtual with amdgpu_vkms. v2: Removed more references to dce_virtual. v3: Restored display modes from previous implementation. Signed-off-by: Ryan Taylor <[email protected]> Reported-by: kernel test robot <[email protected]> Suggested-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: cleanup dce_virtualRyan Taylor1-565/+3
Remove obsolete functions and variables from dce_virtual. Signed-off-by: Ryan Taylor <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: create amdgpu_vkms (v4)Ryan Taylor7-11/+493
Modify the VKMS driver into an api that dce_virtual can use to create virtual displays that obey drm's atomic modesetting api. v2: Made local functions static. v3: Switched vkms_output kzalloc for kcalloc. Cleanup patches by moving display mode fixes to this patch. v4: Update atomic_check and atomic_update to comply with new kms api. Signed-off-by: Ryan Taylor <[email protected]> Reported-by: kernel test robot <[email protected]> Suggested-by: Alex Deucher <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05gpu/drm/amd: Remove duplicated include of drm_drv.hzhouchuangao1-2/+0
Duplicate include header file <drm/drm_drv.h> line 28: #include <drm/drm_drv.h> line 44: #include <drm/drm_drv.h> Signed-off-by: zhouchuangao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)Guchun Chen3-10/+11
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop scheduler in s3 test, otherwise, fence related failure will arrive after resume. To fix this and for a better clean up, move drm_sched_fini from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and should never be called in hw_fini. v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init, to keep sw_init and sw_fini paired. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668 Fixes: 8d35a2596164c1 ("drm/amdgpu: adjust fence driver enable sequence") Suggested-by: Christian König <[email protected]> Tested-by: Mike Lothian <[email protected]> Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
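The pairing argument can be illustrated with a toy lifecycle. This is a sketch under assumptions: struct toy_fence_drv and the four toy_* functions are hypothetical and only model which state each stage owns. Software state (the scheduler) is created in sw_init and destroyed in sw_fini, while hw_init and hw_fini touch hardware state only, so a suspend and resume cycle leaves the scheduler intact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_fence_drv {
    bool sched_ready;   /* software state: the scheduler exists */
    bool hw_enabled;    /* hardware state: fence interrupts are armed */
};

/* Driver load: create software state once. */
static void toy_sw_init(struct toy_fence_drv *d)
{
    assert(d != NULL);
    d->sched_ready = true;
}

/* Load and every resume: arm the hardware. */
static void toy_hw_init(struct toy_fence_drv *d)
{
    assert(d != NULL);
    d->hw_enabled = true;
}

/* Suspend and unload: quiesce hardware only, leave the scheduler alone. */
static void toy_hw_fini(struct toy_fence_drv *d)
{
    assert(d != NULL);
    d->hw_enabled = false;
}

/* Driver unload only: tear down the scheduler, paired with toy_sw_init(). */
static void toy_sw_fini(struct toy_fence_drv *d)
{
    assert(d != NULL);
    d->sched_ready = false;
}
```

If the scheduler teardown lived in the hw_fini stage, the first suspend would destroy it and nothing on the resume path would bring it back.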
2021-08-05drm/amdgpu: Fix channel_index table layout for AldebaranMukul Joshi3-12/+12
Fix the channel_index table layout to fetch the correct channel_index when calculating physical address from normalized address during page retirement. Also, fix the number of UMC instances and number of channels within each UMC instance for Aldebaran. Signed-off-by: Mukul Joshi <[email protected]> Reviewed-By: John Clements <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: fix checking pmops when PM_SLEEP is not enabledRandy Dunlap1-1/+1
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set, PM_SLEEP is not set, so this variable cannot be used. ../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’? return pm_suspend_target_state == PM_SUSPEND_TO_IDLE; ^~~~~~~~~~~~~~~~~~~~~~~ __KSYM_pm_suspend_target_state Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the 2 config symbols. Fixes: 91e273712ab8dd ("drm/amdgpu: Check pmops for desired suspend state") Signed-off-by: Randy Dunlap <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]>
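The build failure and its fix follow a common pattern: a variable that exists only under a config option may be referenced only inside a matching guard. The sketch below is a userspace approximation under assumptions: TOY_CONFIG_PM_SLEEP is a hypothetical switch, and a plain #ifdef is used in place of the kernel's IS_ENABLED() macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Uncomment to model a build with sleep support (hypothetical switch). */
/* #define TOY_CONFIG_PM_SLEEP 1 */

enum toy_suspend_state {
    TOY_SUSPEND_ON,
    TOY_SUSPEND_TO_IDLE,
    TOY_SUSPEND_MEM
};

#ifdef TOY_CONFIG_PM_SLEEP
/* Only declared when sleep support is built in, like the real variable. */
static enum toy_suspend_state toy_suspend_target_state = TOY_SUSPEND_TO_IDLE;
#endif

static bool toy_is_s0ix_active(void)
{
#ifdef TOY_CONFIG_PM_SLEEP
    return toy_suspend_target_state == TOY_SUSPEND_TO_IDLE;
#else
    return false;   /* no sleep support: the variable does not exist */
#endif
}
```

With the switch left undefined, the function compiles without ever naming the missing variable and simply reports that s0ix is not active.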
2021-08-05drm/amdgpu: add DID for beige gobyChengming Gui1-0/+7
Add device ids. Signed-off-by: Chengming Gui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-05drm/amdgpu: fix checking pmops when PM_SLEEP is not enabledRandy Dunlap1-1/+1
'pm_suspend_target_state' is only available when CONFIG_PM_SLEEP is set/enabled. OTOH, when both SUSPEND and HIBERNATION are not set, PM_SLEEP is not set, so this variable cannot be used. ../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c: In function ‘amdgpu_acpi_is_s0ix_active’: ../drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c:1046:11: error: ‘pm_suspend_target_state’ undeclared (first use in this function); did you mean ‘__KSYM_pm_suspend_target_state’? return pm_suspend_target_state == PM_SUSPEND_TO_IDLE; ^~~~~~~~~~~~~~~~~~~~~~~ __KSYM_pm_suspend_target_state Also use shorter IS_ENABLED(CONFIG_foo) notation for checking the 2 config symbols. Fixes: 91e273712ab8dd ("drm/amdgpu: Check pmops for desired suspend state") Signed-off-by: Randy Dunlap <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Christian König <[email protected]> Cc: "Pan, Xinhui" <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2021-08-02drm/amdgpu: fix the doorbell missing when in CGPG issue for renoir.Yifan Zhang1-1/+20
If GC has entered CGPG, ringing a doorbell beyond the first page doesn't wake up GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to work around this issue. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-02Revert "Revert "drm/amdkfd: Only apply TLB flush optimization on ALdebaran""Eric Huang1-0/+6
This reverts commit 53d0533049a573298f74ae07a39db14163960e68. Revert reason: The issue has been resolved. Signed-off-by: Eric Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-02Revert "Revert "drm/amdgpu: Fix warning of Function parameter or member not ↵Eric Huang1-0/+1
described"" This reverts commit 4e7b93ca52fb228b177168d436449c5671415a72. Revert reason: The issue has been resolved. Signed-off-by: Eric Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-02Revert "Revert "drm/amdkfd: Make TLB flush conditional on mapping""Eric Huang2-9/+12
This reverts commit 7ed9876c9793bfe96fed58ba645d6c8e32f26001. Revert reason: The issue has been resolved. Signed-off-by: Eric Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-02Revert "Revert "drm/amdgpu: Add table_freed parameter to amdgpu_vm_bo_update""Eric Huang4-10/+10
This reverts commit 024d8811c90ed56d8b90cdcf71e51c9fedeff460. Revert reason: The issue has been resolved. Signed-off-by: Eric Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-08-02drm/amdgpu: Fix out-of-bounds read when update mappingxinhui pan1-1/+2
If one GTT BO has been evicted/swapped out, it should sit in CPU domain. TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node for sysMem. Now when we update mapping for such invalidated BOs, we might walk out of bounds of struct ttm_resource. Three possible fix: 1) Let sysMem manager alloc struct ttm_range_mgr_node, like ttm_range_manager does. 2) Pass pages_addr to update_mapping function too, but need memset pages_addr[] to zero when unpopulate. 3) Init amdgpu_res_cursor directly. bug is detected by kfence. ================================================================== BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0 Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167): amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu] amdgpu_vm_bo_update+0x282/0xa40 [amdgpu] amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu] amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu] amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu] drm_ioctl_kernel+0xf3/0x180 [drm] drm_ioctl+0x2cb/0x550 [drm] amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu] kfence-#167 [0x000000008e11c055-0x000000001f676b3e ttm_sys_man_alloc+0x35/0x80 [ttm] ttm_resource_alloc+0x39/0x50 [ttm] ttm_bo_swapout+0x252/0x5a0 [ttm] ttm_device_swapout+0x107/0x180 [ttm] ttm_global_swapout+0x6f/0x130 [ttm] ttm_tt_populate+0xb1/0x2a0 [ttm] ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm] ttm_mem_evict_first+0x59d/0x9c0 [ttm] ttm_bo_mem_space+0x39f/0x400 [ttm] ttm_bo_validate+0x13c/0x340 [ttm] ttm_bo_init_reserved+0x269/0x540 [ttm] amdgpu_bo_create+0x1d1/0xa30 [amdgpu] amdgpu_bo_create_user+0x40/0x80 [amdgpu] amdgpu_gem_object_create+0x71/0xc0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu] kfd_ioctl+0x461/0x690 [amdgpu] Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-30drm/amdgpu: fix the doorbell missing when in CGPG issue for renoir.Yifan Zhang1-1/+20
If GC has entered CGPG, ringing a doorbell beyond the first page doesn't wake up GC. Enlarge CP_MEC_DOORBELL_RANGE_UPPER to work around this issue. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-30drm/amdgpu: Fix out-of-bounds read when update mappingxinhui pan1-1/+2
If one GTT BO has been evicted/swapped out, it should sit in CPU domain. TTM only alloc struct ttm_resource instead of struct ttm_range_mgr_node for sysMem. Now when we update mapping for such invalidated BOs, we might walk out of bounds of struct ttm_resource. Three possible fix: 1) Let sysMem manager alloc struct ttm_range_mgr_node, like ttm_range_manager does. 2) Pass pages_addr to update_mapping function too, but need memset pages_addr[] to zero when unpopulate. 3) Init amdgpu_res_cursor directly. bug is detected by kfence. ================================================================== BUG: KFENCE: out-of-bounds read in amdgpu_vm_bo_update_mapping+0x564/0x6e0 Out-of-bounds read at 0x000000008ea93fe9 (64B right of kfence-#167): amdgpu_vm_bo_update_mapping+0x564/0x6e0 [amdgpu] amdgpu_vm_bo_update+0x282/0xa40 [amdgpu] amdgpu_vm_handle_moved+0x19e/0x1f0 [amdgpu] amdgpu_cs_vm_handling+0x4e4/0x640 [amdgpu] amdgpu_cs_ioctl+0x19e7/0x23c0 [amdgpu] drm_ioctl_kernel+0xf3/0x180 [drm] drm_ioctl+0x2cb/0x550 [drm] amdgpu_drm_ioctl+0x5e/0xb0 [amdgpu] kfence-#167 [0x000000008e11c055-0x000000001f676b3e ttm_sys_man_alloc+0x35/0x80 [ttm] ttm_resource_alloc+0x39/0x50 [ttm] ttm_bo_swapout+0x252/0x5a0 [ttm] ttm_device_swapout+0x107/0x180 [ttm] ttm_global_swapout+0x6f/0x130 [ttm] ttm_tt_populate+0xb1/0x2a0 [ttm] ttm_bo_handle_move_mem+0x17e/0x1d0 [ttm] ttm_mem_evict_first+0x59d/0x9c0 [ttm] ttm_bo_mem_space+0x39f/0x400 [ttm] ttm_bo_validate+0x13c/0x340 [ttm] ttm_bo_init_reserved+0x269/0x540 [ttm] amdgpu_bo_create+0x1d1/0xa30 [amdgpu] amdgpu_bo_create_user+0x40/0x80 [amdgpu] amdgpu_gem_object_create+0x71/0xc0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x2f2/0xcd0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0xe2/0x330 [amdgpu] kfd_ioctl+0x461/0x690 [amdgpu] Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-30Merge tag 'amd-drm-next-5.15-2021-07-29' of ↵Dave Airlie60-1164/+3006
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-07-29:
amdgpu:
- VCN/JPEG power down sequencing fixes
- Various navi pcie link handling fixes
- Clockgating fixes
- Yellow Carp fixes
- Beige Goby fixes
- Misc code cleanups
- S0ix fixes
- SMU i2c bus rework
- EEPROM handling rework
- PSP ucode handling cleanup
- SMU error handling rework
- AMD HDMI freesync fixes
- USB PD firmware update rework
- MMIO based vram access rework
- Misc display fixes
- Backlight fixes
- Add initial Cyan Skillfish support
- Overclocking fixes suspend/resume
amdkfd:
- Sysfs leak fix
- Add counters for vm faults and migration
- GPUVM TLB optimizations
radeon:
- Misc fixes
Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2021-07-28drm/amdgpu: enable psp front door loading by default for cyan_skillfish2Huang Rui1-3/+4
The function is ready in the psp firmware, so enable it by default. Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: adjust fence driver enable sequenceLikun Gao3-47/+14
Previously the fence driver was enabled per ring during sw init of each IP block. Change this to enable all fence drivers at the same time, after amdgpu_device_ip_init has finished. Rename some fence-related functions to make them easier to read. Signed-off-by: Likun Gao <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: Added PSP13 BL loading support for additional driversJohn Clements1-0/+18
Added BL loading support for soc/intf/dbg drivers Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: Consolidated PSP13 BL FW loadingJohn Clements1-35/+11
Remove duplicate code Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]>. Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: Added support for added psp driver binaries FWJohn Clements3-7/+70
Detect psp driver binaries packed into FW and try to load the FW Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: Added latest PSP FW headerJohn Clements2-21/+116
Improved handling for scaling PSP FW binaries Signed-off-by: John Clements <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu: remove the access of xxx_PSP_DEBUG on cycan_skillfishHuang Rui1-2/+0
It won't need to clear the xxx_PSP_DEBUG registers, because firmware will handle this change. Signed-off-by: Huang Rui <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-07-28drm/amdgpu/display: add support for multiple backlightsAlex Deucher1-2/+2
On platforms that support multiple backlights, register each one separately. This lets us manage them independently rather than registering a single backlight and applying the same settings to both. v2: fix typo: Reported-by: kernel test robot <[email protected]> Reviewed-by: Roman Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>