aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)AuthorFilesLines
2022-09-29drm/amdgpu: Fixed ras warning when uninstalling amdgpuYiPeng Chai1-1/+2
For the asic using smu v13_0_2, there is the following warning when uninstalling amdgpu: amdgpu: ras disable gfx failed poison:1 ret:-22. [Why]: For the asic using smu v13_0_2, the psp .suspend and mode1reset is called before amdgpu_ras_pre_fini during amdgpu uninstall, it has disabled all ras features and reset the psp. Since the psp is reset, calling amdgpu_ras_disable_all_features in amdgpu_ras_pre_fini to disable ras features will fail. [How]: If all ras features are disabled, amdgpu_ras_disable_all_features will not be called to disable all ras features again. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-29drm/amdgpu/gfx11: switch to amdgpu_gfx_rlc_init_microcodeHawking Zhang1-152/+4
switch to common helper to initialize rlc firmware for gfx11 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-29drm/amdgpu/gfx10: switch to amdgpu_gfx_rlc_init_microcodeHawking Zhang1-187/+9
switch to common helper to initialize rlc firmware for gfx10 v2: squash in size validation fix (Alex) Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-28Merge tag 'amd-drm-next-6.1-2022-09-23' of ↵Dave Airlie27-409/+864
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.1-2022-09-23: amdgpu: - SDMA fix - Add new firmware types to debugfs/IOCTL version queries - Misc spelling and grammar fixes - Misc code cleanups - DCN 3.2.x fixes - DCN 3.1.x fixes - CS cleanup - Gang submit support - Clang fixes - Non-DC audio fix - GPUVM locking fixes - Vega10 PWN fan speed fix amdkgd: - MQD manager cleanup - Misc spelling and grammar fixes UAPI: - Add new firmware types to the FW version query IOCTL Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2022-09-28Merge tag 'drm-misc-next-2022-09-23' of ↵Dave Airlie2-2/+14
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 6.1: UAPI Changes: Cross-subsystem Changes: - dma-buf: Improve signaling when debugging Core Changes: - Backlight handling improvements - format-helper: Add drm_fb_build_fourcc_list() - fourcc: Kunit tests improvements - modes: Add DRM_MODE_INIT() macro - plane: Remove drm_plane_init(), Allocate planes with drm_universal_plane_alloc() - plane-helper: Add drm_plane_helper_atomic_check() - probe-helper: Add drm_connector_helper_get_modes_fixed() and drm_crtc_helper_mode_valid_fixed() - tests: Conversion to parametrized tests, test name consistency Driver Changes: - amdgpu: Fix for a VRAM eviction issue - ast: Resolution handling improvements - mediatek: small code improvements for DP - omap: Refcounting fix, small improvements - rockchip: RK3568 support, Gamma support for RK3399 - sun4i: Build failure fix when !OF - udl: Multiple fixes here and there - vc4: HDMI hotplug handling improvements - vkms: Warning fix Signed-off-by: Dave Airlie <[email protected]> From: Maxime Ripard <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20220923073943.d43tne5hni3iknlv@houat
2022-09-27drm/amdgpu: Add amdgpu suspend-resume code path under SRIOVBokun Zhang2-1/+30
- Under SRIOV, we need to send REQ_GPU_FINI to the hypervisor during the suspend time. Furthermore, we cannot request a mode 1 reset under SRIOV as VF. Therefore, we will skip it as it is called in suspend_noirq() function. - In the resume code path, we need to send REQ_GPU_INIT to the hypervisor and also resume PSP IP block under SRIOV. Signed-off-by: Bokun Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-09-27drm/amdgpu: Remove fence_process in count_emittedJiadong.Zhu1-1/+0
The function amdgpu_fence_count_emitted used in work_hander should not call amdgpu_fence_process which must be used in irq handler. Reviewed-by: Christian König <[email protected]> Signed-off-by: Jiadong.Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-27drm/amdgpu: Correct the position in patch_cond_execJiadong.Zhu1-1/+1
The current position calulated in gfx_v9_0_ring_emit_patch_cond_exec underflows when the wptr is divisible by ring->buf_mask + 1. Reviewed-by: Christian König <[email protected]> Signed-off-by: Jiadong.Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-27drm/amdgpu: pass queue size and is_aql_queue to MESGraham Sider2-0/+6
Update mes_v11_api_def.h add_queue API with is_aql_queue parameter. Also re-use gds_size for the queue size (unused for KFD). MES requires the queue size in order to compute the actual wptr offset within the queue RB since it increases monotonically for AQL queues. v2: Make is_aql_queue assign clearer Signed-off-by: Graham Sider <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-27drm/amdgpu: avoid gfx register accessing during gfxoffEvan Quan1-0/+4
Make sure gfxoff is disabled before gfx register accessing. Signed-off-by: Evan Quan <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-27drm/amdgpu/gfx9: switch to amdgpu_gfx_rlc_init_microcodeHawking Zhang1-103/+3
switch to common helper to initialize rlc firmware for gfx9 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-27drm/amdgpu: add helper to init rlc firmwareHawking Zhang2-1/+38
To initialzie rlc firmware according to rlc firmware header version v2: squash in backwards compat fix Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-24drm: POC drm on dyndbg - use in core, 2 helpers, 3 drivers.Jim Cromie1-0/+14
Use DECLARE_DYNDBG_CLASSMAP across DRM: - in .c files, since macro defines/initializes a record - in drivers, $mod_{drv,drm,param}.c ie where param setup is done, since a classmap is param related - in drm/drm_print.c since existing __drm_debug param is defined there, and we ifdef it, and provide an elaborated alternative. - in drm_*_helper modules: dp/drm_dp - 1st item in makefile target drivers/gpu/drm/drm_crtc_helper.c - random pick iirc. Since these modules all use identical CLASSMAP declarations (ie: names and .class_id's) they will all respond together to "class DRM_UT_*" query-commands: :#> echo class DRM_UT_KMS +p > /proc/dynamic_debug/control NOTES: This changes __drm_debug from int to ulong, so BIT() is usable on it. DRM's enum drm_debug_category values need to sync with the index of their respective class-names here. Then .class_id == category, and dyndbg's class FOO mechanisms will enable drm_dbg(DRM_UT_KMS, ...). Though DRM needs consistent categories across all modules, thats not generally needed; modules X and Y could define FOO differently (ie a different NAME => class_id mapping), changes are made according to each module's private class-map. No callsites are actually selected by this patch, since none are class'd yet. Signed-off-by: Jim Cromie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
2022-09-22drm/amdgpu: Fix VRAM eviction issueArunpravin Paneer Selvam1-1/+1
A user reported that when he starts a game (MTGA) with wine, he observed an error msg failed to pin framebuffer with error -12. Found an issue with the VRAM mem type eviction decision condition logic. This patch will fix the if condition code error. Gitlab bug link: https://gitlab.freedesktop.org/drm/amd/-/issues/2159 Fixes: ded910f368a5 ("drm/amdgpu: Implement intersect/compatible functions") Signed-off-by: Arunpravin Paneer Selvam <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Christian König <[email protected]> Signed-off-by: Christian König <[email protected]>
2022-09-21drm/amdgpu: don't register a dirty callback for non-atomicAlex Deucher1-1/+10
Some asics still support non-atomic code paths. Fixes: 66f99628eb2440 ("drm/amdgpu: use dirty framebuffer helper") Reported-by: Arthur Marsh <[email protected]> Reviewed-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Update PTE flags with TF enabledMukul Joshi2-4/+6
This patch updates the PTE flags when translate further (TF) is enabled: - With translate_further enabled, invalid PTEs can be 0. Reading consecutive invalid PTEs as 0 is considered a fault. To prevent this, ensure invalid PTEs have at least 1 bit set. - The current invalid PTE flags settings to translate a retry fault into a no-retry fault, doesn't work with TF enabled. As a result, update invalid PTE flags settings which works for both TF enabled and disabled case. Fixes: 352e683b72e79d ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: add helper to init rlc fw in header v2_4Hawking Zhang1-0/+60
To initialize rlc firmware in header v2_4 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: add helper to init rlc fw in header v2_3Hawking Zhang1-0/+35
To initialize rlc firmware in header v2_3 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: add helper to init rlc fw in header v2_2Hawking Zhang1-0/+30
To initialize rlc firmware in header v2_2 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: add helper to init rlc fw in header v2_1Hawking Zhang1-0/+40
To initialize rlc firmware in header v2_1 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: add helper to init rlc fw in header v2_0Hawking Zhang1-0/+64
To initialize rlc firmware in header v2_0 Signed-off-by: Hawking Zhang <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Fix amdgpu_vm_pt_free warningPhilip Yang3-3/+47
Free page table BO from vm resv unlocked context generate below warnings. Add a pt_free_work in vm to free page table BO from vm->pt_freed list. pass vm resv unlock status from page table update caller, and add vm_bo entry to vm->pt_freed list and schedule the pt_free_work if calling with vm resv unlocked. WARNING: CPU: 12 PID: 3238 at drivers/gpu/drm/ttm/ttm_bo.c:106 ttm_bo_set_bulk_move+0xa1/0xc0 Call Trace: amdgpu_vm_pt_free+0x42/0xd0 [amdgpu] amdgpu_vm_pt_free_dfs+0xb3/0xf0 [amdgpu] amdgpu_vm_ptes_update+0x52d/0x850 [amdgpu] amdgpu_vm_update_range+0x2a6/0x640 [amdgpu] svm_range_unmap_from_gpus+0x110/0x300 [amdgpu] svm_range_cpu_invalidate_pagetables+0x535/0x600 [amdgpu] __mmu_notifier_invalidate_range_start+0x1cd/0x230 unmap_vmas+0x9d/0x140 unmap_region+0xa8/0x110 Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Use vm status_lock to protect pt freePhilip Yang1-0/+3
Use vm_status_lock to protect all vm_status state transitions to allow them to happen without a reservation lock in unlocked page table updates. Signed-off-by: Philip Yang <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Use vm status_lock to protect vm evicted listPhilip Yang1-5/+22
Use vm_status_lock to protect all vm_status state transitions to allow them to happen without a reservation lock in unlocked page table updates. Signed-off-by: Philip Yang <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Use vm status_lock to protect vm moved listPhilip Yang1-4/+11
Use vm_status_lock to protect all vm_status state transitions to allow them to happen without a reservation lock in unlocked page table updates. Signed-off-by: Philip Yang <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Use vm status_lock to protect vm idle listPhilip Yang1-1/+3
Use vm_status_lock to protect all vm_status state transitions to allow them to happen without a reservation lock in unlocked page table updates. Signed-off-by: Philip Yang <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Use vm status_lock to protect relocated listPhilip Yang1-8/+15
Use vm_status_lock to protect all vm_status state transitions to allow them to happen without a reservation lock in unlocked page table updates. Signed-off-by: Philip Yang <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-21drm/amdgpu: Rename vm invalidate lock to status_lockPhilip Yang2-16/+18
The vm status_lock will be used to protect all vm status lists. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: fix initial connector audio valuehongao1-1/+6
This got lost somewhere along the way, This fixes audio not working until set_property was called. Signed-off-by: hongao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: don't register a dirty callback for non-atomicAlex Deucher1-1/+10
Some asics still support non-atomic code paths. Fixes: 66f99628eb2440 ("drm/amdgpu: use dirty framebuffer helper") Reported-by: Arthur Marsh <[email protected]> Reviewed-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: bump minor for gang submitChristian König1-1/+2
Since that has now landed bump the minor to let userspace know about it. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: properly initialize return value during CSChristian König1-0/+1
The return value is no longer initialized before the loop because of moving code around. Signed-off-by: Christian König <[email protected]> Fixes: c2b08e7a6d27 ("drm/amdgpu: move entity selection and job init earlier during CS") Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: add gang submit frontend v6Christian König5-99/+195
Allows submitting jobs as gang which needs to run on multiple engines at the same time. All members of the gang get the same implicit, explicit and VM dependencies. So no gang member will start running until everything else is ready. The last job is considered the gang leader (usually a submission to the GFX ring) and used for signaling output dependencies. Each job is remembered individually as user of a buffer object, so there is no joining of work at the end. v2: rebase and fix review comments from Andrey and Yogesh v3: use READ instead of BOOKKEEP for now because of VM unmaps, set gang leader only when necessary v4: fix order of pushing jobs and adding fences found by Trigger. v5: fix job index calculation and adding IBs to jobs v6: fix typo found by Alex Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: add gang submit backend v2Christian König4-2/+67
Allows submitting jobs as gang which needs to run on multiple engines at the same time. Basic idea is that we have a global gang submit fence representing when the gang leader is finally pushed to run on the hardware last. Jobs submitted as gang are never re-submitted in case of a GPU reset since this won't work and will just deadlock the hardware immediately again. v2: fix logic inversion, improve documentation, fix rcu Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-20drm/amdgpu: cleanup instance limit on VCN4 v4Christian König1-19/+23
Similar to what we did for VCN3 use the job instead of the parser entity. Cleanup the coding style quite a bit as well. v2: merge improved application check into this patch v3: finally fix the check v4: limit to the correct engine Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: revert "fix limiting AV1 to the first instance on VCN3" v3Christian König1-7/+10
This reverts commit 250195ff744f260c169f5427422b6f39c58cb883. The job should now be initialized when we reach the parser functions. v2: merge improved application check into this patch v3: back to the original test, but use the right ring Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: move entity selection and job init earlier during CSChristian König2-29/+30
Initialize the entity for the CS and scheduler job much earlier. v2: fix job initialisation order and use correct scheduler instance Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: cleanup error handling in amdgpu_cs_parser_bosChristian König1-19/+18
Return early on success and so remove all those "if (r)" in the error path. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: cleanup CS pass2 v6Christian König1-184/+168
Cleanup the coding style and function names to represent the data they process for pass2 as well. Go over the chunks only twice now instead of multiple times. v2: fix job initialisation order and use correct scheduler instance v3: try to move all functional changes into a separate patch. v4: separate reordering, pass1 and pass2 change v5: fix va_start calculation v6: fix user fence check Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: Fixed psp fence and memory issues when removing amdgpu deviceYiPeng Chai3-1/+10
V3: Fixed psp fence and memory issues for the asic using smu v13_0_2 when removing amdgpu device. [Why]: 1. psp_suspend->psp_free_shared_bufs-> psp_ta_free_shared_buf-> amdgpu_bo_free_kernel-> ...->amdgpu_bo_release_notify-> amdgpu_fill_buffer psp will free vram memory used by psp when psp_suspend is called. But for the asic using smu v13_0_2, because psp_suspend is called before adev->shutdown is set to true when removing the first hive device, amdgpu fill_buffer will be called, which will cause fence issues when evicting all vram resources in amdgpu vram mgr_fini. 2. Since psp_hw_fini is not called after calling psp_suspend and psp_suspend only calls psp_ring_stop, the psp ring memory will not be released when amdgpu device is removed. [How]: 1. Set shutdown to true before calling amdgpu_device_gpu_recover, then amdgpu_fill_buffer will not be called when psp_suspend is called. 2. Free psp ring memory in psp_sw_fini. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: Adjust removal control flow for smu v13_0_2YiPeng Chai4-0/+62
Adjust removal control flow for smu v13_0_2: During amdgpu uninstallation, when removing the first device, the kernel needs to first send a mode1reset message to all gpu devices. Otherwise, smu initialization will fail the next time amdgpu is installed. V2: 1. Update commit comments. 2. Remove the global variable amdgpu_device_remove_cnt and add a variable to the structure amdgpu_hive_info. 3. Use hive to detect the first removed device instead of a global variable. V3: 1. Update commit comments. 2. Split a patch into multiple patches. 3. The current patch does: a. Add a work mode of AMDGPU_RESET_FOR_DEVICE_REMOVE into the existing gpu recover path, which make all devices in hive list only have HW reset but no resume (except the base IP). b. Call AMDGPU_RESET_FOR_DEVICE_REMOVE and AMDGPU_NEED_FULL_RESET mode of amdgpu_device_gpu_recover in amdgpu_pci_remove when removing the first device in hive list. c. When removing the first device, the IP blocks keyword function call sequence is as follows: .suspend->mode1reset->.resume(basic ip)->.hw_fini->.early_fini->.sw_fini. ^ | |-<----------<---------<----| The first three sequences are because of a call to amdgpu_device_gpu_recover. The three sequences will be executed in a loop until all devices in the hive list are iterated. The sequences starting from .hw_fini only apply to the first device. Since .suspend has been called before, except the resumed phase1 basic ip blocks, all other ip blocks .hw_fini of current device will do nothing. d. When removing other devices, the calling sequences is the same as legacy: .hw_fini -> .early_fini -> .sw_fini. Since .suspend has been called when removing the first device, except the resumed phase1 basic ip blocks, all of other ip blocks .hw_fini of current device will do nothing. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: add MES and MES-KIQ version in debugfsYifan Zhang1-0/+24
This patch addes MES and MES-KIQ version in debugfs. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Tim Huang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amd/display: clean up some inconsistent indentingsYang Li1-8/+8
clean up some inconsistent indentings Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2178 Reported-by: Abaci Robot <[email protected]> Signed-off-by: Yang Li <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: add rlcv/rlcp version info to debugfsHawking Zhang1-0/+24
amdgpu_firmware_info debugfs will show rlcv/rlcp ucode version info Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: support print rlc v2_x ucode hdrHawking Zhang1-50/+118
add rlc v2_x support to print_rlc_hdr helper Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: save rlcv/rlcp ucode version in amdgpu_gfxHawking Zhang3-0/+13
cache rlcv/rlcvp ucode version info in amdgpu_gfx structure Signed-off-by: Hawking Zhang <[email protected]> Reviewed-by: Likun Gao <[email protected]> Reviewed-by: Feifei Xu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: Update PTE flags with TF enabledMukul Joshi2-4/+6
This patch updates the PTE flags when translate further (TF) is enabled: - With translate_further enabled, invalid PTEs can be 0. Reading consecutive invalid PTEs as 0 is considered a fault. To prevent this, ensure invalid PTEs have at least 1 bit set. - The current invalid PTE flags settings to translate a retry fault into a no-retry fault, doesn't work with TF enabled. As a result, update invalid PTE flags settings which works for both TF enabled and disabled case. Fixes: 352e683b72e79d ("drm/amdgpu: Enable translate_further to extend UTCL2 reach") Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-19drm/amdgpu: SDMA update use unlocked iteratorPhilip Yang1-3/+6
SDMA update page table may be called from unlocked context, this generate below warning. Use unlocked iterator to handle this case. WARNING: CPU: 0 PID: 1475 at drivers/dma-buf/dma-resv.c:483 dma_resv_iter_next Call Trace: dma_resv_iter_first+0x43/0xa0 amdgpu_vm_sdma_update+0x69/0x2d0 [amdgpu] amdgpu_vm_ptes_update+0x29c/0x870 [amdgpu] amdgpu_vm_update_range+0x2f6/0x6c0 [amdgpu] svm_range_unmap_from_gpus+0x115/0x300 [amdgpu] svm_range_cpu_invalidate_pagetables+0x510/0x5e0 [amdgpu] __mmu_notifier_invalidate_range_start+0x1d3/0x230 unmap_vmas+0x140/0x150 unmap_region+0xa8/0x110 Signed-off-by: Philip Yang <[email protected]> Suggested-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-14drm/amdgpu/mes: zero the sdma_hqd_mask of 2nd SDMA engine for SDMA 6.0.1Yifan Zhang1-0/+3
there is only one SDMA engine in SDMA 6.0.1, the sdma_hqd_mask has to be zeroed for the 2nd engine, otherwise MES scheduler will consider 2nd engine exists and map/unmap SDMA queues to the non-existent engine. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Tim Huang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-09-14drm/amdgpu: make sure to init common IP before gmcAlex Deucher1-3/+11
Move common IP init before GMC init so that HDP gets remapped before GMC init which uses it. This fixes the Unsupported Request error reported through AER during driver load. The error happens as a write happens to the remap offset before real remapping is done. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216373 The error was unnoticed before and got visible because of the commit referenced below. This doesn't fix anything in the commit below, rather fixes the issue in amdgpu exposed by the commit. The reference is only to associate this commit with below one so that both go together. Fixes: 8795e182b02d ("PCI/portdrv: Don't disable AER reporting in get_port_device_capability()") Acked-by: Christian König <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]