path: root/drivers/gpu/drm/amd/amdgpu
Age | Commit message | Author | Files | Lines
2022-03-31 | drm/amdgpu: Sync up header and implementation to use the same parameter names | Ma Jun | 1 | -2/+2
Sync up header and implementation to use the same parameter names in function amdgpu_ring_init. ring_size -> max_dw, prio -> hw_prio Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Ma Jun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-31 | drm/amdgpu: fix incorrect GCR_GENERAL_CNTL address | Ruili Ji | 1 | -3/+3
gfx10.3.3/gfx10.3.6/gfx10.3.7 shall use 0x1580 address for GCR_GENERAL_CNTL Acked-by: Prike Liang <[email protected]> Acked-by: Yifan Zhang <[email protected]> Reviewed-by: Aaron Liu <[email protected]> Signed-off-by: Ruili Ji <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-31 | drm/amdgpu: fix incorrect size printing in error msg | Christian König | 1 | -1/+1
The size is in bytes, not pages. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-31 | drm/amdgpu: fix some kerneldoc in the VM code v2 | Christian König | 2 | -2/+2
Fix two incorrect kerneldocs for the recent VM code changes. v2: fix one more typo Signed-off-by: Christian König <[email protected]> Reported-by: kernel test robot <[email protected]> Reported-by: Stephen Rothwell <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-31 | drm/amdgpu: Add tlb_cb for unlocked update | Philip Yang | 1 | -1/+1
Flushing the TLB needs to wait until the GPU update fence is done. The MMU notifier callback that unmaps a range from the GPUs uses an unlocked GPU page table update, so add tlb_cb to the unlocked update fence to increase vm->tlb_seq. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-31 | drm/amdgpu: Correct unlocked update fence handling | Philip Yang | 1 | -1/+1
Fix two issues with the unlocked update fence: 1. vm->last_unlocked stores the latest fence without taking a refcount. 2. amdgpu_vm_bo_update_mapping returns the old fence, not the latest fence. Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
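The first issue above, storing a fence pointer without holding a reference, can be illustrated with a small userspace sketch. This is not the actual patch: `toy_fence`, `toy_vm`, and all helper names are hypothetical stand-ins, and the real driver uses the kernel's dma_fence reference counting instead of a plain integer.

```c
#include <stddef.h>

/* Hypothetical, simplified fence with a plain (non-atomic) reference count. */
struct toy_fence {
    int refcount;
};

/* Take a reference; returns the fence so calls can be chained. */
struct toy_fence *toy_fence_get(struct toy_fence *fence)
{
    if (fence != NULL)
        fence->refcount++;
    return fence;
}

/* Drop a reference. A real implementation would free the fence at zero. */
void toy_fence_put(struct toy_fence *fence)
{
    if (fence != NULL)
        fence->refcount--;
}

/* Hypothetical VM state that remembers the most recent unlocked update. */
struct toy_vm {
    struct toy_fence *last_unlocked;
};

/*
 * Store the latest fence while holding a reference on it, and drop the
 * reference that was held on the previously stored fence.
 */
void toy_vm_set_last_unlocked(struct toy_vm *vm, struct toy_fence *fence)
{
    struct toy_fence *old = vm->last_unlocked;

    vm->last_unlocked = toy_fence_get(fence);
    toy_fence_put(old);
}
```

With this pattern the stored pointer stays valid even after the caller drops its own reference, which is the property the commit message says was missing.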
2022-03-28 | drm/amdgpu/jpeg: Add jpeg ras error query support | Mohammad Zafar Ziya | 2 | -0/+81
RAS error query support addition for JPEG 2.6 V2: removed unused options and corrected comment format. Moved register definition to header file. V3: poison query status check added. Removed the error query support V4: Return statement refactored. Signed-off-by: Mohammad Zafar Ziya <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu/vcn: Add VCN ras error query support | Mohammad Zafar Ziya | 3 | -0/+78
RAS error query support addition for VCN 2.6 V2: removed unused option and corrected comment format Moved the register definition under header file V3: poison query status check added. Removed error query interface V4: MMSCH poison check option removed, return true/false refactored. Signed-off-by: Mohammad Zafar Ziya <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu/jpeg: Add jpeg block ras support | Mohammad Zafar Ziya | 1 | -0/+8
Ras support addition for JPEG block V2: removed default callback Signed-off-by: Mohammad Zafar Ziya <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu/vcn: Add vcn ras support | Mohammad Zafar Ziya | 1 | -0/+9
VCN block ras feature support addition V2: default ras callback removed Signed-off-by: Mohammad Zafar Ziya <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu: Add vcn and jpeg ras support flag | Mohammad Zafar Ziya | 2 | -0/+11
Add vcn and jpeg ras support options V2: vcn and jpeg ras flag enabled for aldebaran asic only V3: vcn and jpeg ras flag disabled for error counter query Generic poison query interface added VCN and JPEG ras enabled based on IP version check V4: vcn and jpeg ras flag moved under ecc flag for dGPU Signed-off-by: Mohammad Zafar Ziya <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amd/vcn: fix an error msg on vcn 3.0 | tiancyin | 1 | -1/+1
Some video cards have more than one vcn instance, so passing 0 to vcn_v3_0_pause_dpg_mode is incorrect. Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002 Reviewed-by: James Zhu <[email protected]> Signed-off-by: tiancyin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu: Re-classify some log messages in commit path | Sean Paul | 1 | -2/+3
ATOMIC and DRIVER log categories do not typically contain per-frame log messages. This patch re-classifies some messages in amd to chattier categories to keep ATOMIC/DRIVER quiet. Acked-by: Christian König <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Sean Paul <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-28 | drm/amdgpu/vcn3: send smu interface type | Boyuan Zhang | 2 | -0/+12
Lets the VCN FW detect the ASIC type so that it can use different mailbox registers. V2: simplify code and fix format issue. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: remove table_freed param from the VM code | Christian König | 5 | -21/+14
Better to leave the decision when to flush the VM changes in the TLB to the VM code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdkfd: use tlb_seq from the VM subsystem for SVM as well v2 | Christian König | 2 | -13/+10
Instead of hand rolling the table_freed parameter. v2: add some changes suggested by Philip Signed-off-by: Christian König <[email protected]> Reviewed-by: Philip Yang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: rework TLB flushing | Christian König | 6 | -32/+76
Instead of tracking the VM updates through the dependencies just use a sequence counter for page table updates which indicates the need to flush the TLB. This reduces the need to flush the TLB drastically. v2: squash in NULL check fix (Christian) Signed-off-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
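The sequence-counter idea described above can be sketched in a few lines of userspace C. This is a minimal model under stated assumptions, not the driver's implementation: `toy_vm`, `flushed_seq`, `flush_count`, and both function names are hypothetical, and only the name `tlb_seq` is borrowed from the commit log.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified VM state; field names are illustrative only. */
struct toy_vm {
    uint64_t tlb_seq;         /* bumped when a page table update requires a flush */
    uint64_t flushed_seq;     /* value of tlb_seq at the time of the last flush */
    unsigned int flush_count; /* number of flushes actually performed */
};

/* Record a page table update; only some updates invalidate TLB entries. */
void toy_vm_update_page_tables(struct toy_vm *vm, bool needs_flush)
{
    if (needs_flush)
        vm->tlb_seq++;
}

/*
 * Called before work is submitted: flush only if the counter moved since
 * the last flush. Returns true when a flush was performed.
 */
bool toy_vm_flush_if_needed(struct toy_vm *vm)
{
    if (vm->flushed_seq == vm->tlb_seq)
        return false;

    vm->flushed_seq = vm->tlb_seq;
    vm->flush_count++;
    return true;
}
```

Several updates between two submissions collapse into a single flush, and updates that do not invalidate any mapping cause no flush at all, which is how a counter can reduce the number of flushes compared with flushing per dependency.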
2022-03-25 | drm/amdgpu: simplify VM update tracking a bit | Christian König | 4 | -37/+16
Store the 64bit sequence directly. Makes it simpler to use and saves a bit of fence reference counting overhead. Signed-off-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: separate VM PT handling into amdgpu_vm_pt.c | Christian König | 4 | -944/+1006
Separate the VM page table backend operations from the state machine since the amdgpu_vm.c file is becoming too complex. The allocating, freeing and updating of page tables and page directories can easily be moved into a separate file. While at it, clean up everything checkpatch.pl reported and rename the functions a bit to make it clearer that they belong together. Signed-off-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: move VM PDEs to idle after update | Christian König | 1 | -32/+10
Move the page tables to the idle list after updating the PDEs. We have gone back and forth with that a couple of times because of problems with the inter PD dependencies, but it should work now that we have the state handling cleanly separated. Signed-off-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: drop redundant check of harvest info | Guchun Chen | 1 | -7/+0
Harvest bit setting in IP data structure promises this, so no need to set it explicitly. Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: Fix spelling mistake "regiser" -> "register" | Colin Ian King | 1 | -1/+1
There is a spelling mistake in a dev_error error message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2) | Tao Zhou | 4 | -0/+24
Add helper functions to query and reset RAS UTCL2 poison status. v2: implement it on amdgpu side and kfd only calls it. Signed-off-by: Tao Zhou <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: make amdgpu_display_gem_fb_verify_and_init() static | Alex Deucher | 2 | -9/+5
Unused outside of amdgpu_display.c. Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: drop amdgpu_display_gem_fb_init() | Alex Deucher | 2 | -29/+0
Unused. Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: make amdgpu_display_framebuffer_init() static | Alex Deucher | 2 | -8/+9
It's not used outside of amdgpu_display.c. Reviewed-by: Guchun Chen <[email protected]> Reviewed-by: Aurabindo Pillai <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu/gfx10: enable gfx1037 clock counter retrieval function | Prike Liang | 1 | -0/+1
Enable gfx1037 clock counter retrieval function for KFDPerfCountersTest.ClockCountersBasicTest. Signed-off-by: Prike Liang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: set noretry for gfx 10.3.7 | Prike Liang | 1 | -0/+1
Disable xnack on the gfx10.3.7 for the KFD test. Signed-off-by: Prike Liang <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: set noretry=1 for GFX 10.3.4 | Felix Kuehling | 1 | -2/+3
Retry faults are not supported on GFX 10.3.4. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Guchun Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: set noretry=1 for gc 10.3.6 | Yifan Zhang | 1 | -0/+1
Set noretry=1 for gc 10.3.6. Signed-off-by: Yifan Zhang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: add more cases to noretry=1 | Alex Deucher | 1 | -0/+3
Port current list from amd-staging-drm-next. Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu/vcn: improve vcn dpg stop procedure | Tianci Yin | 1 | -0/+3
Prior to disabling dpg, VCN needs to unpause dpg mode; otherwise VCN will hang when resuming from S3. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Tianci Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdkfd: Fix Incorrect VMIDs passed to HWS | Tushar Patel | 1 | -1/+1
Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS v2: squash in warning fix (Alex) Signed-off-by: Tushar Patel <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu/vcn: Fix the register setting for vcn1 | Emily Deng | 1 | -2/+2
Correct the code error when setting register UVD_GFX10_ADDR_CONFIG. inst_idx must be used; otherwise only VCN0 is set. Signed-off-by: Emily Deng <[email protected]> Reviewed-by: James Zhu <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-03-25 | drm/amdgpu: add workarounds for VCN TMZ issue on CHIP_RAVEN | Lang Yu | 1 | -0/+71
Due to a hardware issue, VCN can't handle a TMZ buffer backed by GTT on CHIP_RAVEN series ASICs. Move such a TMZ buffer to the VRAM domain before command submission as a workaround. v2: Use the patch_cs_in_place callback. v3: Bail out early for unsecure IBs. Suggested-by: Christian König <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-03-25 | drm/amdgpu/gmc: use PCI BARs for APUs in passthrough | Alex Deucher | 5 | -7/+8
If the GPU is passed through to a guest VM, use the PCI BAR for CPU FB access rather than the physical address of the carve out. The physical address is not valid in a guest. v2: Fix HDP handling as suggested by Michel Reviewed-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: fix off by one in amdgpu_gfx_kiq_acquire() | Dan Carpenter | 1 | -1/+1
This post-op should be a pre-op so that we do not pass -1 as the bit number to test_bit(). The current code will loop downwards from 63 to -1. After changing to a pre-op, it loops from 63 to 0. Fixes: 71c37505e7ea ("drm/amdgpu/gfx: move more common KIQ code to amdgpu_gfx.c") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
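The difference between the post-op and pre-op loop conditions can be shown with a standalone sketch. It assumes a 64-entry bitmap, matching the "63 to 0" range quoted above; `toy_test_bit` and the other function names are hypothetical and only mimic the shape of the kernel's `test_bit()` usage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_QUEUES 64 /* assumption: matches the 63..0 range in the commit message */

/* Toy stand-in for test_bit(); it rejects out-of-range bit numbers. */
static bool toy_test_bit(int bit, uint64_t bitmap)
{
    assert(bit >= 0 && bit < NUM_QUEUES);
    return (bitmap >> bit) & 1u;
}

/* Post-decrement condition: returns the lowest index the loop body sees. */
int lowest_index_post_op(void)
{
    int bit = NUM_QUEUES;
    int lowest = NUM_QUEUES;

    while (bit-- >= 0)  /* compares first, then decrements */
        lowest = bit;   /* body sees 63, 62, ..., 0, and finally -1 */
    return lowest;
}

/* Pre-decrement condition: returns the lowest index the loop body sees. */
int lowest_index_pre_op(void)
{
    int bit = NUM_QUEUES;
    int lowest = NUM_QUEUES;

    while (--bit >= 0)  /* decrements first, then compares */
        lowest = bit;   /* body sees 63, 62, ..., 0 */
    return lowest;
}

/* Search from the top for a clear bit using the corrected loop; -1 if none. */
int find_free_queue(uint64_t used_bitmap)
{
    int bit = NUM_QUEUES;

    while (--bit >= 0) {
        if (toy_test_bit(bit, used_bitmap))
            continue;
        return bit;
    }
    return -1;
}
```

`lowest_index_post_op()` returns -1 while `lowest_index_pre_op()` returns 0, so only the pre-decrement form keeps every bit number handed to the bit test inside the valid range.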
2022-03-25 | drm/amdgpu: conduct a proper cleanup of PDB bo | Guchun Chen | 1 | -1/+1
Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to perform a proper cleanup of PDB bo. v2: update subject to be more accurate Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25 | drm/amdgpu: prevent memory wipe in suspend/shutdown stage | Guchun Chen | 1 | -1/+3
On GPUs with RAS enabled, the error below is observed when suspending or shutting down the device. The cause is that the memory wipe flag is enabled by default for BOs on such GPUs, so these BOs are wiped by amdgpu_fill_buffer. Because the ring is already off, it fails to clean up the memory and throws this error message. So add a suspend/shutdown check before wiping memory. [drm:amdgpu_fill_buffer [amdgpu]] *ERROR* Trying to clear memory with ring turned off. v2: fix coding style issue Fixes: fc6ea4bee13071 ("drm/amdgpu: Wipe all VRAM on free when RAS is enabled") Signed-off-by: Guchun Chen <[email protected]> Reviewed-by: Christian König <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-22 | drm/amd: Add USBC connector ID | Aurabindo Pillai | 1 | -0/+1
[Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output Signed-off-by: Aurabindo Pillai <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: Use drm_mode_copy() | Ville Syrjälä | 1 | -2/+2
struct drm_display_mode embeds a list head, so overwriting the full struct with another one will corrupt the list (if the destination mode is on a list). Use drm_mode_copy() instead which explicitly preserves the list head of the destination mode. Even if we know the destination mode is not on any list using drm_mode_copy() seems decent as it sets a good example. Bad examples of not using it might eventually get copied into code where preserving the list head actually matters. Obviously one case not covered here is when the mode itself is embedded in a larger structure and the whole structure is copied. But if we are careful when copying into modes embedded in structures I think we can be a little more reassured that bogus list heads haven't been propagated in.

@is_mode_copy@
@@
drm_mode_copy(...)
{
...
}

@depends on !is_mode_copy@
struct drm_display_mode *mode;
expression E, S;
@@
(
- *mode = E
+ drm_mode_copy(mode, &E)
|
- memcpy(mode, E, S)
+ drm_mode_copy(mode, E)
)

@depends on !is_mode_copy@
struct drm_display_mode mode;
expression E;
@@
(
- mode = E
+ drm_mode_copy(&mode, &E)
|
- memcpy(&mode, E, S)
+ drm_mode_copy(&mode, E)
)

@@
struct drm_display_mode *mode;
@@
- &*mode
+ mode

Cc: Alex Deucher <[email protected]> Cc: Harry Wentland <[email protected]> Cc: Leo Li <[email protected]> Cc: Rodrigo Siqueira <[email protected]> Cc: [email protected] Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
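Why a plain struct assignment corrupts list membership, and how a copy helper can avoid it, can be sketched in userspace C. The types below are hypothetical, heavily reduced stand-ins for the kernel's list head and display mode structures, and `toy_mode_copy` only illustrates the "preserve the destination's list head" behavior that the commit message attributes to drm_mode_copy().

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins; not the real kernel definitions. */
struct toy_list_head {
    struct toy_list_head *next, *prev;
};

struct toy_mode {
    struct toy_list_head head; /* list linkage that must survive a copy */
    int hdisplay;
    int vdisplay;
};

/* Copy the payload of src into dst while keeping dst's list linkage. */
void toy_mode_copy(struct toy_mode *dst, const struct toy_mode *src)
{
    struct toy_list_head saved = dst->head;

    *dst = *src;       /* overwrites everything, including the linkage */
    dst->head = saved; /* restore the destination's own linkage */
}

/* True if the node is still the only entry linked to the given list anchor. */
bool toy_mode_linked_to(const struct toy_mode *mode,
                        const struct toy_list_head *list)
{
    return mode->head.next == list && mode->head.prev == list;
}
```

After `toy_mode_copy()` the destination carries the new timings and is still linked to its list, whereas `dst = src` also copies the source's `next` and `prev` pointers and silently detaches the destination from the list it was on.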
2022-03-15 | drm/amdgpu: Use ternary operator in `vcn_v1_0_start()` | Paul Menzel | 1 | -7/+2
Remove the boilerplate of declaring a variable and using an if else statement by using the ternary operator. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Paul Menzel <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: fix typos in comments | Julia Lawall | 1 | -2/+2
Various spelling mistakes in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: Add stolen reserved memory for MI25 SRIOV. | Yongqiang Sun | 1 | -0/+9
MI25 SRIOV guest driver loading failed because allocated memory overlaps with the firmware reserved area. Allocate stolen reserved memory for MI25 SRIOV specifically to avoid the memory overlap. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yongqiang Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: Merge get_reserved_allocation to get_vbios_allocations. | Yongqiang Sun | 3 | -21/+13
Some ASICs need reserved memory for firmware or other components, which the driver is not allowed to use. amdgpu_gmc_get_reserved_allocation handles these additional areas. To avoid missing a call, merge amdgpu_gmc_get_reserved_allocation into amdgpu_gmc_get_vbios_allocations. Acked-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Yongqiang Sun <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu/vcn: fix vcn ring test failure in igt reload test | Tianci Yin | 1 | -0/+2
[why] On Renoir, vcn ring test failed on the second time insmod in the reload test. After investigation, it proves that vcn only can disable dpg under dpg unpause mode (dpg unpause mode is default for dec only, dpg pause mode is for dec/enc). [how] unpause dpg in dpg stopping procedure. Reviewed-by: James Zhu <[email protected]> Signed-off-by: Tianci Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: only allow secure submission on rings which support that | Lang Yu | 13 | -2/+19
Only GFX ring, SDMA ring and VCN decode ring support secure submission at the moment. Suggested-by: Christian König <[email protected]> Signed-off-by: Lang Yu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: fixed the warnings reported by kernel test robot | yipechai | 3 | -62/+2
The reported warnings are as follows: 1. warning: no-previous-prototype-for-amdgpu_hdp_ras_fini. 2. warning: no-previous-prototype-for-amdgpu_mmhub_ras_fini. amdgpu_hdp_ras_fini and amdgpu_mmhub_ras_fini are unused in the code, and they are the only functions in amdgpu_hdp.c and amdgpu_mmhub.c. After removing these two functions, both amdgpu_hdp.c and amdgpu_mmhub.c are empty, so these two files can be deleted to fix the warnings. Signed-off-by: yipechai <[email protected]> Reviewed-by: Tao Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amdgpu: Move reset domain init before calling RREG32 | Philip Yang | 1 | -9/+9
amdgpu_detect_virtualization reads a register, and amdgpu_device_rreg accesses adev->reset_domain->sem if the kernel is built with CONFIG_LOCKDEP; below is the random boot hang backtrace on Vega10. A random NULL pointer access backtrace may also occur if amdgpu_sriov_runtime is true. Move amdgpu_reset_create_reset_domain before the call to RREG32.

BUG: kernel NULL pointer dereference, address:
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
Workqueue: events work_for_cpu_fn
RIP: 0010:down_read_trylock+0x13/0xf0
Call Trace:
 <TASK>
 amdgpu_device_skip_hw_access+0x38/0x80 [amdgpu]
 amdgpu_device_rreg+0x1b/0x170 [amdgpu]
 amdgpu_detect_virtualization+0x73/0x100 [amdgpu]
 amdgpu_device_init.cold.60+0xbe/0x16b1 [amdgpu]
 ? pci_bus_read_config_word+0x43/0x70
 amdgpu_driver_load_kms+0x15/0x120 [amdgpu]
 amdgpu_pci_probe+0x1a1/0x3a0 [amdgpu]

Fixes: d0fb18b535679a ("drm/amdgpu: Move reset sem into reset_domain") Signed-off-by: Philip Yang <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-15 | drm/amd: fix gfx hang on renoir in IGT reload test | Tianci.Yin | 1 | -0/+4
[why] CP hangs in the igt reload test on renoir; more precisely, it hangs on the second insmod. [how] A mode2 reset can make it recover, and a mode2 reset only affects the gfx core; dcn and the screen will not be impacted. Acked-by: Alex Deucher <[email protected]> Signed-off-by: Tianci.Yin <[email protected]> Signed-off-by: Alex Deucher <[email protected]>