path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
Age | Commit message | Author | Files | Lines
2019-03-20 | drm/amdgpu: use more entries for the first paging queue | Christian König | 1 | -0/+2
To aid recoverable page faults. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-12-07 | drm/amdgpu: Skip ring soft recovery when fence was NULL | wentalou | 1 | -1/+1
amdgpu_ring_soft_recovery would have Call-Trace, when s_fence->parent was NULL inside amdgpu_job_timedout. Check fence first, as drm_sched_hw_job_reset did. Signed-off-by: Wentao Lou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-11-05 | drm/amdgpu: further ring test cleanups | Christian König | 1 | -1/+7
Move all error messages from IP specific code into the common helper. This way we now use the ring name in the messages instead of the index and note which device is affected as well. Also clean up error handling in the IP specific code and consequently use ETIMEDOUT when the ring test timed out. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-11-05 | drm/amdgpu: Retire amdgpu_ring.ready flag v4 | Andrey Grodzovsky | 1 | -1/+21
Start using drm_gpu_scheduler.ready instead. v3: Add helper function to run ring test and set sched.ready flag status accordingly, clean explicit sched.ready sets from the IP specific files. v4: Add kerneldoc and rebase. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27 | drm/amdgpu: add ring soft recovery v4 | Christian König | 1 | -0/+25
Instead of hammering hard on the GPU try a soft recovery first. v2: reorder code a bit v3: increase timeout to 10ms, increment GPU reset counter v4: squash in compile fix (Christian) Signed-off-by: Christian König <[email protected]> Reviewed-by: Huang Rui <[email protected]>
2018-08-27 | drm/amdgpu: remove ring lru handling | Christian König | 1 | -98/+0
Not needed any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-07-18 | drm/amdgpu: allow for more flexible priority handling | Christian König | 1 | -1/+2
Allow calling amdgpu_ring_priority_get() after pushing the ring to the scheduler. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-06-15 | drm/amdgpu: define and add extra dword for jpeg ring | Boyuan Zhang | 1 | -1/+1
Define extra dword for jpeg ring. Jpeg ring will allocate extra dword to store the patch commands for fixing the known issue. v2: dropping extra_dw for rings other than jpeg. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-18 | drm/amdgpu/vg20:Restruct uvd.inst to support multiple instances | James Zhu | 1 | -0/+1
Vega20 has dual-UVD. Need to add multiple instance support for uvd. Restructure uvd.inst, using uvd.inst[0] to replace uvd.inst->. Repurpose amdgpu_ring::me for instance index, and initialize to 0. There are no logical changes here. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-15 | drm/amdgpu: add emit_reg_write_reg_wait ring callback | Alex Deucher | 1 | -0/+20
This callback writes a value to a register and then reads back another register and waits for a value in a single operation. Provide a helper function using two operations for engines that don't support this operation. Reviewed-by: Huang Rui <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
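The fallback described in this commit can be sketched roughly as below. This is a simplified userspace model, not the driver code: `struct ring`, `struct ring_funcs`, the `packets_emitted` counter and the `demo_*` callbacks are hypothetical stand-ins, and the exact arguments the real helper passes to the wait are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ring;

/* Hypothetical, simplified per-engine callback table. */
struct ring_funcs {
    void (*emit_wreg)(struct ring *ring, uint32_t reg, uint32_t val);
    void (*emit_reg_wait)(struct ring *ring, uint32_t reg,
                          uint32_t val, uint32_t mask);
    /* Optional: only engines with a fused write-then-wait packet set this. */
    void (*emit_reg_write_reg_wait)(struct ring *ring, uint32_t reg0,
                                    uint32_t reg1, uint32_t ref,
                                    uint32_t mask);
};

struct ring {
    const struct ring_funcs *funcs;
    unsigned int packets_emitted; /* demo bookkeeping only */
};

/* Demo callbacks: a real engine would write packets into the ring buffer. */
static void demo_emit_wreg(struct ring *ring, uint32_t reg, uint32_t val)
{
    (void)reg; (void)val;
    ring->packets_emitted++;
}

static void demo_emit_reg_wait(struct ring *ring, uint32_t reg,
                               uint32_t val, uint32_t mask)
{
    (void)reg; (void)val; (void)mask;
    ring->packets_emitted++;
}

/* Generic helper: emulate the fused operation with two separate packets. */
static void ring_emit_reg_write_reg_wait_helper(struct ring *ring,
                                                uint32_t reg0, uint32_t reg1,
                                                uint32_t ref, uint32_t mask)
{
    assert(ring->funcs->emit_wreg && ring->funcs->emit_reg_wait);
    ring->funcs->emit_wreg(ring, reg0, ref);
    ring->funcs->emit_reg_wait(ring, reg1, ref, mask);
}

/* Dispatch: use the single-packet callback when the engine provides one. */
static void ring_emit_reg_write_reg_wait(struct ring *ring,
                                         uint32_t reg0, uint32_t reg1,
                                         uint32_t ref, uint32_t mask)
{
    if (ring->funcs->emit_reg_write_reg_wait)
        ring->funcs->emit_reg_write_reg_wait(ring, reg0, reg1, ref, mask);
    else
        ring_emit_reg_write_reg_wait_helper(ring, reg0, reg1, ref, mask);
}
```

With a callback table that leaves `emit_reg_write_reg_wait` unset, one call to `ring_emit_reg_write_reg_wait()` emits two packets in this model, which is the cost the fused callback is meant to avoid.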
2018-03-01 | drm/amd/amdgpu: Mask rptr as well in ring debugfs | Tom St Denis | 1 | -1/+1
The read/write pointers on sdma4 devices increment beyond the ring size and should be masked. Tested on my Ryzen 2400G. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-19 | drm/amdgpu: cache the fence to wait for a VMID | Christian König | 1 | -0/+3
Beneficial when a lot of processes are waiting for VMIDs. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-18 | drm/amdgpu: rename amdgpu_wb_* functions | Alex Deucher | 1 | -8/+8
add device for consistency. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-07 | drm: move amd_gpu_scheduler into common location | Lucas Stach | 1 | -7/+7
This moves and renames the AMDGPU scheduler to a common location in DRM in order to facilitate re-use by other drivers. This is mostly a straight forward rename with no code changes. One notable exception is the function to_drm_sched_fence(), which is no longer a inline header function to avoid the need to export the drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures. Reviewed-by: Chunming Zhou <[email protected]> Tested-by: Dieter Nützel <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-11-08 | drm/amdgpu: bypass lru touch for KIQ ring submission | Pixel Ding | 1 | -1/+2
KIQ ring submission is used for register accessing on SRIOV VF that could happen both in irq enabled and irq disabled cases. Inversion lock could happen on adev->ring_lru_list_lock, while this operation is useless and just adds overhead in this use case. Signed-off-by: Pixel Ding <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-10-09 | drm/amdgpu: add framework for HW specific priority settings v9 | Andres Rodriguez | 1 | -1/+75
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes. v2: set priority before emitting IB, and take a ref on amdgpu_job v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb v5: use atomic for tracking job priorities instead of last_job v6: rename amdgpu_ring_priority_[get/put]() and align parameters v7: replace spinlocks with mutexes for KIQ compatibility v8: raise ring priority during cs_ioctl, instead of job_run v9: priority_get() before push_job() Reviewed-by: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
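The decay rule described in this commit can be illustrated with a small model. This is a sketch under assumptions: the priority levels, `struct ring_prio` and the function names are hypothetical, and it is single-threaded, whereas the changelog mentions atomics (v5) and mutexes (v7) in the driver.

```c
#include <assert.h>

/* Hypothetical priority levels, lowest first. */
enum prio { PRIO_LOW, PRIO_NORMAL, PRIO_HIGH, PRIO_VERY_HIGH, PRIO_COUNT };

struct ring_prio {
    unsigned int num_jobs[PRIO_COUNT]; /* outstanding requests per level */
    enum prio active;                  /* priority currently programmed */
};

static void ring_prio_init(struct ring_prio *rp)
{
    for (int i = 0; i < PRIO_COUNT; i++)
        rp->num_jobs[i] = 0;
    rp->active = PRIO_NORMAL;
}

/* A job requests priority p for its lifetime; only ever raises. */
static void ring_prio_get(struct ring_prio *rp, enum prio p)
{
    rp->num_jobs[p]++;
    if (p > rp->active)
        rp->active = p;
}

/* The job finished: drop its request and decay if it was the last one. */
static void ring_prio_put(struct ring_prio *rp, enum prio p)
{
    assert(rp->num_jobs[p] > 0);
    if (--rp->num_jobs[p] > 0)
        return; /* other jobs still hold this level */
    if (p < rp->active)
        return; /* a higher request is still valid */

    /* Decay to the next lower level that still has a request,
     * stopping at the default level at the latest. */
    for (int i = p; i >= PRIO_LOW; i--) {
        if (i == PRIO_NORMAL || rp->num_jobs[i] > 0) {
            rp->active = (enum prio)i;
            break;
        }
    }
}
```

For example, after requests at `PRIO_HIGH` and `PRIO_VERY_HIGH`, releasing the very-high request leaves the ring at `PRIO_HIGH`, and releasing the high one returns it to `PRIO_NORMAL`.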
2017-09-28 | drm/amdgpu: map compute rings by least recently used pipe | Andres Rodriguez | 1 | -5/+20
This patch provides a guarantee that the first n queues allocated by an application will be on different pipes. Where n is the number of pipes available from the hardware. This helps avoid ring aliasing which can result in work executing in time-sliced mode instead of truly parallel mode. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-29 | drm/amdgpu: set sched_hw_submission higher for KIQ (v3) | Alex Deucher | 1 | -4/+12
KIQ doesn't really use the GPU scheduler. The base drivers generally use the KIQ ring directly rather than submitting IBs. However, amdgpu_sched_hw_submission (which defaults to 2) limits the number of outstanding fences to 2. KFD uses the KIQ for TLB flushes and the 2 fence limit hurts performance when there are several KFD processes running. v2: move some expressions to one line change KIQ sched_hw_submission to at least 16 v3: bump to 256 Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15 | drm/amdgpu: don't finish the ring if not initialized | Trigger Huang | 1 | -0/+4
If a ring is not initialized, it also should not be finished. For example, in Vega10's SR-IOV environment, UVD's decode ring is not initialized, but will be finished in amdgpu_uvd_sw_fini, because the UVD driver puts all of the uvd decode ring's finish operations into the amdgpu_uvd_sw_fini function rather than uvd_vXXX_0_sw_fini. This will lead to amdgpu module unloading failure. Signed-off-by: Trigger Huang <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15 | drm/amdgpu: use 256 bit buffers for all wb allocations (v2) | Alex Deucher | 1 | -49/+16
May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15 | drm/amdgpu: make wb 256bit function names consistent | Alex Deucher | 1 | -1/+1
Use a lower case b to be consistent with the other wb functions. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25 | drm/amdgpu:fix gfx fence allocate size | Monk Liu | 1 | -8/+18
1. For SR-IOV, we need 8dw for the gfx fence due to CP behaviour. 2. Clean up wrong logic in wptr/rptr wb alloc and free. Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: Monk Liu <[email protected]> Signed-off-by: Xiangliang Yu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-06-01 | drm/amdgpu: Move compute vm bug logic to amdgpu_vm.c | Alex Xie | 1 | -32/+0
In review, Christian would like to keep the logic inside amdgpu_vm.c at the cost of being slightly slower. The loop is still optimized out with this patch. v2: remove the if statement. Now it is not slower. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31 | drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3 | Andres Rodriguez | 1 | -7/+26
Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31 | drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4 | Andres Rodriguez | 1 | -0/+63
Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
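The LRU selection described in this commit might look like the following in a minimal model. It is an illustration only: the commit mentions a list protected by `ring_lru_list_lock`, while this sketch uses a flat array with a logical clock and no locking, and `struct hw_ring`, `struct ring_lru` and `ring_lru_get()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum ring_type { RING_GFX, RING_COMPUTE };

/* Hypothetical, simplified hardware ring descriptor. */
struct hw_ring {
    enum ring_type type;
    uint64_t last_use; /* logical timestamp of the last hand-out */
};

struct ring_lru {
    struct hw_ring *rings;
    size_t count;
    uint64_t clock; /* incremented on every hand-out */
};

/* Return the least recently used ring of the requested type and mark it
 * as most recently used. Returns NULL if no ring of that type exists. */
static struct hw_ring *ring_lru_get(struct ring_lru *lru, enum ring_type type)
{
    struct hw_ring *best = NULL;

    assert(lru && lru->rings);
    for (size_t i = 0; i < lru->count; i++) {
        struct hw_ring *r = &lru->rings[i];

        if (r->type != type)
            continue;
        if (!best || r->last_use < best->last_use)
            best = r;
    }
    if (best)
        best->last_use = ++lru->clock;
    return best;
}
```

With two compute rings, successive requests alternate between them instead of always landing on the first one, which is the distribution effect the commit message describes.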
2017-05-31 | drm/amdgpu: Optimize a function called by every IB sheduling | Alex Xie | 1 | -0/+33
Move several if statements and a loop statement from run time to initialization time. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29 | drm/amd/amdgpu: Correct ring wptr address in debugfs (v2) | Tom St Denis | 1 | -2/+2
On gfx9 hardware the value is not wrapped and is a 64-bit value. So we reduce it modulo the ring size. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> (v2) use buf_mask instead of computing on the fly Signed-off-by: Alex Deucher <[email protected]>
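Reducing an unwrapped 64-bit pointer to a ring offset with a mask can be sketched as follows. The sketch assumes the ring size in dwords is a power of two, so that the mask is `size - 1`; the function names are illustrative and not the driver's.

```c
#include <assert.h>
#include <stdint.h>

/* Compute the buffer mask for a ring of ring_size_dw dwords.
 * Assumption: the size is a non-zero power of two. */
static uint32_t ring_buf_mask(uint32_t ring_size_dw)
{
    assert(ring_size_dw != 0 && (ring_size_dw & (ring_size_dw - 1)) == 0);
    return ring_size_dw - 1;
}

/* Map a monotonically increasing 64-bit read/write pointer to a dword
 * offset inside the ring buffer. Equivalent to ptr % ring_size_dw. */
static uint32_t ring_offset_dw(uint64_t ptr, uint32_t buf_mask)
{
    return (uint32_t)(ptr & buf_mask);
}
```

Computing the mask once and reusing it matches the v2 note about using buf_mask instead of computing the value on the fly.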
2017-03-29 | drm/amdgpu:fix ring init sequence | Monk Liu | 1 | -3/+3
ring->buf_mask needs to be set before ring_clear_ring is invoked. Also fix ring_clear_ring, which should use buf_mask instead of ptr_mask. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29 | drm/amdgpu: add 64bit wb functions | Ken Wang | 1 | -13/+37
Newer asics need 64 bit writeback slots. Reviewed-by: Christian König <[email protected]> Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29 | drm/amdgpu: change wptr to 64 bits (v2) | Ken Wang | 1 | -1/+4
Newer asics need 64 bit wptrs. If the wptr is now smaller than the rptr that doesn't indicate a wrap-around anymore. v2: integrate Christian's comments. Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29 | drm/amdgpu:use clear_ring to clr RB | Monk Liu | 1 | -1/+1
In the resume routine, we need to clear the RB prior to the ring test of the engine, otherwise some engine hang duplicated during GPU reset. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-01-27 | drm/amdgpu:set cond_exec polling value to 1 in ring_init | Monk Liu | 1 | -1/+3
no need to set it per ib_schedule(), hw won't override this polling address. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-12-16 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds | 1 | -1/+1
Pull vfs updates from Al Viro:
- more ->d_init() stuff (work.dcache)
- pathname resolution cleanups (work.namei)
- a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter)
- several assorted patches, the big one being logfs removal
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  logfs: remove from tree
  vfs: fix put_compat_statfs64() does not handle errors
  namei: fold should_follow_link() with the step into not-followed link
  namei: pass both WALK_GET and WALK_MORE to should_follow_link()
  namei: invert WALK_PUT logics
  namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link()
  namei: saner calling conventions for mountpoint_last()
  namei.c: get rid of user_path_parent()
  switch getfrag callbacks to ..._full() primitives
  make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success
  [iov_iter] new primitives - copy_from_iter_full() and friends
  don't open-code file_inode()
  ceph: switch to use of ->d_init()
  ceph: unify dentry_operations instances
  lustre: switch to use of ->d_init()
2016-12-04 | don't open-code file_inode() | Al Viro | 1 | -1/+1
Signed-off-by: Al Viro <[email protected]>
2016-10-25 | drm/amdgpu: move align_mask and nop into ring funcs as well (v2) | Christian König | 1 | -10/+9
They are constant as well. v2: update uvd and vce phys ring structures as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-10-25 | drm/amdgpu: move the ring type into the funcs structure (v2) | Christian König | 1 | -3/+1
It's constant, so it doesn't make too much sense to keep it with the variable data. v2: update vce and uvd phys mode ring structures as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-10-13 | drm/amdgpu: potential NULL dereference in debugfs code | Dan Carpenter | 1 | -2/+2
debugfs_create_file() returns NULL on error, it only returns error pointers if debugfs isn't enabled in the config and we checked for that earlier so it can't happen. Fixes: 4f4824b55650 ('drm/amd/amdgpu: Convert ring debugfs entries to binary') Reviewed-by: Christian König <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-09-27 | drm/amdgpu: clear ring pointer in amdgpu_device on teardown | Grazvydas Ignotas | 1 | -0/+2
This is in symmetry to setup done in amdgpu_ring_init. Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-09-14 | drm/amdgpu: free the BO in kernel by helper amdgpu_bo_free_kernel() | Junwei Zhang | 1 | -15/+4
Signed-off-by: Junwei Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-08-08 | drm/amdgpu: use amdgpu_bo_create_kernel in amdgpu_ring.c | Christian König | 1 | -22/+5
Saves us quite a bunch of code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-29 | drm/amdgpu: add begin/end_use ring callbacks | Christian König | 1 | -0/+10
For manual UVD/VCE power and clock gating. Signed-off-by: Christian König <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-29 | drm/amdgpu: remove fence_lock | Christian König | 1 | -1/+0
Was never used as far as I can see. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07 | drm/amdgpu: remove more of the ring backup code | Alex Deucher | 1 | -9/+0
Not used anymore. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07 | drm/amdgpu: clean up ring_backup code, no need more | Chunming Zhou | 1 | -72/+0
Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07 | drm/amdgpu: fix ring debugfs bug | Monk Liu | 1 | -0/+10
debugfs file added but not released after driver unloaded Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07 | drm/amd/amdgpu: ring debugfs is read in increments of 4 bytes | Tom St Denis | 1 | -1/+1
If a user tries to read a non-multiple of 4 bytes it would have read until the end of the ring potentially crashing the user task. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
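One way to enforce dword-granular reads is to reject any request whose offset or length is not a multiple of 4. The sketch below models that in userspace; `ring_dump_read()` and its parameters are hypothetical, and whether the driver rejects or rounds such requests is not stated in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy up to `size` bytes of ring contents starting at byte offset *pos.
 * Returns the number of bytes copied, 0 at end of ring, or -EINVAL when
 * the offset or length is not dword aligned. */
static long ring_dump_read(const uint32_t *ring, size_t ring_dw,
                           void *buf, size_t size, size_t *pos)
{
    size_t total = ring_dw * sizeof(uint32_t);

    assert(ring && buf && pos);
    if ((*pos & 3) || (size & 3))
        return -EINVAL; /* only whole dwords may be read */
    if (*pos >= total)
        return 0;       /* nothing left to read */
    if (size > total - *pos)
        size = total - *pos; /* clamp to the end of the ring */

    memcpy(buf, (const unsigned char *)ring + *pos, size);
    *pos += size;
    return (long)size;
}
```

Clamping against the remaining byte count keeps a large request from running past the end of the buffer, and the alignment check guarantees the loop in a caller always advances by whole dwords.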
2016-07-07 | drm/amd/amdgpu: Convert ring debugfs entries to binary | Tom St Denis | 1 | -65/+62
They now emit ring data in binary which will be read/written by the userspace tool umr shortly. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07 | drm/amdgpu: clear RB at ring init | Monk Liu | 1 | -0/+3
This helps fix the driver reload hang issue of the SDMA ring. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-06-09 | drm/amdgpu: fix missing free wb for cond_exec | Monk Liu | 1 | -0/+1
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-04 | drm/amdgpu: fix the coding style in amdgpu_ring.c | Christian König | 1 | -2/+3
No functional change. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>