aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
AgeCommit message (Collapse)AuthorFilesLines
2017-11-08drm/amdgpu: bypass lru touch for KIQ ring submissionPixel Ding1-1/+2
KIQ ring submission is used for register accessing on SRIOV VF that could happen both in irq enabled and irq disabled cases. Inversion lock could happen on adev->ring_lru_list_lock, while this operation is useless and just adds overhead in this use case. Signed-off-by: Pixel Ding <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-10-09drm/amdgpu: add framework for HW specific priority settings v9Andres Rodriguez1-1/+75
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes. v2: set priority before emitting IB, and take a ref on amdgpu_job v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb v5: use atomic for tracking job priorities instead of last_job v6: rename amdgpu_ring_priority_[get/put]() and align parameters v7: replace spinlocks with mutexes for KIQ compatibility v8: raise ring priority during cs_ioctl, instead of job_run v9: priority_get() before push_job() Reviewed-by: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-09-28drm/amdgpu: map compute rings by least recently used pipeAndres Rodriguez1-5/+20
This patch provides a guarantee that the first n queues allocated by an application will be on different pipes. Where n is the number of pipes available from the hardware. This helps avoid ring aliasing which can result in work executing in time-sliced mode instead of truly parallel mode. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-29drm/amdgpu: set sched_hw_submission higher for KIQ (v3)Alex Deucher1-4/+12
KIQ doesn't really use the GPU scheduler. The base drivers generally use the KIQ ring directly rather than submitting IBs. However, amdgpu_sched_hw_submission (which defaults to 2) limits the number of outstanding fences to 2. KFD uses the KIQ for TLB flushes and the 2 fence limit hurts performance when there are several KFD processes running. v2: move some expressions to one line change KIQ sched_hw_submission to at least 16 v3: bump to 256 Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: don't finish the ring if not initializedTrigger Huang1-0/+4
If a ring is not initialized, it also should not be finished. For example, in Vega10's SR-IOV environment, UVD's decode ring is not initialized, but will be finnished in amdgpu_uvd_sw_fini, because UVD driver put all the uvd decode ring's finish operation into amdgpu_uvd_sw_fini function, while not uvd_vXXX_0_sw_fini. This will lead to amdgpu module unloading failure. Signed-off-by: Trigger Huang <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: use 256 bit buffers for all wb allocations (v2)Alex Deucher1-49/+16
May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: make wb 256bit function names consistentAlex Deucher1-1/+1
Use a lower case b to be consistent with the other wb functions. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25drm/amdgpu:fix gfx fence allocate sizeMonk Liu1-8/+18
1, for sriov, we need 8dw for the gfx fence due to CP behaviour 2, cleanup wrong logic in wptr/rptr wb alloc and free Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: Monk Liu <[email protected]> Signed-off-by: Xiangliang Yu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-06-01drm/amdgpu: Move compute vm bug logic to amdgpu_vm.cAlex Xie1-32/+0
In review, Christian would like to keep the logic inside amdgpu_vm.c with a cost of slightly slower. The loop is still optimized out with this patch. v2: remove the if statement. Now it is not slower. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3Andres Rodriguez1-7/+26
Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4Andres Rodriguez1-0/+63
Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: Optimize a function called by every IB shedulingAlex Xie1-0/+33
Move several if statements and a loop statment from run time to initialization time. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amd/amdgpu: Correct ring wptr address in debugfs (v2)Tom St Denis1-2/+2
On gfx9 hardware the value is not wrapped and is a 64-bit value. So we reduce it modulo the ring size. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> (v2) use buf_mask instead of computing on the fly Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:fix ring init sequenceMonk Liu1-3/+3
ring->buf_mask need be set prior to ring_clear_ring invoke and fix ring_clear_ring as well which should use buf_mask instead of ptr_mask Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu: add 64bit wb functionsKen Wang1-13/+37
Newer asics need 64 bit writeback slots. Reviewed-by: Christian König <[email protected]> Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu: change wptr to 64 bits (v2)Ken Wang1-1/+4
Newer asics need 64 bit wptrs. If the wptr is now smaller than the rptr that doesn't indicate a wrap-around anymore. v2: integrate Christian's comments. Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:use clear_ring to clr RBMonk Liu1-1/+1
In resume routine, we need clr RB prior to the ring test of engine, otherwise some engine hang duplicated during GPU reset. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-01-27drm/amdgpu:set cond_exec polling value to 1 in ring_initMonk Liu1-1/+3
no need to set it per ib_schedule(), hw won't override this polling address. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-12-16Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: - more ->d_init() stuff (work.dcache) - pathname resolution cleanups (work.namei) - a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter) - several assorted patches, the big one being logfs removal * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: logfs: remove from tree vfs: fix put_compat_statfs64() does not handle errors namei: fold should_follow_link() with the step into not-followed link namei: pass both WALK_GET and WALK_MORE to should_follow_link() namei: invert WALK_PUT logics namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link() namei: saner calling conventions for mountpoint_last() namei.c: get rid of user_path_parent() switch getfrag callbacks to ..._full() primitives make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success [iov_iter] new primitives - copy_from_iter_full() and friends don't open-code file_inode() ceph: switch to use of ->d_init() ceph: unify dentry_operations instances lustre: switch to use of ->d_init()
2016-12-04don't open-code file_inode()Al Viro1-1/+1
Signed-off-by: Al Viro <[email protected]>
2016-10-25drm/amdgpu: move align_mask and nop into ring funcs as well (v2)Christian König1-10/+9
They are constant as well. v2: update uvd and vce phys ring structures as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-10-25drm/amdgpu: move the ring type into the funcs structure (v2)Christian König1-3/+1
It's constant, so it doesn't make to much sense to keep it with the variable data. v2: update vce and uvd phys mode ring structures as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-10-13drm/amdgpu: potential NULL dereference in debugfs codeDan Carpenter1-2/+2
debugfs_create_file() returns NULL on error, it only returns error pointers if debugfs isn't enabled in the config and we checked for that earlier so it can't happen. Fixes: 4f4824b55650 ('drm/amd/amdgpu: Convert ring debugfs entries to binary') Reviewed-by: Christian König <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-09-27drm/amdgpu: clear ring pointer in amdgpu_device on teardownGrazvydas Ignotas1-0/+2
This is in symmetry to setup done in amdgpu_ring_init. Signed-off-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-09-14drm/amdgpu: free the BO in kernel by helper amdgpu_bo_free_kernel()Junwei Zhang1-15/+4
Signed-off-by: Junwei Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-08-08drm/amdgpu: use amdgpu_bo_create_kernel in amdgpu_ring.cChristian König1-22/+5
Saves us quite a bunch of code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-29drm/amdgpu: add begin/end_use ring callbacksChristian König1-0/+10
For manual UVD/VCE power and clock gating. Signed-off-by: Christian König <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-29drm/amdgpu: remove fence_lockChristian König1-1/+0
Was never used as far as I can see. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amdgpu: remove more of the ring backup codeAlex Deucher1-9/+0
Not used anymore. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amdgpu: clean up ring_backup code, no need moreChunming Zhou1-72/+0
Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amdgpu: fix ring debugfs bugMonk Liu1-0/+10
debugfs file added but not released after driver unloaded Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amd/amdgpu: ring debugfs is read in increments of 4 bytesTom St Denis1-1/+1
If a user tries to read a non-multiple of 4 bytes it would have read until the end of the ring potentially crashing the user task. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amd/amdgpu: Convert ring debugfs entries to binaryTom St Denis1-65/+62
They now emit ring data in binary which will be read/written by the userspace tool umr shortly. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-07-07drm/amdgpu: clear RB at ring initMonk Liu1-0/+3
This help fix reloading driver hang issue of SDMA ring. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-06-09drm/amdgpu: fix missing free wb for cond_execMonk Liu1-0/+1
Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-04drm/amdgpu: fix the coding style in amdgpu_ring.cChristian König1-2/+3
No functional change. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-04drm/amdgpu: use the ring name for debugfs (v2)Christian König1-33/+23
Instead of hard coding just another name in the ring code. v2: squash in Tom's rebase fix Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-04drm/amdgpu: use max_dw in ring_initChristian König1-9/+5
Instead of specifying the total ring size calculate that from the maximum number of dw a submission can have and the number of concurrent submissions. This fixes UVD with 8 concurrent submissions or more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-04drm/amdgpu: Mark all instances of struct drm_info_list as constNils Wallménius1-2/+2
All these are compile time constand and the drm_debugfs_create/remove_files functions take a const pointer argument. Reviewed-by: Christian König <[email protected]> Signed-off-by: Nils Wallménius <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-02drm/amdgpu: support cond execMonk Liu1-0/+9
This adds the groundwork for conditional execution on SDMA which is necessary for preemption. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-03-16drm/amdgpu: add number of hardware submissions to amdgpu_fence_driver_init_ringChristian König1-1/+2
Make this a parameter instead of using the global variable directly. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Chunming Zhou <[email protected]>
2016-03-14drm/amdgpu: remove amdgpu_ring_from_fenceChristian König1-24/+0
Not used any more. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-02-10drm/amdgpu: make pad_ib a ring function v3Christian König1-0/+13
The padding depends on the firmware version and we need that for BO moves as well, not only for VM updates. v2: new approach of making pad_ib a ring function v3: fix typo in macro name Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-02-10drm/amdgpu: remove rptr checkingChristian König1-52/+25
With the scheduler enabled we don't need that any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-02-10drm/amdgpu: remove the ring lock v2Christian König1-74/+6
It's not needed any more because all access goes through the scheduler now. v2: Update commit message. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-02-10drm/amdgpu: remove some more semaphore leftoversAlex Deucher1-4/+0
No longer needed since semaphores were removed. Reviewed-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-01-22drm/amdgpu: fix next_rptr handling for debugfsChristian König1-1/+1
That somehow got lost. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Alex Deucher <[email protected]>
2015-10-30drm/amdgpu: move ring_from_fence to common codeChristian König1-0/+24
Going to need that elsewhere as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
2015-10-21drm/amdgpu: remove old lockup detection infrastructureChristian König1-43/+0
It didn't worked to well anyway. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Junwei Zhang <[email protected]>
2015-10-14drm/amdgpu: rework sdma structuresAlex Deucher1-2/+2
Rework the sdma structures in the driver to consolidate all of the sdma info into a single structure and allow for asics that may have different numbers of sdma instances. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>