aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
AgeCommit message (Collapse)AuthorFilesLines
2021-02-09drm/amdgpu: cleanup struct amdgpu_ringNirmoy Das1-6/+2
This patch consist of below related changes: 1 Rename ring->priority to ring->hw_prio. 2 Assign correct hardware ring priority. 3 Remove ring->priority_mutex as ring priority remains unchanged after initialization. 4 Remove unused ring->num_jobs. v3: remove ring->num_jobs. v2: remove ring->priority_mutex. Fixes: 33abcb1f5a17 ("drm/amdgpu: set compute queue priority at mqd_init") Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-11-13drm/amd/amdgpu/amdgpu_ring: Fix misnaming of param 'max_dw'Lee Jones1-1/+1
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:168: warning: Function parameter or member 'max_dw' not described in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:168: warning: Excess function parameter 'max_ndw' description in 'amdgpu_ring_init' Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-11-13drm/amd/amdgpu/amdgpu_ring: Fix a bunch of function misdocumentationLee Jones1-6/+6
Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:63: warning: Excess function parameter 'adev' description in 'amdgpu_ring_alloc' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:122: warning: Excess function parameter 'adev' description in 'amdgpu_ring_commit' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Function parameter or member 'max_dw' not described in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Function parameter or member 'irq_src' not described in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Function parameter or member 'irq_type' not described in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Function parameter or member 'hw_prio' not described in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Excess function parameter 'max_ndw' description in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:167: warning: Excess function parameter 'nop' description in 'amdgpu_ring_init' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:285: warning: Excess function parameter 'adev' description in 'amdgpu_ring_fini' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:325: warning: Function parameter or member 'ring' not described in 'amdgpu_ring_emit_reg_write_reg_wait_helper' drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c:325: warning: Excess function parameter 'adev' description in 'amdgpu_ring_emit_reg_write_reg_wait_helper' Cc: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Signed-off-by: Lee Jones <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-11-02drm/amdgpu/amdgpu: use "*" adjacent to data nameDeepak R Varma1-1/+1
When declaring pointer data, the "*" symbol should be used adjacent to the data name as per the coding standards. This resolves following issues reported by checkpatch script: ERROR: "foo * bar" should be "foo *bar" ERROR: "foo * bar" should be "foo *bar" ERROR: "foo* bar" should be "foo *bar" ERROR: "(foo*)" should be "(foo *)" Signed-off-by: Deepak R Varma <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-24drm/amdgpu: Get DRM dev from adev by inline-fLuben Tuikov1-1/+1
Add a static inline adev_to_drm() to obtain the DRM device pointer from an amdgpu_device pointer. Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-08-18drm/scheduler: Scheduler priority fixes (v2)Luben Tuikov1-1/+1
Remove DRM_SCHED_PRIORITY_LOW, as it was used in only one place. Rename and separate by a line DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT as it represents a (total) count of said priorities and it is used as such in loops throughout the code. (0-based indexing is the the count number.) Remove redundant word HIGH in priority names, and rename *KERNEL* to *HIGH*, as it really means that, high. v2: Add back KERNEL and remove SW and HW, in lieu of a single HIGH between NORMAL and KERNEL. Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-13drm/amdgpu/ring: simplify scheduler setup logicAlex Deucher1-3/+1
Set up a GPU scheduler based on the ring flag rather than the ring type. Reviewed-by: Christian König <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-13drm/amdgpu/ring: add no_scheduler flagAlex Deucher1-1/+2
This allows IPs to flag whether a specific ring requires a GPU scheduler or not. E.g., sometimes instances of an IP are asymmetric and have different capabilities. Reviewed-by: Christian König <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-09drm/amdgpu: rework sched_list generationNirmoy Das1-2/+12
Generate HW IP's sched_list in amdgpu_ring_init() instead of amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(), ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary. This patch also stores sched_list for all HW IPs in one big array in struct amdgpu_device which makes amdgpu_ctx_init_entity() much more leaner. v2: fix a coding style issue do not use drm hw_ip const to populate amdgpu_ring_type enum v3: remove ctx reference and move sched array and num_sched to a struct use num_scheds to detect uninitialized scheduler list v4: use array_index_nospec for user space controlled variables fix possible checkpatch.pl warnings Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-09drm/amdgpu: remove unused functionsNirmoy Das1-70/+0
AMDGPU statically sets priority for compute queues at initialization so remove all the functions responsible for changing compute queue priority dynamically. Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-28drm/amdgpu: no need to clean debugfs at amdgpuYintian Tao1-7/+0
drm_minor_unregister will invoke drm_debugfs_cleanup to clean all the child node under primary minor node. We don't need to invoke amdgpu_debugfs_fini and amdgpu_debugfs_regs_cleanup to clean agian. Otherwise, it will raise the NULL pointer like below. [ 45.046029] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 45.047256] PGD 0 P4D 0 [ 45.047713] Oops: 0002 [#1] SMP PTI [ 45.048198] CPU: 0 PID: 2796 Comm: modprobe Tainted: G W OE 4.18.0-15-generic #16~18.04.1-Ubuntu [ 45.049538] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 [ 45.050651] RIP: 0010:down_write+0x1f/0x40 [ 45.051194] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb e8 ce d9 ff ff 48 ba 01 00 00 00 ff ff ff ff 48 89 d8 <f0> 48 0f c1 10 85 d2 74 05 e8 53 1c ff ff 65 48 8b 04 25 00 5c 01 [ 45.053702] RSP: 0018:ffffad8f4133fd40 EFLAGS: 00010246 [ 45.054384] RAX: 00000000000000a8 RBX: 00000000000000a8 RCX: ffffa011327dd814 [ 45.055349] RDX: ffffffff00000001 RSI: 0000000000000001 RDI: 00000000000000a8 [ 45.056346] RBP: ffffad8f4133fd48 R08: 0000000000000000 R09: ffffffffc0690a00 [ 45.057326] R10: ffffad8f4133fd58 R11: 0000000000000001 R12: ffffa0113cff0300 [ 45.058266] R13: ffffa0113c0a0000 R14: ffffffffc0c02a10 R15: ffffa0113e5c7860 [ 45.059221] FS: 00007f60d46f9540(0000) GS:ffffa0113fc00000(0000) knlGS:0000000000000000 [ 45.060809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 45.061826] CR2: 00000000000000a8 CR3: 0000000136250004 CR4: 00000000003606f0 [ 45.062913] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 45.064404] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 45.065897] Call Trace: [ 45.066426] debugfs_remove+0x36/0xa0 [ 45.067131] amdgpu_debugfs_ring_fini+0x15/0x20 [amdgpu] [ 45.068019] amdgpu_debugfs_fini+0x2c/0x50 [amdgpu] [ 45.068756] amdgpu_pci_remove+0x49/0x70 [amdgpu] [ 45.069439] pci_device_remove+0x3e/0xc0 [ 45.070037] device_release_driver_internal+0x18a/0x260 [ 45.070842] driver_detach+0x3f/0x80 [ 45.071325] bus_remove_driver+0x59/0xd0 [ 45.071850] driver_unregister+0x2c/0x40 [ 45.072377] pci_unregister_driver+0x22/0xa0 [ 45.073043] amdgpu_exit+0x15/0x57c [amdgpu] [ 45.073683] __x64_sys_delete_module+0x146/0x280 [ 45.074369] do_syscall_64+0x5a/0x120 [ 45.074916] entry_SYSCALL_64_after_hwframe+0x44/0xa9 v2: remove all debugfs cleanup/fini code at amdgpu v3: squash in unused variable removal Signed-off-by: Yintian Tao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-26drm/amdgpu/ring: move debugfs init into core amdgpu debugfsAlex Deucher1-12/+3
In order to remove the load and unload drm callbacks, we need to reorder the init sequence to move all the drm debugfs file handling. Do this for rings. Tested-by: Thomas Zimmermann <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-26drm/amdgpu: cleanup amdgpu_ring_finiNirmoy Das1-1/+2
cleanup amdgpu_ring_fini to check the prerequisites before changing ring->sched.ready Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-06-25Merge branch 'drm-next' into drm-next-5.3Alex Deucher1-1/+2
Backmerge drm-next and fix up conflicts due to drmP.h removal. Signed-off-by: Alex Deucher <[email protected]>
2019-06-21drm/amdgpu: add the trailing fence per ringJack Xiao1-0/+10
The trailing fence for ring is used to track the completion of preemption. Acked-by: Hawking Zhang <[email protected]> Signed-off-by: Jack Xiao <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-06-11drm/amdgpu: drop the incorrect soft_reset for SRIOVMonk Liu1-1/+1
It's incorrect to do soft reset for SRIOV, when GFX hang the WREG would stuck there becuase it goes KIQ way. the GPU reset counter is incorrect: always increase twice for each timedout Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-06-10drm/amd: drop use of drmP.h in amdgpu/amdgpu*Sam Ravnborg1-1/+2
Drop use of drmP.h in all files named amdgpu* in drm/amd/amdgpu/ Fix fallout. Signed-off-by: Sam Ravnborg <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: "Christian König" <[email protected]> Cc: "David (ChunMing) Zhou" <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2019-03-20drm/amdgpu: use more entries for the first paging queueChristian König1-0/+2
To aid recoverable page faults. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-12-07drm/amdgpu: Skip ring soft recovery when fence was NULLwentalou1-1/+1
amdgpu_ring_soft_recovery would have Call-Trace, when s_fence->parent was NULL inside amdgpu_job_timedout. Check fence first, as drm_sched_hw_job_reset did. Signed-off-by: Wentao Lou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-11-05drm/amdgpu: further ring test cleanupsChristian König1-1/+7
Move all error messages from IP specific code into the common helper. This way we now uses the ring name in the messages instead of the index and note which device is affected as well. Also cleanup error handling in the IP specific code and consequently use ETIMEDOUT when the ring test timed out. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-11-05drm/amdgpu: Retire amdgpu_ring.ready flag v4Andrey Grodzovsky1-1/+21
Start using drm_gpu_scheduler.ready isntead. v3: Add helper function to run ring test and set sched.ready flag status accordingly, clean explicit sched.ready sets from the IP specific files. v4: Add kerneldoc and rebase. Signed-off-by: Andrey Grodzovsky <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: add ring soft recovery v4Christian König1-0/+25
Instead of hammering hard on the GPU try a soft recovery first. v2: reorder code a bit v3: increase timeout to 10ms, increment GPU reset counter v4: squash in compile fix (Christian) Signed-off-by: Christian König <[email protected]> Reviewed-by: Huang Rui <[email protected]>
2018-08-27drm/amdgpu: remove ring lru handlingChristian König1-98/+0
Not needed any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-07-18drm/amdgpu: allow for more flexible priority handlingChristian König1-1/+2
Allow to call amdgpu_ring_priority_get() after pushing the ring to the scheduler. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-06-15drm/amdgpu: define and add extra dword for jpeg ringBoyuan Zhang1-1/+1
Define extra dword for jpeg ring. Jpeg ring will allocate extra dword to store the patch commands for fixing the known issue. v2: dropping extra_dw for rings other than jpeg. Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-18drm/amdgpu/vg20:Restruct uvd.inst to support multiple instancesJames Zhu1-0/+1
Vega20 has dual-UVD. Need add multiple instances support for uvd. Restruct uvd.inst, using uvd.inst[0] to replace uvd.inst->. Repurpose amdgpu_ring::me for instance index, and initialize to 0. There are no any logical changes here. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-15drm/amdgpu: add emit_reg_write_reg_wait ring callbackAlex Deucher1-0/+20
This callback writes a value to a register and then reads back another register and waits for a value in a single operation. Provide a helper function using two operations for engines that don't support this opertion. Reviewed-by: Huang Rui <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-03-01drm/amd/amdgpu: Mask rptr as well in ring debugfsTom St Denis1-1/+1
The read/write pointers on sdma4 devices increment beyond the ring size and should be masked. Tested on my Ryzen 2400G. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-02-19drm/amdgpu: cache the fence to wait for a VMIDChristian König1-0/+3
Beneficial when a lot of processes are waiting for VMIDs. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-18drm/amdgpu: rename amdgpu_wb_* functionsAlex Deucher1-8/+8
add device for consistency. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-07drm: move amd_gpu_scheduler into common locationLucas Stach1-7/+7
This moves and renames the AMDGPU scheduler to a common location in DRM in order to facilitate re-use by other drivers. This is mostly a straight forward rename with no code changes. One notable exception is the function to_drm_sched_fence(), which is no longer a inline header function to avoid the need to export the drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures. Reviewed-by: Chunming Zhou <[email protected]> Tested-by: Dieter Nützel <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-11-08drm/amdgpu: bypass lru touch for KIQ ring submissionPixel Ding1-1/+2
KIQ ring submission is used for register accessing on SRIOV VF that could happen both in irq enabled and irq disabled cases. Inversion lock could happen on adev->ring_lru_list_lock, while this operation is useless and just adds overhead in this use case. Signed-off-by: Pixel Ding <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-10-09drm/amdgpu: add framework for HW specific priority settings v9Andres Rodriguez1-1/+75
Add an initial framework for changing the HW priorities of rings. The framework allows requesting priority changes for the lifetime of an amdgpu_job. After the job completes the priority will decay to the next lowest priority for which a request is still valid. A new ring function set_priority() can now be populated to take care of the HW specific programming sequence for priority changes. v2: set priority before emitting IB, and take a ref on amdgpu_job v3: use AMD_SCHED_PRIORITY_* instead of AMDGPU_CTX_PRIORITY_* v4: plug amdgpu_ring_restore_priority_cb into amdgpu_job_free_cb v5: use atomic for tracking job priorities instead of last_job v6: rename amdgpu_ring_priority_[get/put]() and align parameters v7: replace spinlocks with mutexes for KIQ compatibility v8: raise ring priority during cs_ioctl, instead of job_run v9: priority_get() before push_job() Reviewed-by: Christian König <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-09-28drm/amdgpu: map compute rings by least recently used pipeAndres Rodriguez1-5/+20
This patch provides a guarantee that the first n queues allocated by an application will be on different pipes. Where n is the number of pipes available from the hardware. This helps avoid ring aliasing which can result in work executing in time-sliced mode instead of truly parallel mode. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-29drm/amdgpu: set sched_hw_submission higher for KIQ (v3)Alex Deucher1-4/+12
KIQ doesn't really use the GPU scheduler. The base drivers generally use the KIQ ring directly rather than submitting IBs. However, amdgpu_sched_hw_submission (which defaults to 2) limits the number of outstanding fences to 2. KFD uses the KIQ for TLB flushes and the 2 fence limit hurts performance when there are several KFD processes running. v2: move some expressions to one line change KIQ sched_hw_submission to at least 16 v3: bump to 256 Reviewed-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: don't finish the ring if not initializedTrigger Huang1-0/+4
If a ring is not initialized, it also should not be finished. For example, in Vega10's SR-IOV environment, UVD's decode ring is not initialized, but will be finnished in amdgpu_uvd_sw_fini, because UVD driver put all the uvd decode ring's finish operation into amdgpu_uvd_sw_fini function, while not uvd_vXXX_0_sw_fini. This will lead to amdgpu module unloading failure. Signed-off-by: Trigger Huang <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: use 256 bit buffers for all wb allocations (v2)Alex Deucher1-49/+16
May waste a bit of memory, but simplifies the interface significantly. v2: convert internal accounting to use 256bit slots Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-08-15drm/amdgpu: make wb 256bit function names consistentAlex Deucher1-1/+1
Use a lower case b to be consistent with the other wb functions. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-07-25drm/amdgpu:fix gfx fence allocate sizeMonk Liu1-8/+18
1, for sriov, we need 8dw for the gfx fence due to CP behaviour 2, cleanup wrong logic in wptr/rptr wb alloc and free Change-Id: Ifbfed17a4621dae57244942ffac7de1743de0294 Signed-off-by: Monk Liu <[email protected]> Signed-off-by: Xiangliang Yu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-06-01drm/amdgpu: Move compute vm bug logic to amdgpu_vm.cAlex Xie1-32/+0
In review, Christian would like to keep the logic inside amdgpu_vm.c with a cost of slightly slower. The loop is still optimized out with this patch. v2: remove the if statement. Now it is not slower. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3Andres Rodriguez1-7/+26
Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4Andres Rodriguez1-0/+63
Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-05-31drm/amdgpu: Optimize a function called by every IB shedulingAlex Xie1-0/+33
Move several if statements and a loop statment from run time to initialization time. Signed-off-by: Alex Xie <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amd/amdgpu: Correct ring wptr address in debugfs (v2)Tom St Denis1-2/+2
On gfx9 hardware the value is not wrapped and is a 64-bit value. So we reduce it modulo the ring size. Signed-off-by: Tom St Denis <[email protected]> Reviewed-by: Christian König <[email protected]> (v2) use buf_mask instead of computing on the fly Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:fix ring init sequenceMonk Liu1-3/+3
ring->buf_mask need be set prior to ring_clear_ring invoke and fix ring_clear_ring as well which should use buf_mask instead of ptr_mask Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu: add 64bit wb functionsKen Wang1-13/+37
Newer asics need 64 bit writeback slots. Reviewed-by: Christian König <[email protected]> Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu: change wptr to 64 bits (v2)Ken Wang1-1/+4
Newer asics need 64 bit wptrs. If the wptr is now smaller than the rptr that doesn't indicate a wrap-around anymore. v2: integrate Christian's comments. Signed-off-by: Ken Wang <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-03-29drm/amdgpu:use clear_ring to clr RBMonk Liu1-1/+1
In resume routine, we need clr RB prior to the ring test of engine, otherwise some engine hang duplicated during GPU reset. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-01-27drm/amdgpu:set cond_exec polling value to 1 in ring_initMonk Liu1-1/+3
no need to set it per ib_schedule(), hw won't override this polling address. Signed-off-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-12-16Merge branch 'for-linus' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs updates from Al Viro: - more ->d_init() stuff (work.dcache) - pathname resolution cleanups (work.namei) - a few missing iov_iter primitives - copy_from_iter_full() and friends. Either copy the full requested amount, advance the iterator and return true, or fail, return false and do _not_ advance the iterator. Quite a few open-coded callers converted (and became more readable and harder to fuck up that way) (work.iov_iter) - several assorted patches, the big one being logfs removal * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: logfs: remove from tree vfs: fix put_compat_statfs64() does not handle errors namei: fold should_follow_link() with the step into not-followed link namei: pass both WALK_GET and WALK_MORE to should_follow_link() namei: invert WALK_PUT logics namei: shift interpretation of LOOKUP_FOLLOW inside should_follow_link() namei: saner calling conventions for mountpoint_last() namei.c: get rid of user_path_parent() switch getfrag callbacks to ..._full() primitives make skb_add_data,{_nocache}() and skb_copy_to_page_nocache() advance only on success [iov_iter] new primitives - copy_from_iter_full() and friends don't open-code file_inode() ceph: switch to use of ->d_init() ceph: unify dentry_operations instances lustre: switch to use of ->d_init()