aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
AgeCommit message (Collapse)AuthorFilesLines
2019-04-03drm/amdgpu: provide the page fault queue to the VM codeChristian König1-0/+1
We are going to need that for recoverable page faults. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-27drm/amdgpu: drop the ib from the VM update parametersChristian König1-5/+0
It is redundant with the job pointer. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-27drm/amdgpu: move VM table mapping into the backend as wellChristian König1-1/+1
Clean that up further and also fix another case where the BO wasn't kmapped for CPU based updates. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-27drm/amdgpu: XGMI pstate switch initial supportshaoyunl1-0/+4
Driver vote low to high pstate switch whenever there is an outstanding XGMI mapping request. Driver vote high to low pstate when all the outstanding XGMI mapping is terminated. Signed-off-by: shaoyunl <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-21drm/amdgpu: use the new VM backend for PTEsChristian König1-13/+0
And remove the existing code when it is unused. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-21drm/amdgpu: new VM update backendsChristian König1-1/+29
Separate out all functions for SDMA and CPU based page table updates into separate backends. This way we can keep most of the complexity of those from the core VM code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-21drm/amdgpu: move and rename amdgpu_pte_update_paramsChristian König1-0/+45
Move the update parameter into the VM header and rename them. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-21drm/amdgpu: remove some unused VM definesChristian König1-5/+0
Not needed any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-19drm/amdgpu: wait for VM to become idle during flushChristian König1-0/+2
Make sure that not only the entities are flush, but that we also wait for the HW to finish all processing. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-19drm/amdgpu: remove chashChristian König1-14/+0
Remove the chash implementation for now since it isn't used any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-19drm/amdgpu: drop the huge page flagChristian König1-1/+0
Not needed any more since we now free PDs/PTs on demand. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-19drm/amdgpu: allocate VM PDs/PTs on demandChristian König1-3/+0
Let's start to allocate VM PDs/PTs on demand instead of pre-allocating them during mapping. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-01-25drm/amdgpu: set bulk_moveable to false when lru changed v2Chunming Zhou1-0/+2
if lru is changed, we cannot do bulk moving. v2: root bo isn't in bulk moving, skip its change. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-12-07drm/amdgpu: remove VM fault_credit handlingChristian König1-5/+0
printk_ratelimit() is much better suited to limit the number of reported VM faults. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-09-13drm/amdgpu: use a single linked list for amdgpu_vm_bo_baseChristian König1-1/+1
Instead of the double linked list. Gets the size of amdgpu_vm_pt down to 64 bytes again. We could even reduce it down to 32 bytes, but that would require some rather extreme hacks. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Acked-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-09-12drm/amdgpu: Move fault hash table to amdgpu vmOak Zeng1-0/+13
In stead of share one fault hash table per device, make it per vm. This can avoid inter-process lock issue when fault hash table is full. Change-Id: I5d1281b7c41eddc8e26113e010516557588d3708 Signed-off-by: Oak Zeng <[email protected]> Suggested-by: Christian Konig <[email protected]> Suggested-by: Felix Kuehling <[email protected]> Reviewed-by: Christian Konig <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-09-10drm/amdgpu: correctly sign extend 48bit addresses v3Christian König1-13/+0
Correct sign extend the GMC addresses to 48bit. v2: sign extending turned out easier than thought. v3: clean up the defines and move them into amdgpu_gmc.h as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-09-10drm/amdgpu: separate per VM BOs from normal in the moved stateChristian König1-2/+5
Allows us to avoid taking the spinlock in more places. Signed-off-by: Christian König <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-29drm/amdkfd: Release an acquired process vmOak Zeng1-0/+1
For compute vm acquired from amdgpu, vm.pasid is managed by kfd. Decouple pasid from such vm on process destroy to avoid duplicate pasid release. Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-29drm/amdgpu: Set pasid for compute vm (v2)Oak Zeng1-1/+1
To make a amdgpu vm to a compute vm, the old pasid will be freed and replaced with a pasid managed by kfd. Kfd can't reuse original pasid allocated by amdgpu because kfd uses different pasid policy with amdgpu. For example, all graphic devices share one same pasid in a process. v2: rebase (Alex) Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: remove extra root PD alignmentChristian König1-3/+0
Just another leftover from radeon. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Acked-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: Adjust the VM size based on system memory size v2Felix Kuehling1-1/+1
Set the VM size based on system memory size between the ASIC-specific limits given by min_vm_size and max_bits. GFXv9 GPUs will keep their default VM size of 256TB (48 bit). Only older GPUs will adjust VM size depending on system memory size. This makes more VM space available for ROCm applications on GFXv8 GPUs that want to map all available VRAM and system memory in their SVM address space. v2: * Clarify comment * Round up memory size before >> 30 * Round up automatic vm_size to power of two Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Junwei Zhang <[email protected]> Reviewed-by: Huang Rui <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: use bulk moves for efficient VM LRU handling (v6)Huang Rui1-1/+10
I continue to work for bulk moving that based on the proposal by Christian. Background: amdgpu driver will move all PD/PT and PerVM BOs into idle list. Then move all of them on the end of LRU list one by one. Thus, that cause so many BOs moved to the end of the LRU, and impact performance seriously. Then Christian provided a workaround to not move PD/PT BOs on LRU with below patch: Commit 0bbf32026cf5ba41e9922b30e26e1bed1ecd38ae ("drm/amdgpu: band aid validating VM PTs") However, the final solution should bulk move all PD/PT and PerVM BOs on the LRU instead of one by one. Whenever amdgpu_vm_validate_pt_bos() is called and we have BOs which need to be validated we move all BOs together to the end of the LRU without dropping the lock for the LRU. While doing so we note the beginning and end of this block in the LRU list. Now when amdgpu_vm_validate_pt_bos() is called and we don't have anything to do, we don't move every BO one by one, but instead cut the LRU list into pieces so that we bulk move everything to the end in just one operation. Test data: +--------------+-----------------+-----------+---------------------------------------+ | |The Talos |Clpeak(OCL)|BusSpeedReadback(OCL) | | |Principle(Vulkan)| | | +------------------------------------------------------------------------------------+ | | | |0.319 ms(1k) 0.314 ms(2K) 0.308 ms(4K) | | Original | 147.7 FPS | 76.86 us |0.307 ms(8K) 0.310 ms(16K) | +------------------------------------------------------------------------------------+ | Orignial + WA| | |0.254 ms(1K) 0.241 ms(2K) | |(don't move | 162.1 FPS | 42.15 us |0.230 ms(4K) 0.223 ms(8K) 0.204 ms(16K)| |PT BOs on LRU)| | | | +------------------------------------------------------------------------------------+ | Bulk move | 163.1 FPS | 40.52 us |0.244 ms(1K) 0.252 ms(2K) 0.213 ms(4K) | | | | |0.214 ms(8K) 0.225 ms(16K) | +--------------+-----------------+-----------+---------------------------------------+ After test them with above three benchmarks include vulkan and opencl. We can see the visible improvement than original, and even better than original with workaround. v2: move all BOs include idle, relocated, and moved list to the end of LRU and put them together. v3: remove unused parameter and use list_for_each_entry instead of the one with save entry. v4: move the amdgpu_vm_move_to_lru_tail after command submission, at that time, all bo will be back on idle list. v5: remove amdgpu_vm_move_to_lru_tail_by_list(), use bulk_moveable instread of validated, and move ttm_bo_bulk_move_lru_tail() also into amdgpu_vm_move_to_lru_tail(). v6: clean up and fix return value. Signed-off-by: Christian König <[email protected]> Signed-off-by: Huang Rui <[email protected]> Tested-by: Mike Lothian <[email protected]> Tested-by: Dieter Nützel <[email protected]> Acked-by: Chunming Zhou <[email protected]> Reviewed-by: Junwei Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: use new scheduler load balancing for VMsChristian König1-4/+3
Instead of the fixed round robin use let the scheduler balance the load of page table updates. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-08-27drm/amdgpu: move vm definitions into amdgpu_vm headerHuang Rui1-0/+25
Demangle amdgpu.h. Signed-off-by: Huang Rui <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-07-31drm/amdgpu: add new amdgpu_vm_bo_trace_cs() function v2Christian König1-0/+1
This allows us to trace all VM ranges which should be valid inside a CS. v2: dump mappings without BO as well Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Reviewed-and-tested-by: Andrey Grodzovsky <[email protected]> (v1) Reviewed-by: Huang Rui <[email protected]> (v1) Signed-off-by: Alex Deucher <[email protected]>
2018-07-10drm/amdgpu: Add support for logging process info in amdgpu_vm.Andrey Grodzovsky1-0/+16
Add process and thread names and pids and a function to extract this info from relevant amdgpu_vm. v2: Add documentation and fix identation. v3: Add getter and setter functions for amdgpu_task_info. Signed-off-by: Andrey Grodzovsky <[email protected]> Acked-by: Jim Qu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-24drm/amdgpu: move VM BOs on LRU againChristian König1-0/+3
Move all BOs belonging to a VM on the LRU with every submission. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-24drm/amdgpu: rework VM state machine lock handling v2Christian König1-3/+1
Only the moved state needs a separate spin lock protection. All other states are protected by reserving the VM anyway. v2: fix some more incorrect cases Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-18drm/amdgpu: remove unused memberChristian König1-3/+0
This lock isn't used any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-05-15drm/amdgpu: Add support to change mtype for 2nd part of gart BOs on GFX9Yong Zhao1-2/+3
This change prepares for a workaround in amdkfd for a GFX9 HW bug. It requires the control stack memory of compute queues, which is allocated from the second page of MQD gart BOs, to have mtype NC, rather than the default UC. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-03-15drm/amdgpu: Add helper to turn an existing VM into a compute VMFelix Kuehling1-0/+1
v2: Removed updating and checking of vm->vm_context v3: Enable amdgpu_vm_clear_bo in amdgpu_vm_make_compute Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-03-15drm/amdgpu: Move KFD-specific fields into struct amdgpu_vmFelix Kuehling1-0/+9
Remove struct amdkfd_vm and move the fields into struct amdgpu_vm. This will allow turning a VM created by a DRM render node into a KFD VM. v2: Removed vm_context field Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-02-06drm/amdgpu: Fix header file dependenciesFelix Kuehling1-0/+1
Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-02-19drm/amdgpu: reduce reserved VA sizeChristian König1-1/+1
1MB should be more than enough, currently we use about 8K. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Acked-by: Monk Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-27drm/amdgpu: drop client_id from VMChristian König1-4/+0
Use the fence context from the scheduler entity. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-27drm/amdgpu: separate VMID and PASID handlingChristian König1-41/+3
Move both into the new files amdgpu_ids.[ch]. No functional change. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-18drm/amdgpu: implement 2+1 PD support for Raven v3Christian König1-0/+6
Instead of falling back to 2 level and very limited address space use 2+1 PD support and 128TB + 512GB of virtual address space. v2: cleanup defines, rebase on top of level enum v3: fix inverted check in hardware setup Signed-off-by: Christian König <[email protected]> Reviewed-and-Tested-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-14drm/amdgpu: add enumerate for PDB/PTB v3Chunming Zhou1-0/+11
v2: remove SUBPTB member v3: remove last_level, use AMDGPU_VM_PTB directly instead. Signed-off-by: Chunming Zhou <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-12drm/amdgpu: remove keeping the addr of the VM PDsChristian König1-1/+1
No more double house keeping. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-12drm/amdgpu: remove last_entry_used from the VM codeChristian König1-1/+0
Not needed any more. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-07drm: move amd_gpu_scheduler into common locationLucas Stach1-3/+4
This moves and renames the AMDGPU scheduler to a common location in DRM in order to facilitate re-use by other drivers. This is mostly a straight forward rename with no code changes. One notable exception is the function to_drm_sched_fence(), which is no longer a inline header function to avoid the need to export the drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures. Reviewed-by: Chunming Zhou <[email protected]> Tested-by: Dieter Nützel <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-06drm/amdgpu: move validation of the VM size into the VM codeChristian König1-1/+2
This moves validation of the VM size parameter into amdgpu_vm_adjust_size(). Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-06drm/amdgpu: unify VM size handling of Vega10 with older generationChristian König1-3/+1
One function to rule them all. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-12-06drm/amdgpu: fix VA hole handling on Vega10 v3Christian König1-0/+13
Similar to the CPU address space the VA on Vega10 has a hole in it. v2: use dev_dbg instead of dev_err v3: add some more comments to explain how the hw works Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> CC: [email protected] Signed-off-by: Alex Deucher <[email protected]>
2017-12-06drm/amdgpu: cleanup vm_size handlingChristian König1-4/+3
It's pointless to have the same value twice, just always use max_pfn. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-11-13drm/amdgpu: make AMDGPU_VA_RESERVED_SIZE 64bitChristian König1-1/+2
Even when it's a small handle it as 64bit value as well. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-10-09drm/amdgpu: Set the correct value for PDEs/PTEs of ATC memory on RavenYong Zhao1-0/+10
Without the additional bits set in PDEs/PTEs, the ATC memory access would have failed on Raven. Signed-off-by: Yong Zhao <[email protected]> Acked-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-10-09Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie1-1/+6
into drm-next More new stuff for 4.15. Highlights: - Add clock query interface for raven - Add new FENCE_TO_HANDLE ioctl - UVD video encode ring support on polaris - transparent huge page DMA support - deadlock fixes - compute pipe lru tweaks - powerplay cleanups and regression fixes - fix duplicate symbol issue with radeon and amdgpu - misc bug fixes * 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (72 commits) drm/radeon/dp: make radeon_dp_get_dp_link_config static drm/radeon: move ci_send_msg_to_smc to where it's used drm/amd/sched: fix deadlock caused by unsignaled fences of deleted jobs drm/amd/sched: NULL out the s_fence field after run_job drm/amd/sched: move adding finish callback to amd_sched_job_begin drm/amd/sched: fix an outdated comment drm/amd/sched: rename amd_sched_entity_pop_job drm/amdgpu: minor coding style fix drm/ttm: add transparent huge page support for DMA allocations v2 drm/ttm: add support for different pool sizes drm/ttm: remove unsued options from ttm_mem_global_alloc_page drm/amdgpu: add uvd enc irq drm/amdgpu: add uvd enc ib test drm/amdgpu: add uvd enc ring test drm/amdgpu: add uvd enc vm functions (v2) drm/amdgpu: add uvd enc into run queue drm/amdgpu: add uvd enc rings drm/amdgpu: add new uvd enc ring methods drm/amdgpu: add uvd enc command in header drm/amdgpu: add uvd enc registers in header ...
2017-09-28drm/amdgpu: Handle GPUVM fault stormsFelix Kuehling1-1/+6
When many wavefronts cause VM faults at the same time, it can overwhelm the interrupt handler and cause IH ring overflows before the driver can notify or kill the faulting application. As a workaround I'm introducing limited per-VM fault credit. After that number of VM faults have occurred, further VM faults are filtered out at the prescreen stage of processing. This depends on the PASID in the interrupt packet, so it currently only works for KFD contexts. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>