aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
AgeCommit message (Collapse)AuthorFilesLines
2020-05-22drm/amdgpu: Sync with VM root BO when switching VM to CPU update modeFelix Kuehling1-2/+9
This fixes an intermittent bug where a root PD clear operation still in progress could overwrite a PDE update done by the CPU, resulting in a VM fault. Fixes: 108b4d928c03 ("drm/amd/amdgpu: Update VM function pointer") Reported-by: Jay Cornwall <[email protected]> Tested-by: Jay Cornwall <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-28drm/amdgpu: pass unlocked flag to params at amdgpu_vm_bo_update_mappingAlex Sierra1-0/+1
Pass unlocked flag value to amdgpu_vm_update_params.unlocked struct member at amdgpu_vm_bo_update_mapping. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Sierra <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-28drm/amdgpu: add new unlocked flag for PTE updatesChristian König1-18/+24
For HMM support we need the ability to invalidate PTEs from a MM callback where we can't lock the root PD. Add a new flag to better support this instead of assuming that all invalidation updates are unlocked. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-28drm/amdgpu: rename direct to immediate for VM updatesChristian König1-30/+30
To avoid confusion with direct ring submissions rename bottom of pipe VM table changes to immediate updates. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-28drm/amdgpu: set TMZ bits in PTEs for secure BO (v4)Alex Deucher1-0/+4
If a buffer object is secure, i.e. created with AMDGPU_GEM_CREATE_ENCRYPTED, then the TMZ bit of the PTEs that belong the buffer object should be set. v1: design and draft the skeletion of TMZ bits setting on PTEs (Alex) v2: return failure once create secure BO on non-TMZ platform (Ray) v3: amdgpu_bo_encrypted() only checks the BO (Luben) v4: move TMZ flag setting into amdgpu_vm_bo_update (Christian) Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Huang Rui <[email protected]> Signed-off-by: Luben Tuikov <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
2020-04-23drm: amdgpu: fix kernel-doc struct warningRandy Dunlap1-1/+1
Fix a kernel-doc warning of missing struct field desription: ../drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c:92: warning: Function parameter or member 'vm' not described in 'amdgpu_vm_eviction_lock' Fixes: a269e44989f3 ("drm/amdgpu: Avoid reclaim fs while eviction lock") Signed-off-by: Randy Dunlap <[email protected]> Cc: Signed-off-by: Alex Sierra <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: David (ChunMing) Zhou <[email protected]> Cc: [email protected] Reviewed-by: Harry Wentland <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-04-22drm/amdgpu: fix race between pstate and remote buffer mapJonathan Kim1-13/+3
Vega20 arbitrates pstate at hive level and not device level. Last peer to remote buffer unmap could drop P-State while another process is still remote buffer mapped. With this fix, P-States still needs to be disabled for now as SMU bug was discovered on synchronous P2P transfers. This should be fixed in the next FW update. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-19drm/amdgpu: miss PRT case when bo updateYintian Tao1-3/+3
Originally, only the PTE valid is taken in consider. The PRT case is missied when bo update which raise problem. We need add condition for PRT case. v2: add PRT condition for amdgpu_vm_bo_update_mapping, too v3: fix one typo error Signed-off-by: Yintian Tao <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-16drm_amdgpu: Add job fence to resv conditionallyxinhui pan1-14/+2
Job fence on page table should be a shared one, so add it to the root page talbe bo resv. last_delayed field is not needed anymore. so remove it. Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Felix Kuehling <[email protected]> Suggested-by: Christian König <[email protected]> Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-03-06drm/amdgpu: Update SPM_VMID with the job's vmid when application reserves ↵Jacob He1-0/+20
the vmid SPM access the video memory according to SPM_VMID. It should be updated with the job's vmid right before the job is scheduled. SPM_VMID is a global resource Signed-off-by: Jacob He <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-26drm/amdgpu: add VM update fences back to the root PD v2Christian König1-2/+12
Add update fences to the root PD while mapping BOs. Otherwise PDs freed during the mapping won't wait for updates to finish and can cause corruptions. v2: rebased on drm-misc-next Signed-off-by: Christian König <[email protected]> Fixes: 90b69cdc5f159 drm/amdgpu: stop adding VM updates fences to the resv obj Reviewed-by: xinhui pan <[email protected]> Tested-by: Luben Tuikov <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-11drm/amdgpu: Do not move root PT bo to relocated listxinhui pan1-17/+17
As root PD has no parent, we just need move its status to idle. Suggested-by: Christian König <[email protected]> Signed-off-by: xinhui pan <[email protected]> Reviewed-by: Christian König <[email protected]> CC: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: rework synchronization of VM updates v4Christian König1-17/+24
If provided we only sync to the BOs reservation object and no longer to the root PD. v2: update comment, cleanup amdgpu_bo_sync_wait_resv v3: use correct reservation object while clearing v4: fix typo in amdgpu_bo_sync_wait_resv Signed-off-by: Christian König <[email protected]> Tested-by: Tom St Denis <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: allow higher level PD invalidationsChristian König1-5/+20
Allow partial invalidation on unallocated PDs. This is useful when we need to silence faults to stop interrupt floods on Vega. Signed-off-by: Christian König <[email protected]> Tested-by: Tom St Denis <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: return EINVAL instead of ENOENT in the VM codeChristian König1-1/+1
That we can't find a PD above the root is expected can only happen if we try to update a larger range than actually managed by the VM. Signed-off-by: Christian König <[email protected]> Tested-by: Tom St Denis <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: fix parentheses in amdgpu_vm_update_ptesChristian König1-1/+1
For the root PD mask can be 0xffffffff as well which would overrun to 0 if we don't cast it before we add one. Signed-off-by: Christian König <[email protected]> Tested-by: Tom St Denis <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: make sure to never allocate PDs/PTs for invalidationsChristian König1-9/+13
Make sure that we never allocate a page table for an invalidation operation. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-02-04drm/amdgpu: drop unnecessary restriction for huge root PDEsChristian König1-20/+5
The root PD can also contain huge PDEs. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-01-16drm/amdgpu: Avoid reclaim fs while eviction lockAlex Sierra1-7/+33
[Why] Avoid reclaim filesystem while eviction lock is held called from MMU notifier. [How] Setting PF_MEMALLOC_NOFS flags while eviction mutex is locked. Using memalloc_nofs_save / memalloc_nofs_restore API. Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-18drm/amdgpu: replace vm_pte's run-queue list with drm gpu scheds listNirmoy Das1-7/+4
drm_sched_entity_init() takes drm gpu scheduler list instead of drm_sched_rq list. This makes conversion of drm_sched_rq list to drm gpu scheduler list unnecessary Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-18drm/scheduler: rework entity creationNirmoy Das1-4/+10
Entity currently keeps a copy of run_queue list and modify it in drm_sched_entity_set_priority(). Entities shouldn't modify run_queue list. Use drm_gpu_scheduler list instead of drm_sched_rq list in drm_sched_entity struct. In this way we can select a runqueue based on entity/ctx's priority for a drm scheduler. Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-09drm/amdgpu: add VM eviction lock v3Christian König1-7/+32
This allows to invalidate VM entries without taking the reservation lock. v3: use -EBUSY Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-09drm/amdgpu: stop adding VM updates fences to the resv objChristian König1-4/+26
Don't add the VM update fences to the resv object and remove the handling to stop implicitely syncing to them. Ongoing updates prevent page tables from being evicted and we manually block for all updates to complete before releasing PDs and PTS. This way we can do updates even without the resv obj locked. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-12-09drm/amdgpu: move VM eviction decision into amdgpu_vm.cChristian König1-0/+22
When a page tables needs to be evicted the VM code should decide if that is possible or not. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22amd/amdgpu: force to trigger a no-retry-fault after a retry-faultAlex Sierra1-1/+10
Only for the debugger use case. [why] Avoid endless translation retries, after an invalid address access has been issued to the GPU. Instead, the trap handler is forced to enter by generating a no-retry-fault. A s_trap instruction is inserted in the debugger case to let the wave to enter trap handler to save context. [how] Intentionally using an invalid flag combination (F and P set at the same time) to trigger a no-retry-fault, after a retry-fault happens. This is only valid under compute context. Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-22drm/amdgpu: add flag to indicate amdgpu vm contextAlex Sierra1-0/+3
Flag added to indicate if the amdgpu vm context is used for compute or graphics. Signed-off-by: Alex Sierra <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-06drm/amdgpu/gpuvm: add some additional comments in amdgpu_vm_update_ptesAlex Deucher1-1/+9
To better clarify what is happening in this function. Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-04Merge tag 'drm-misc-next-2019-10-31' of ↵Dave Airlie1-5/+4
git://anongit.freedesktop.org/drm/drm-misc into drm-next drm-misc-next for 5.5: UAPI Changes: -dma-buf: Introduce and revert dma-buf heap (Andrew/John/Sean) Cross-subsystem Changes: - None Core Changes: -dma-buf: add dynamic mapping to allow exporters to choose dma_resv lock state on mmap/munmap (Christian) -vram: add prepare/cleanup fb helpers to vram helpers (Thomas) -ttm: always keep bo's on the lru + ttm cleanups (Christian) -sched: allow a free_job routine to sleep (Steven) -fb_helper: remove unused drm_fb_helper_defio_init() (Thomas) Driver Changes: -bochs/hibmc/vboxvideo: Use new vram helpers for prepare/cleanup fb (Thomas) -amdgpu: Implement dma-buf import/export without drm helpers (Christian) -panfrost: Simplify devfreq integration in driver (Steven) Cc: Christian König <[email protected]> Cc: Thomas Zimmermann <[email protected]> Cc: Steven Price <[email protected]> Cc: Andrew F. Davis <[email protected]> Cc: John Stultz <[email protected]> Cc: Sean Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]> From: Sean Paul <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/20191031193015.GA243509@art_vandelay
2019-10-25drm/amdgpu: call amdgpu_vm_prt_fini before deleting the root PDPelloux-prayer, Pierre-eric1-9/+10
amdgpu_vm_prt_fini uses "vm->root.base.bo" so it must still be valid when we call it. Fixes: b65709a92156 ("drm/amdgpu: reserve the root PD while freeing PASIDs") Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-10-25drm/ttm: remove pointers to globalsChristian König1-5/+4
As the name says global memory and bo accounting is global. So it doesn't make to much sense having pointers to global structures all around the code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Link: https://patchwork.freedesktop.org/patch/332879/
2019-10-04drm/amdgpu: fix uninitialized variable pasid_mapping_neededColin Ian King1-1/+1
The boolean variable pasid_mapping_needed is not initialized and there are code paths that do not assign it any value before it is is read later. Fix this by initializing pasid_mapping_needed to false. Addresses-Coverity: ("Uninitialized scalar variable") Fixes: 6817bf283b2b ("drm/amdgpu: grab the id mgr lock while accessing passid_mapping") Reviewed-by: Christian König <[email protected]> Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-10-03drm/amdgpu/vm: fix up documentation in amdgpu_vm.cAlex Deucher1-7/+10
Missing parameters, wrong comment type, etc. Acked-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-10-03drm/amdgpu/vm: fix documentation for amdgpu_vm_bo_paramAlex Deucher1-0/+2
Add new parameters. Reviewed-by: Christian König <[email protected]> Reviewed-by: Andrey Grodzovsky <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-10-02drm/amdgpu: revert "disable bulk moves for now"Christian König1-2/+0
This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a. The changes to fix this should have landed in 5.1. Signed-off-by: Christian König <[email protected]> Reviewed-by: Zhou, David(ChunMing) <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: drop double HDP flush in the VM codeChristian König1-6/+0
Already done in the CPU based backend code. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: cleanup coding style in the VM code a bitChristian König1-19/+36
No functional change. Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: revert "disable bulk moves for now"Christian König1-2/+0
This reverts commit a213c2c7e235cfc0e0a161a558f7fdf2fb3a624a. The changes to fix this should have landed in 5.1. Signed-off-by: Christian König <[email protected]> Reviewed-by: Zhou, David(ChunMing) <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: add graceful VM fault handling v3Christian König1-0/+73
Next step towards HMM support. For now just silence the retry fault and optionally redirect the request to the dummy page. v2: make sure the VM is not destroyed while we handle the fault. v3: fix VM destroy check, cleanup comments Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: reserve the root PD while freeing PASIDsChristian König1-11/+9
Free the pasid only while the root PD is reserved. This prevents use after free in the page fault handling. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: allocate PDs/PTs with no_gpu_wait in a page faultChristian König1-3/+5
While handling a page fault we can't wait for other ongoing GPU operations or we can potentially run into deadlocks. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: allow direct submission of clearsChristian König1-6/+11
For handling PD/PT clears directly in the fault handler. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: allow direct submission of PTE updatesChristian König1-8/+10
For handling PTE updates directly in the fault handler. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: allow direct submission of PDE updates v2Christian König1-3/+5
For handling PDE updates directly in the fault handler. v2: fix typo in comment Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: split the VM entity into direct and delayedChristian König1-6/+15
For page fault handling we need to use a direct update which can't be blocked by ongoing user CS. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: grab the id mgr lock while accessing passid_mappingChristian König1-3/+9
Need to make sure that we actually dropping the right fence. Could be done with RCU as well, but to complicated for a fix. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: cleanup PTE flag generation v3Christian König1-27/+2
Move the ASIC specific code into a new callback function. v2: mask the flags for SI and CIK instead of a BUG_ON(). v3: remove last missed BUG_ON(). Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Huang Rui <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdgpu: cleanup mtype mappingChristian König1-2/+4
Unify how we map the UAPI flags to the PTE hardware flags for a mapping. Only the MTYPE is actually ASIC dependent, all other flags should be copied over 1 to 1 and ASIC differences are handled later on. Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-13drm/amdgpu: use moving fence instead of exclusive for VM updatesChristian König1-1/+1
Make VM updates depend on the moving fence instead of the exclusive one. Makes it less likely to actually have a dependency. Signed-off-by: Christian König <[email protected]> Reviewed-by: Chunming Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-13drm/amdgpu: Use optimal mtypes and PTE bits for ArcturusFelix Kuehling1-0/+4
For compute VRAM allocations on Arturus use the new RW mtype for non-coherent local memory, CC mtype for coherent local memory and PTE_SNOOPED bit for invalidating non-dirty cache lines on remote XGMI mappings. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Amber Lin <[email protected]> Reviewed-by: Shaoyun Liu <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-08-27Merge tag 'drm-next-5.4-2019-08-23' of ↵Dave Airlie1-0/+7
git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.4-2019-08-23: amdgpu: - Enable power features on Navi12 - Enable power features on Arcturus - RAS updates - Initial Renoir APU support - Enable power featyres on Renoir - DC gamma fixes - DCN2 fixes - GPU reset support for Picasso - Misc cleanups and fixes scheduler: - Possible race fix Signed-off-by: Dave Airlie <[email protected]> From: Alex Deucher <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]