aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
AgeCommit message (Collapse)AuthorFilesLines
2019-05-14mm/mmu_notifier: convert user range->blockable to helper functionJérôme Glisse1-4/+4
Use the mmu_notifier_range_blockable() helper function instead of directly dereferencing the range->blockable field. This is done to make it easier to change the mmu_notifier range field. This patch is the outcome of the following coccinelle patch: %<------------------------------------------------------------------- @@ identifier I1, FN; @@ FN(..., struct mmu_notifier_range *I1, ...) { <... -I1->blockable +mmu_notifier_range_blockable(I1) ...> } ------------------------------------------------------------------->% spatch --in-place --sp-file blockable.spatch --dir . Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jérôme Glisse <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: Christian König <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Jan Kara <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Peter Xu <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Dan Williams <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krcmar <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Christian Koenig <[email protected]> Cc: John Hubbard <[email protected]> Cc: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-12-28mm/mmu_notifier: use structure for invalidate_range_start/end callbackJérôme Glisse1-27/+20
Patch series "mmu notifier contextual informations", v2. This patchset adds contextual information, why an invalidation is happening, to mmu notifier callback. This is necessary for user of mmu notifier that wish to maintains their own data structure without having to add new fields to struct vm_area_struct (vma). For instance device can have they own page table that mirror the process address space. When a vma is unmap (munmap() syscall) the device driver can free the device page table for the range. Today we do not have any information on why a mmu notifier call back is happening and thus device driver have to assume that it is always an munmap(). This is inefficient at it means that it needs to re-allocate device page table on next page fault and rebuild the whole device driver data structure for the range. Other use case beside munmap() also exist, for instance it is pointless for device driver to invalidate the device page table when the invalidation is for the soft dirtyness tracking. Or device driver can optimize away mprotect() that change the page table permission access for the range. This patchset enables all this optimizations for device drivers. I do not include any of those in this series but another patchset I am posting will leverage this. The patchset is pretty simple from a code point of view. The first two patches consolidate all mmu notifier arguments into a struct so that it is easier to add/change arguments. The last patch adds the contextual information (munmap, protection, soft dirty, clear, ...). This patch (of 3): To avoid having to change many callback definition everytime we want to add a parameter use a structure to group all parameters for the mmu_notifier invalidate_range_start/end callback. No functional changes with this patch. [[email protected]: fix drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c kerneldoc] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jérôme Glisse <[email protected]> Acked-by: Jan Kara <[email protected]> Acked-by: Jason Gunthorpe <[email protected]> [infiniband] Cc: Matthew Wilcox <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Dan Williams <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Radim Krcmar <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Christian Koenig <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-08-22mm, oom: distinguish blockable mode for mmu notifiersMichal Hocko1-8/+35
There are several blockable mmu notifiers which might sleep in mmu_notifier_invalidate_range_start and that is a problem for the oom_reaper because it needs to guarantee a forward progress so it cannot depend on any sleepable locks. Currently we simply back off and mark an oom victim with blockable mmu notifiers as done after a short sleep. That can result in selecting a new oom victim prematurely because the previous one still hasn't torn its memory down yet. We can do much better though. Even if mmu notifiers use sleepable locks there is no reason to automatically assume those locks are held. Moreover majority of notifiers only care about a portion of the address space and there is absolutely zero reason to fail when we are unmapping an unrelated range. Many notifiers do really block and wait for HW which is harder to handle and we have to bail out though. This patch handles the low hanging fruit. __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks are not allowed to sleep if the flag is set to false. This is achieved by using trylock instead of the sleepable lock for most callbacks and continue as long as we do not block down the call chain. I think we can improve that even further because there is a common pattern to do a range lookup first and then do something about that. The first part can be done without a sleeping lock in most cases AFAICS. The oom_reaper end then simply retries if there is at least one notifier which couldn't make any progress in !blockable mode. A retry loop is already implemented to wait for the mmap_sem and this is basically the same thing. The simplest way for driver developers to test this code path is to wrap userspace code which uses these notifiers into a memcg and set the hard limit to hit the oom. This can be done e.g. after the test faults in all the mmu notifier managed memory and set the hard limit to something really small. Then we are looking for a proper process tear down. [[email protected]: coding style fixes] [[email protected]: minor code simplification] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Acked-by: Christian König <[email protected]> # AMD notifiers Acked-by: Leon Romanovsky <[email protected]> # mlx and umem_odp Reported-by: David Rientjes <[email protected]> Cc: "David (ChunMing) Zhou" <[email protected]> Cc: Paolo Bonzini <[email protected]> Cc: Alex Deucher <[email protected]> Cc: David Airlie <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Doug Ledford <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Mike Marciniszyn <[email protected]> Cc: Dennis Dalessandro <[email protected]> Cc: Sudeep Dutt <[email protected]> Cc: Ashutosh Dixit <[email protected]> Cc: Dimitri Sivanich <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Juergen Gross <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: Felix Kuehling <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-07-05drm/amd: Remove errors from sphinx documentationDarren Powell1-2/+3
Eliminating the warnings produced by sphinx when processing the sphinx comments in amdgpu_device.c & amdgpu_mn.c Signed-off-by: Darren Powell <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-06-15drm/amdgpu: fix typo in amdgpu_mn.c commentsSlava Abramov1-1/+1
In doc comments for struct amdgpu_mn: destrution -> destruction Signed-off-by: Slava Abramov <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-06-15drm/amdgpu: fix documentation of amdgpu_mn.c v2Christian König1-16/+58
And wire it up as well. v2: improve the wording, fix label mismatch Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-06-15drm/amdgpu: rename rmn to amn in the MMU notifier code (v2)Christian König1-70/+70
Just a copy&paste leftover from radeon. v2: rebase (Alex) Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-03-23drm/amdgpu: Avoid reclaim while holding locks taken in MMU notifierFelix Kuehling1-8/+9
When an MMU notifier runs in memory reclaim context, it can deadlock trying to take locks that are already held in the thread causing the memory reclaim. The solution is to avoid memory reclaim while holding locks that are taken in MMU notifiers. This commit fixes kmalloc while holding rmn->lock by moving the call outside the lock. The GFX MMU notifier also locks reservation objects. I have no good solution for avoiding reclaim while holding reservation objects. The HSA MMU notifier will not lock any reservation objects. v2: Moved allocation outside lock instead of using GFP_NOIO Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oded Gabbay <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-03-23drm/amdgpu: Add MMU notifier type for KFD userptrFelix Kuehling1-15/+79
This commit adds the notion of MMU notifier types GFX and HSA. GFX continues to work like MMU notifiers did before. HSA adds support for KFD userptr BOs. The implementation of KFD userptr eviction is a stub for now. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-09-28Merge branch 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie1-24/+84
into drm-next First feature pull for 4.15. Highlights: - Per VM BO support - Lots of powerplay cleanups - Powerplay support for CI - pasid mgr for kfd - interrupt infrastructure for recoverable page faults - SR-IOV fixes - initial GPU reset for vega10 - prime mmap support - ttm page table debugging improvements - lots of bug fixes * 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (232 commits) drm/amdgpu: clarify license in amdgpu_trace_points.c drm/amdgpu: Add gem_prime_mmap support drm/amd/powerplay: delete dead code in smumgr drm/amd/powerplay: delete SMUM_FIELD_MASK drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD drm/amd/powerplay: delete SMUM_READ_FIELD drm/amd/powerplay: delete SMUM_SET_FIELD drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD drm/amd/powerplay: delete SMUM_WRITE_FIELD drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD drm/amd/powerplay: move macros to hwmgr.h drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h drm/amd/powerplay: add new helper functions in hwmgr.h drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair drm/amd/powerplay: refine powerplay code. drm/amd/powerplay: delete dead code in hwmgr.h drm/amd/powerplay: refine interface in struct pp_smumgr_func ...
2017-09-12drm/amdgpu: keep the MMU lock until the update ends v4Christian König1-4/+55
This is quite controversial because it adds another lock which is held during page table updates, but I don't see much other option. v2: allow multiple updates to be in flight at the same time v3: simplify the patch, take the read side only once v4: correctly fix rebase conflict Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-09-12drm/amdgpu: stop reserving the BO in the MMU callback v3Christian König1-9/+21
Instead take the callback lock during the final parts of CS. This should solve the last remaining locking order problems with BO reservations. v2: rebase, make dummy functions static inline v3: add one more missing inline and comments Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-09-12drm/amdgpu: use a rw_semaphore for MMU notifiersChristian König1-13/+13
Allow at least some parallel processing. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-09-12drm/amdgpu: move userptr BOs to CPU domain during CS v2Christian König1-4/+1
Instead of moving them in the MMU notifier move them during CS. v2: still mark pages as accessed/dirty Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> (v1) Signed-off-by: Alex Deucher <[email protected]>
2017-09-08lib/interval_tree: fast overlap detectionDavidlohr Bueso1-4/+4
Allow interval trees to quickly check for overlaps to avoid unnecesary tree lookups in interval_tree_iter_first(). As of this patch, all interval tree flavors will require using a 'rb_root_cached' such that we can have the leftmost node easily available. While most users will make use of this feature, those with special functions (in addition to the generic insert, delete, search calls) will avoid using the cached option as they can do funky things with insertions -- for example, vma_interval_tree_insert_after(). [[email protected]: fix deadlock from typo vm_lock_anon_vma()] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Jérôme Glisse <[email protected]> Acked-by: Christian König <[email protected]> Acked-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Doug Ledford <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Cc: David Airlie <[email protected]> Cc: Jason Wang <[email protected]> Cc: Christian Benvenuti <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-31drm/amdgpu: update to new mmu_notifier semanticJérôme Glisse1-31/+0
Calls to mmu_notifier_invalidate_page() were replaced by calls to mmu_notifier_invalidate_range() and are now bracketed by calls to mmu_notifier_invalidate_range_start()/end() Remove now useless invalidate_page callback. Signed-off-by: Jérôme Glisse <[email protected]> Reviewed-by: Christian König <[email protected]> Cc: [email protected] Cc: Felix Kuehling <[email protected]> Cc: Christian König <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Andrea Arcangeli <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-08-02drm/amdgpu: Use list_del_init in amdgpu_mn_unregisterFelix Kuehling1-1/+1
Otherwise bo->shadow_list (which is aliased by bo->mn_list) will not appear empty in amdgpu_ttm_bo_destroy and cause an oops when freeing former userptr BOs. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-04-04drm/amdgpu: use a 64bit interval tree for VM management v2Christian König1-0/+1
This only makes a difference for 32-bit systems. The idea is to have a fixed virtual address space size with 4-level page tables and to minimize differences between 32 and 64-bit systems. v2: Update commit message. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-10-25drm/amdgpu: remove adev pointer from struct amdgpu_bo v2Christian König1-2/+2
It's completely pointless to have two pointers to the device in the same structure. v2: rename function to amdgpu_ttm_adev, fix typos Signed-off-by: Christian König <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-05-23drm/amdgpu: make amdgpu_mn_get wait for mmap_sem killableMichal Hocko1-1/+4
amdgpu_mn_get which is called during ioct path relies on mmap_sem for write. If the waiting task gets killed by the oom killer it would block oom_reaper from asynchronous address space reclaim and reduce the chances of timely OOM resolving. Wait for the lock in the killable mode and return with EINTR if the task got killed while waiting. [[email protected]: use ERR_PTR() to return from amdgpu_mn_get] Signed-off-by: Michal Hocko <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Reviewed-by: Christian König <[email protected]> Cc: David Airlie <[email protected]> Cc: Alex Deucher <[email protected]> Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-03-21drm/amdgpu: add invalidate_page callback for userptrsChristian König1-26/+72
Otherwise we can run into problems with the writeback code. Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-03-21drm/amdgpu: Revert "remove the userptr rmn->lock"Christian König1-8/+14
This reverts commit c02196834456f2d5fad334088b70e98ce4967c34. In the meantime we moved get_user_pages() outside of the reservation lock, so that shouldn't be an issue any more Signed-off-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2016-02-17drm/amdgpu: Don't call interval_tree_remove in amdgpu_mn_destroyFelix Kuehling1-1/+0
rbtree_postorder_for_each_entry_safe can skip over some entries if the tree is rebalanced in interval_tree_remove. interval_tree_remove is also redundant when the tree is just about to be freed. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]>
2016-02-17drm/amdgpu: Fix race condition in amdgpu_mn_unregisterFelix Kuehling1-10/+13
Exchange locking order of adev->mn_lock and mm_sem, so that rmn->mm->mmap_sem can be taken safely, protected by adev->mn_lock, when amdgpu_mn_destroy runs concurrently. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]>
2016-02-16drm/amdgpu: Fix race condition in MMU notifier releaseFelix Kuehling1-1/+1
The release notifier can get called a second time from mmu_notifier_unregister depending on a race between __mmu_notifier_release and amdgpu_mn_destroy. Use mmu_notifier_unregister_no_release to avoid this. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Christian König <[email protected]>
2016-02-12drm/amdgpu: remove the userptr rmn->lockChristian König1-20/+12
Avoid a lock inversion problem by just using the mmap_sem to protect the entries of the intervall tree. Signed-off-by: Christian König <[email protected]> Reviewed-by: Felix Kuehling <[email protected]>
2016-02-10drm/amdgpu: fix issue with overlapping userptrsChristian König1-1/+2
Otherwise we could try to evict overlapping userptr BOs in get_user_pages(), leading to a possible circular locking dependency. Signed-off-by: Christian König <[email protected]> Acked-by: Alex Deucher <[email protected]>
2015-06-03drm/amdgpu: fix userptr lockupChristian König1-0/+3
Signed-off-by: Christian König <[email protected]> Reviewed-by: Jammy Zhou <[email protected]> Reviewed-by: Monk Liu <[email protected]>
2015-06-03drm/amdgpu: fix error check issue in amdgpu_mn_invalidate_range_startJack Xiao1-5/+5
Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Monk Liu <[email protected]> Reviewed-by: Christian König <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2015-06-03drm/amdgpu: add core driver (v4)Alex Deucher1-0/+319
This adds the non-asic specific core driver code. v2: remove extra kconfig option v3: implement minor fixes from Fengguang Wu v4: fix cast in amdgpu_ucode.c Acked-by: Christian König <[email protected]> Acked-by: Jammy Zhou <[email protected]> Signed-off-by: Alex Deucher <[email protected]>