Age | Commit message (Collapse) | Author | Files | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull hmm updates from Jason Gunthorpe:
"This is another round of bug fixing and cleanup. This time the focus
is on the driver pattern to use mmu notifiers to monitor a VA range.
This code is lifted out of many drivers and hmm_mirror directly into
the mmu_notifier core and written using the best ideas from all the
driver implementations.
This removes many bugs from the drivers and has a very pleasing
diffstat. More drivers can still be converted, but that is for another
cycle.
- A shared branch with RDMA reworking the RDMA ODP implementation
- New mmu_interval_notifier API. This is focused on the use case of
monitoring a VA and simplifies the process for drivers
- A common seq-count locking scheme built into the
mmu_interval_notifier API usable by drivers that call
get_user_pages() or hmm_range_fault() with the VA range
- Conversion of mlx5 ODP, hfi1, radeon, nouveau, AMD GPU, and Xen
GntDev drivers to the new API. This deletes a lot of wonky driver
code.
- Two improvements for hmm_range_fault(), from testing done by Ralph"
* tag 'for-linus-hmm' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
mm/hmm: remove hmm_range_dma_map and hmm_range_dma_unmap
mm/hmm: make full use of walk_page_range()
xen/gntdev: use mmu_interval_notifier_insert
mm/hmm: remove hmm_mirror and related
drm/amdgpu: Use mmu_interval_notifier instead of hmm_mirror
drm/amdgpu: Use mmu_interval_insert instead of hmm_mirror
drm/amdgpu: Call find_vma under mmap_sem
nouveau: use mmu_interval_notifier instead of hmm_mirror
nouveau: use mmu_notifier directly for invalidate_range_start
drm/radeon: use mmu_interval_notifier_insert
RDMA/hfi1: Use mmu_interval_notifier_insert for user_exp_rcv
RDMA/odp: Use mmu_interval_notifier_insert()
mm/hmm: define the pre-processor related parts of hmm.h even if disabled
mm/hmm: allow hmm_range to be used with a mmu_interval_notifier or hmm_mirror
mm/mmu_notifier: add an interval tree notifier
mm/mmu_notifier: define the header pre-processor parts even if disabled
mm/hmm: allow snapshot of the special zero page
|
|
Remove the interval tree in the driver and rely on the tree maintained by
the mmu_notifier for delivering mmu_notifier invalidation callbacks.
For some reason amdgpu has a very complicated arrangement where it tries
to prevent duplicate entries in the interval_tree, this is not necessary,
each amdgpu_bo can be its own stand alone entry. interval_tree already
allows duplicates and overlaps in the tree.
Also, there is no need to remove entries upon a release callback, the
mmu_interval API safely allows objects to remain registered beyond the
lifetime of the mm. The driver only has to stop touching the pages during
release.
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Philip Yang <[email protected]>
Tested-by: Philip Yang <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
|
|
The placement is something TTM/BO internal and the RAS code should
avoid touching that directly.
Add a helper to create a BO at a fixed location and use that instead.
v2: squash in fixes (Alex)
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Guchun Chen <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
While handling a page fault we can't wait for other ongoing GPU
operations or we can potentially run into deadlocks.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
drm-misc-next for 5.4:
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- dma-buf: add reservation_object_fences helper, relax
reservation_object_add_shared_fence, remove
reservation_object seq number (and then
restored)
- dma-fence: Shrinkage of the dma_fence structure,
Merge dma_fence_signal and dma_fence_signal_locked,
Store the timestamp in struct dma_fence in a union with
cb_list
Driver Changes:
- More dt-bindings YAML conversions
- More removal of drmP.h includes
- dw-hdmi: Support get_eld and various i2s improvements
- gm12u320: Few fixes
- meson: Global cleanup
- panfrost: Few refactors, Support for GPU heap allocations
- sun4i: Support for DDC enable GPIO
- New panels: TI nspire, NEC NL8048HL11, LG Philips LB035Q02,
Sharp LS037V7DW01, Sony ACX565AKM, Toppoly TD028TTEC1
Toppoly TD043MTEA1
Signed-off-by: Dave Airlie <[email protected]>
[airlied: fixup dma_resv rename fallout]
From: Maxime Ripard <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/20190819141923.7l2adietcr2pioct@flea
|
|
Be more consistent with the naming of the other DMA-buf objects.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/323401/
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.4-2019-08-09:
Same as drm-next-5.4-2019-08-06, but with the
readq/writeq stuff fixed and 5.3-rc3 backmerged.
amdgpu:
- Add navi14 support
- Add navi12 support
- Add Arcturus support
- Enable mclk DPM for Navi
- Misc DC display fixes
- Add perfmon support for DF
- Add scatter/gather display support for Raven
- Improve SMU handling for GPU reset
- RAS support for GFX
- Drop last of drmP.h
- Add support for wiping memory on buffer release
- Allow cursor async updates for fb swaps
- Misc fixes and cleanups
amdkfd:
- Add navi14 support
- Add navi12 support
- Add Arcturus support
- CWSR trap handlers updates for gfx9, 10
- Drop last of drmP.h
- Update MAINTAINERS
radeon:
- Misc fixes and cleanups
- Make kexec more reliable by tearing down the GPU
ttm:
- Add release_notify callback
uapi:
- Add wipe memory on release flag for buffer creation
Signed-off-by: Dave Airlie <[email protected]>
[airlied: resolved conflicts with ttm resv moving]
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Drop vma_node from ttm_buffer_object, use the gem struct
(base.vma_node) instead.
Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Drop drm_gem_object from amdgpu_bo, use the
ttm_buffer_object.base instead.
Build tested only.
Signed-off-by: Gerd Hoffmann <[email protected]>
Reviewed-by: Christian König <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Wipe VRAM memory containing sensitive data when moving or releasing
BOs. Clearing the memory is pipelined to minimize any impact on
subsequent memory allocation latency. Use of a poison value should
help debug future use-after-free bugs.
When moving BOs, the existing ttm_bo_pipelined_move ensures that the
memory won't be reused before being wiped.
When releasing BOs, the BO is fenced with the memory fill operation,
which results in queuing the BO for a delayed delete.
v2: Move amdgpu_amdkfd_unreserve_memory_limit into
amdgpu_bo_release_notify so that KFD can use memory that's still
being cleared in the background
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move the logic to clear AMDGPU_GEM_CREATE_CPU_GTT_USWC in
amdgpu_bo_do_create into standalone helper so it can be reused
in other functions.
v4:
Switch to return bool.
v5: Fix typos.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Michel Dänzer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This avoids OOM situations when we have lots of threads
submitting at the same time.
v3: apply this to the whole driver, not just CS
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Driver vote low to high pstate switch whenever there is an outstanding
XGMI mapping request. Driver vote high to low pstate when all the
outstanding XGMI mapping is terminated.
Signed-off-by: shaoyunl <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 9b638f9751308ae3ae8f28e0c6e9decffd97f5f9.
Adding this to the mapping is complete nonsense and the whole
implementation looks racy. This patch wasn't thoughtfully reviewed
and should be reverted for now.
Signed-off-by: Christian König <[email protected]>
Acked-by: Liu, Shaoyun <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Driver vote low to high pstate switch whenever there is an outstanding
XGMI mapping request. Driver vote high to low pstate when all the
outstanding XGMI mapping is terminated.
Signed-off-by: shaoyunl <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Creates a temporary sync object to wait for the BO reservation. This
generalizes amdgpu_vm_wait_pd.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It is unused.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Don't grab the reservation lock any more and simplify the handling quite
a bit.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Instead of the double linked list. Gets the size of amdgpu_vm_pt down to
64 bytes again.
We could even reduce it down to 32 bytes, but that would require some
rather extreme hacks.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Not used any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Make struct amdgpu_bo a bit smaller.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Just rename functions, no functional change.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
It could be got by amdgpu_bo_gpu_offset() if need
Signed-off-by: Junwei Zhang <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move logic of getting supported domain to a helper
function
Signed-off-by: Deepak Sharma <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
When GEM needs to fallback to GTT for VRAM BOs we still want the
preferred domain to be untouched so that the BO has a cance to move back
to VRAM in the future.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
After that, we can easily add new parameter when need.
v2:
a) rebase.
b) Initialize struct amdgpu_bo_param, future new
member could only be used in some one case, but all member
should have its own initial value.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Huang Rui <[email protected]> (v1)
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Junwei Zhang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_bo_create has too many parameters, and used in
too many places. Collect them to one structure.
Signed-off-by: Chunming Zhou <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Junwei Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reserved VRAM is used to avoid overriding pre OS FB.
Once our display stack takes over we don't need the reserved
VRAM anymore.
v2:
Remove comment, we know actually why we need to reserve the stolen VRAM.
Fix return type for amdgpu_ttm_late_init.
v3:
Return 0 in amdgpu_bo_late_init, rebase on changes to previous patch
v4: rebase
v5:
For GMC9 reserve always just 9M and keep the stolem memory around
until GART table curruption on S3 resume is resolved.
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Andrey Grodzovsky <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The detection if a BO was placed in CPU visible VRAM was incorrect.
Fix it and merge it with the correct detection in amdgpu_ttm.c
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Drop the "kernel" and sg parameter and give the BO type to create
explicit to amdgpu_bo_create instead of figuring it out from the
parameters.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
v2:
* Removed unused flags from struct kgd_mem
* Updated some comments
* Added a check to unmap_memory_from_gpu whether BO was mapped
v3: add mutex_destroy in relevant places
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Oded Gabbay <[email protected]>
Signed-off-by: Oded Gabbay <[email protected]>
|
|
should use bo_create_kernel instead of split to two
function that create and pin the SA bo
issue:
before this patch, there are DMAR read error in host
side when running SRIOV test, the DMAR address dropped
in the range of SA bo.
fix:
after this cleanups of SA init and fini, above DMAR
eror gone.
v2:
keep sa_bo's fini instead of suspend, to keep
reporting error
Signed-off-by: Monk Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 2046d46db9166bddc84778f0b3477f6d1e9068ea.
Not needed any more.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Rename amdgpu_gtt_mgr_is_allocated() to amdgpu_gtt_mgr_has_gart_addr() and use
that instead.
v2: rename the function as well.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Introduce a flag to signal that access to a BO will be synchronized
through an external mechanism.
Currently all buffers shared between contexts are subject to implicit
synchronization. However, this is only required for protocols that
currently don't support an explicit synchronization mechanism (DRI2/3).
This patch introduces the AMDGPU_GEM_CREATE_EXPLICIT_SYNC, so that
users can specify when it is safe to disable implicit sync.
v2: only disable explicit sync in amdgpu_cs_ioctl
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Andres Rodriguez <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We adjusted the BO flags for USWC handling, but those never took effect
because the placement was passed in instead of generated inside this
function.
v2: better commit message
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use the VM instead of the BO list to find the BO for a virtual address.
This fixes UVD/VCE in physical mode with VM local BOs.
Signed-off-by: Christian König <[email protected]>
Acked-by: Leo Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Except for the reference count all other members are protected
by the VM PD being reserved.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We changed this to use an extra list a while back, but for the next
series I need a separate flag again.
v2: reorder to avoid unlocked list access
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Split that into vm_bo_base and bo_va to allow other uses as well.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Change "prefered" to "preferred"
Signed-off-by: Kent Russell <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The parameter init_value contains the value to which we initialized
VRAM bo when AMDGPU_GEM_CREATE_VRAM_CLEARED flag is set.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Same as amdgpu_bo_create_kernel, but keeps the BO reserved.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Save some memory because only one of those is used at all times.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move amdgpu_bo and related structures into amdgpu_object.h.
Move amdgpu_bo_list structures to the amdgpu_bo_list functions.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Don't keep around the same pointer twice.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The test was relaxed a bit to much.
Signed-off-by: Christian König <[email protected]>
Acked-by: Tom St Denis <[email protected]>
Reviewed-and-Tested-by: Roger He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Roger.He <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Signed-off-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Chunming Zhou <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|