Age | Commit message | Author | Files | Lines |
|
If a kfd_bo was shared (e.g. a dmabuf export), the original kfd_bo may be
freed while the amdgpu_bo still lives on. Free the kfd_bo struct in the
release_notify callback when the amdgpu_bo is freed.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-By: Ramesh Errabolu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Currently, all kfd BOs use the same destruction routine, but pinned
BOs are not unpinned properly. Separate them from the general routine.
v2 (Felix):
Add safeguard to prevent user space from freeing signal BO.
Kunmap signal BO in the event of setting event page error.
Just kunmap signal BO to avoid duplicating the code.
Signed-off-by: Lang Yu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
We can get the pdev and asic type from the adev. No need
to pass them explicitly.
v2: squash in build fix for !CONFIG_HSA_AMD from Anson
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In ras poison mode, page retirement will be handled by the irq handler of the
module which consumes corrupted data.
v2: rename ras_process_cb to ras_poison_consumption_handler.
move the handler's implementation from ASIC specific file to common
file.
v3: call gpu reset for xGMI connected mode.
Signed-off-by: Tao Zhou <[email protected]>
Reviewed-by: Hawking Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add amdgpu_amdkfd_resume_iommu for amdgpu.
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
Separate kfd_iommu_resume from kfd_resume for fine-tuning
of amdgpu device init/resume/reset/recovery sequence.
v2: squash in fix for !CONFIG_HSA_AMD
Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
|
This reverts commit 7ed9876c9793bfe96fed58ba645d6c8e32f26001.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-5.15-2021-07-29:
amdgpu:
- VCN/JPEG power down sequencing fixes
- Various navi pcie link handling fixes
- Clockgating fixes
- Yellow Carp fixes
- Beige Goby fixes
- Misc code cleanups
- S0ix fixes
- SMU i2c bus rework
- EEPROM handling rework
- PSP ucode handling cleanup
- SMU error handling rework
- AMD HDMI freesync fixes
- USB PD firmware update rework
- MMIO based vram access rework
- Misc display fixes
- Backlight fixes
- Add initial Cyan Skillfish support
- Overclocking fixes suspend/resume
amdkfd:
- Sysfs leak fix
- Add counters for vm faults and migration
- GPUVM TLB optimizations
radeon:
- Misc fixes
Signed-off-by: Dave Airlie <[email protected]>
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This reverts commit 7ed9876c9793bfe96fed58ba645d6c8e32f26001.
Revert reason: The issue has been resolved.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Update Arcturus/Aldebaran thermal throttle SMI event path to use
ASIC-independent throttler bits when logging.
Signed-off-by: Graham Sider <[email protected]>
Reviewed-by: Harish Kasiviswanathan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Similar to xGMI reporting the min/max bandwidth between direct peers, PCIe
will report the min/max bandwidth to the KFD.
Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Report the min/max bandwidth in megabytes to the kfd for direct
xgmi connections only. Indirect peers will report 0 since
indirect route is unknown.
Signed-off-by: Jonathan Kim <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 31f33243788dcbae8bd2819ed83923a73f7dfd30.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This reverts commit 31f33243788dcbae8bd2819ed83923a73f7dfd30.
Reason for revert: it causes regressions on several Asics.
Signed-off-by: Eric Huang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This optimizes memory mapping latency and also avoids
a page fault in the corner case of changing a valid PDE into
a PTE.
Signed-off-by: Eric Huang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Backmerging from drm/drm-next to get the patches for AMD devices
for v5.14.
Signed-off-by: Thomas Zimmermann <[email protected]>
|
|
Use it to call display code dependent on device->drv_data
before it's set to NULL on device unplug.
v5:
Move HW finalization into this callback to prevent MMIO accesses
post PCI remove.
v7:
Split kfd suspend from device exit to expedite HW related
stuff to amdgpu_pci_remove
v8:
Squash previous KFD commit into this commit to avoid compile break.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Use DMABufs with dynamic attachment to DMA-map GTT BOs on other GPUs.
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Oak Zeng <[email protected]>
Acked-by: Ramesh Errabolu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add BO-type specific helper functions to DMA-map and unmap
kfd_mem_attachments. Implement this functionality for userptrs by creating
one SG BO per GPU and filling it with a DMA mapping of the pages from the
original mem->bo.
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Oak Zeng <[email protected]>
Acked-by: Ramesh Errabolu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This name is more fitting, especially for the changes coming next to
support multi-GPU systems with proper DMA mappings. Cleaned up the code
and renamed some related functions and variables to improve readability.
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Oak Zeng <[email protected]>
Acked-by: Ramesh Errabolu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
A heavy-weight TLB flush is needed to make sure we have no more dirty data
in the cache for the unmapped pages.
Define enum TLB_FLUSH_TYPE, add flush_type parameter to
amdgpu_amdkfd_flush_gpu_tlb_pasid.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[why]
As part of the SVM functionality, the eviction mechanism used for
SVM_BOs is different. This mechanism uses one eviction fence per prange,
instead of one fence per kfd_process.
[how]
Add a svm_bo reference to amdgpu_amdkfd_fence to allow differentiating
between SVM_BO and regular BO evictions. This also includes modifications
to set the reference at the fence creation call.
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
HMM migration allocates sizeof(struct page) of system memory for each VRAM
page, which amounts to 1GB of system memory reserved for 64GB of VRAM. To
avoid application OOM, increase the system memory used size based on the
VRAM size of all GPUs, so that application memory allocation fails once
system memory usage reaches the limit.
Signed-off-by: Philip Yang <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
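The 1GB-per-64GB figure in the message above can be checked with a small
standalone sketch. The 4 KiB page size and 64-byte struct page size are
assumptions for illustration (typical of x86-64), and the function names
are hypothetical; this is not the kernel's accounting code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only. */
#define ASSUMED_PAGE_SIZE        4096ULL /* bytes per VRAM page */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL   /* bytes per struct page */

/* System memory consumed by struct page entries for one GPU's VRAM. */
static uint64_t struct_page_overhead(uint64_t vram_bytes)
{
	/* This sketch only handles page-aligned VRAM sizes. */
	assert(vram_bytes % ASSUMED_PAGE_SIZE == 0);
	return (vram_bytes / ASSUMED_PAGE_SIZE) * ASSUMED_STRUCT_PAGE_SIZE;
}

/* Sum the overhead over all GPUs, as the message describes. */
static uint64_t total_struct_page_overhead(const uint64_t *vram_bytes, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += struct_page_overhead(vram_bytes[i]);
	return total;
}
```

Under these assumptions, 64 GiB of VRAM is 16Mi pages, and 16Mi entries of
64 bytes each come to 1 GiB.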
|
|
Use amdgpu_vm_bo_update_mapping to update GPU page table to map or unmap
svm range system memory pages address to GPUs.
Signed-off-by: Philip Yang <[email protected]>
Signed-off-by: Alex Sierra <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
DRM render node file handles are used for CPU mapping of BOs using mmap
by the Thunk. It uses the DRM render node of the GPU where the BO was
allocated.
DRM allows mmap access automatically when it creates a GEM handle for a
BO. KFD BOs don't have GEM handles, so KFD needs to manage access
manually. Use drm_vma_node_allow to allow user mode to mmap BOs allocated
with kfd_ioctl_alloc_memory_of_gpu through the DRM render node that was
used in the kfd_ioctl_acquire_vm call for the same GPU.
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu needs the drm_priv to allow mmap
to access the BO through the corresponding file descriptor. The VM can
also be extracted from drm_priv, so drm_priv can replace the vm parameter
in the kfd2kgd interface.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
ROCm user mode has acquired VMs from DRM file descriptors for as long
as it supported the upstream KFD. Legacy code to support older versions
of ROCm is not needed any more.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Philip Yang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The amdgpu driver may be in reset state during init, in which case the KFD
is not initialized. The driver needs to initialize the KFD after reset by
checking the flag.
Signed-off-by: shaoyunl <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Move all the dummy functions in amdgpu_amdkfd.c to
amdgpu_amdkfd.h as inline functions.
Signed-off-by: Lang Yu <[email protected]>
Suggested-by: Felix Kuehling <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Huang Rui <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Pull drm updates from Dave Airlie:
"Not a major amount of change, the i915 trees got split into display
and gt trees to better facilitate higher level review, and there's a
major refactoring of i915 GEM locking to use more core kernel concepts
(like ww-mutexes). msm gets per-process pagetables, older AMD SI cards
get DC support, nouveau got a bump in displayport support with common
code extraction from i915.
Outside of drm this contains a couple of patches for hexint
moduleparams which you've acked, and a virtio common code tree that
you should also get via its regular path.
New driver:
- Cadence MHDP8546 DisplayPort bridge driver
core:
- cross-driver scatterlist cleanups
- devm_drm conversions
- remove drm_dev_init
- devm_drm_dev_alloc conversion
ttm:
- lots of refactoring and cleanups
bridges:
- chained bridge support in more drivers
panel:
- misc new panels
scheduler:
- cleanup priority levels
displayport:
- refactor i915 code into helpers for nouveau
i915:
- split into display and GT trees
- WW locking refactoring in GEM
- execbuf2 extension mechanism
- syncobj timeline support
- GEN 12 HOBL display powersaving
- Rocket Lake display additions
- Disable FBC on Tigerlake
- Tigerlake Type-C + DP improvements
- Hotplug interrupt refactoring
amdgpu:
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support for DC
- Plane rotation enabled
- TMZ state info ioctl
- PCIe DPC recovery support
- DC interrupt handling refactor
- OLED panel fixes
amdkfd:
- add SMI events for thermal throttling
- SMI interface events ioctl update
- process eviction counters
radeon:
- move to dma_ for allocations
- expose sclk via sysfs
msm:
- DSI support for sm8150/sm8250
- per-process GPU pagetable support
- Displayport support
mediatek:
- move HDMI phy driver to PHY
- convert mtk-dpi to bridge API
- disable mt2701 tmds
tegra:
- bridge support
exynos:
- misc cleanups
vc4:
- dual display cleanups
ast:
- cleanups
gma500:
- conversion to GPIOd API
hisilicon:
- misc reworks
ingenic:
- clock handling and format improvements
mcde:
- DSI support
mgag200:
- desktop g200 support
mxsfb:
- i.MX7 + i.MX8M
- alpha plane support
panfrost:
- devfreq support
- amlogic SoC support
ps8640:
- EDID from eDP retrieval
tidss:
- AM65xx YUV workaround
virtio:
- virtio-gpu exported resources
rcar-du:
- R8A7742, R8A774E1 and R8A77961 support
- YUV planar format fixes
- non-visible plane handling
- VSP device reference count fix
- Kconfig fix to avoid displaying disabled options in .config"
* tag 'drm-next-2020-10-15' of git://anongit.freedesktop.org/drm/drm: (1494 commits)
drm/ingenic: Fix bad revert
drm/amdgpu: Fix invalid number of character '{' in amdgpu_acpi_init
drm/amdgpu: Remove warning for virtual_display
drm/amdgpu: kfd_initialized can be static
drm/amd/pm: setup APU dpm clock table in SMU HW initialization
drm/amdgpu: prevent spurious warning
drm/amdgpu/swsmu: fix ARC build errors
drm/amd/display: Fix OPTC_DATA_FORMAT programming
drm/amd/display: Don't allow pstate if no support in blank
drm/panfrost: increase readl_relaxed_poll_timeout values
MAINTAINERS: Update entry for st7703 driver after the rename
Revert "gpu/drm: ingenic: Add option to mmap GEM buffers cached"
drm/amd/display: HDMI remote sink need mode validation for Linux
drm/amd/display: Change to correct unit on audio rate
drm/amd/display: Avoid set zero in the requested clk
drm/amdgpu: align frag_end to covered address space
drm/amdgpu: fix NULL pointer dereference for Renoir
drm/vmwgfx: fix regression in thp code due to ttm init refactor.
drm/amdgpu/swsmu: add interrupt work handler for smu11 parts
drm/amdgpu/swsmu: add interrupt work function
...
|
|
This will allow us to have different defaults per asic
in a future patch.
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Luben Tuikov <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
PASID is defined as a few different types in iommu including "int",
"u32", and "unsigned int". To be consistent and to match with uapi
definitions, define PASID and its variations (e.g. max PASID) as "u32".
"u32" is also shorter and a little more explicit than "unsigned int".
No PASID type change in uapi although it defines PASID as __u64 in
some places.
Suggested-by: Thomas Gleixner <[email protected]>
Signed-off-by: Fenghua Yu <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Acked-by: Joerg Roedel <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
|
|
Add support for reporting thermal throttling events through SMI.
Also, add a counter to count the number of throttling interrupts
observed and report the count in the SMI event message.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Use the proper API instead.
Fixes: 70539bd795002 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Jens Axboe <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Switch the function documentation to kerneldoc comments, and add
WARN_ON_ONCE asserts that the calling thread is a kernel thread and does
not have ->mm set (or has ->mm set in the case of unuse_mm).
Also give the functions a kthread_ prefix to better document the use case.
[[email protected]: fix a comment typo, cover the newly merged use_mm/unuse_mm caller in vfio]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: powerpc/vas: fix up for {un}use_mm() rename]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Stephen Rothwell <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Jens Axboe <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Acked-by: Greg Kroah-Hartman <[email protected]> [usb]
Acked-by: Haren Myneni <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "improve use_mm / unuse_mm", v2.
This series improves the use_mm / unuse_mm interface by better documenting
the assumptions, and by taking the set_fs manipulations spread over the
callers into the core API.
This patch (of 3):
Use the proper API instead.
Link: http://lkml.kernel.org/r/[email protected]
These helpers are only for use with kernel threads, and I will tie them
more into the kthread infrastructure going forward. Also move the
prototypes to kthread.h - mmu_context.h was a little weird to start with
as it otherwise contains very low-level MM bits.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Jens Axboe <[email protected]>
Reviewed-by: Jens Axboe <[email protected]>
Acked-by: Felix Kuehling <[email protected]>
Cc: Alex Deucher <[email protected]>
Cc: Al Viro <[email protected]>
Cc: Felipe Balbi <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Zhenyu Wang <[email protected]>
Cc: Zhi Wang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Convert comments that reference mmap_sem to reference mmap_lock instead.
[[email protected]: fix up linux-next leftovers]
[[email protected]: s/lockaphore/lock/, per Vlastimil]
[[email protected]: more linux-next fixups, per Michel]
Signed-off-by: Michel Lespinasse <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Vlastimil Babka <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Jerome Glisse <[email protected]>
Cc: John Hubbard <[email protected]>
Cc: Laurent Dufour <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ying Han <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The queue mask used for set_resources always assumes the queue number
per pipe is 8, so KFD needs to align with that by using function
amdgpu_queue_mask_bit_to_set_resource_bit().
Signed-off-by: Yong Zhao <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
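The remapping described above can be sketched as follows. This is a
standalone illustration, not the driver's function: the parameter names and
the assumption of 4 pipes per MEC in the set_resources layout are
hypothetical; only the 8 queues per pipe comes from the message.

```c
#include <assert.h>

/* Layout assumed by set_resources in this sketch. */
#define SET_RES_PIPES_PER_MEC   4 /* assumption for illustration */
#define SET_RES_QUEUES_PER_PIPE 8 /* per the commit message */

/*
 * Translate a bit index in the driver's own queue mask (packed with the
 * hardware's real queues-per-pipe count) into the bit index expected by
 * set_resources, which always reserves 8 bits per pipe.
 */
static int queue_bit_to_set_resource_bit(int queue_bit,
					 int hw_pipes_per_mec,
					 int hw_queues_per_pipe)
{
	assert(queue_bit >= 0);
	assert(hw_pipes_per_mec > 0 &&
	       hw_pipes_per_mec <= SET_RES_PIPES_PER_MEC);
	assert(hw_queues_per_pipe > 0 &&
	       hw_queues_per_pipe <= SET_RES_QUEUES_PER_PIPE);

	int queue = queue_bit % hw_queues_per_pipe;
	int pipe = (queue_bit / hw_queues_per_pipe) % hw_pipes_per_mec;
	int mec = queue_bit / hw_queues_per_pipe / hw_pipes_per_mec;

	return mec * SET_RES_PIPES_PER_MEC * SET_RES_QUEUES_PER_PIPE +
	       pipe * SET_RES_QUEUES_PER_PIPE + queue;
}
```

When the hardware really has 8 queues per pipe and 4 pipes per MEC the
mapping is the identity; with fewer queues per pipe, the unused bits in each
pipe's group of 8 stay clear.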
|
|
Track GPU VRAM usage on a per process basis and report it through
sysfs.
Signed-off-by: Mukul Joshi <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In order to surface the ASIC revision to user level, we want
to put it into the HSA topology. This is because different
ASIC revisions may require user-level software to do different
things (e.g. patch code for things that are changed in later
hardware revisions).
The ASIC revision from the hardware is a maximum of 4 bits at this
time, so put it into 4 of the open bits in the HSA capability.
Then user-level software can use this capability information to
know -- for each ASIC -- what revision-based things must be done.
Signed-off-by: Joseph Greathouse <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
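Packing a 4-bit revision into spare bits of a capability word, as described
above, could look roughly like this. The shift value, macro names, and
function names are assumptions made for this sketch; the message does not
say which bits are used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit position for the 4-bit revision field. */
#define EXAMPLE_ASIC_REV_SHIFT 22u
#define EXAMPLE_ASIC_REV_MASK  (0xFu << EXAMPLE_ASIC_REV_SHIFT)

/* Store a 4-bit ASIC revision without disturbing other capability bits. */
static uint32_t cap_set_asic_revision(uint32_t capability, uint32_t rev)
{
	assert(rev <= 0xFu); /* the field is only 4 bits wide */
	capability &= ~EXAMPLE_ASIC_REV_MASK;
	return capability | (rev << EXAMPLE_ASIC_REV_SHIFT);
}

/* What user-level software would do to read the revision back. */
static uint32_t cap_get_asic_revision(uint32_t capability)
{
	return (capability & EXAMPLE_ASIC_REV_MASK) >> EXAMPLE_ASIC_REV_SHIFT;
}
```

User-level software can then branch on the value returned by the getter to
apply revision-specific handling.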
|
|
Given we can query all the asic specific information from amdgpu_gfx_config,
we can make get_tile_config() generic.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Devices from Arcturus onwards will have their UUID exposed to Thunk.
Add the necessary functions to the kernel to propagate the UUID.
Signed-off-by: Divya Shikre <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
No need to trigger eviction as the memory mapping will not be used
anymore.
All pt/pd BOs share the same resv, hence the same shared eviction fence.
Every time a page table is freed, the fence will be signaled and that causes
unexpected kfd evictions.
v2: squash in 32 bit fix
CC: Christian König <[email protected]>
CC: Felix Kuehling <[email protected]>
CC: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: xinhui pan <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
So far the kfd driver implemented the same routines for runtime and
system-wide suspend and resume (s2idle or mem). During system-wide suspend
the kfd acquires an atomic lock that prevents any more user processes from
creating queues and interacting with the kfd driver and amd gpu. This
mechanism created a problem when the amdgpu device is runtime suspended with
BACO enabled. Any application that relies on the kfd driver fails to load
because the driver reports a locked kfd device since the gpu is runtime
suspended.
However, in an ideal case, when gpu is runtime suspended the kfd driver
should be able to:
- auto resume amdgpu driver whenever a client requests compute service
- prevent runtime suspend for amdgpu while kfd is in use
This change refactors the amdgpu and amdkfd drivers to support BACO and
runtime power management.
Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Rajneesh Bhardwaj <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
[Why]
The TLB flush method using the kfd2kgd interface has been deprecated.
This implementation is now in the amdgpu_amdkfd API.
[How]
TLB flush functions are now implemented in amdgpu_amdkfd.
Signed-off-by: Alex Sierra <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This is the same idea as the kfd device info probe and moves all the
probe control together for easier maintenance.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
kfd needs drm_device to call into drm_cgroup functions
Signed-off-by: Harish Kasiviswanathan <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
This optimizes out the pci device id usage in KFD and makes the code
more maintainable.
Signed-off-by: Yong Zhao <[email protected]>
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
These wptrs must be pinned and GPU accessible when this is called
from hqd_load functions. So they should never fault. This resolves
a circular lock dependency issue involving four locks including the
DQM lock and mmap_sem.
Signed-off-by: Felix Kuehling <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The same BO can be mapped with different PTE flags by different GPUs.
Therefore determine the PTE flags separately for each mapping instead
of storing them in the KFD buffer object.
Add a helper function to determine the PTE flags to be extended with
ASIC and memory-type-specific logic in subsequent commits.
v2: Split Arcturus-specific MTYPE changes into separate commit
v3: Fix return type of get_pte_flags to uint64_t
Signed-off-by: Felix Kuehling <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Shaoyun Liu <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|