aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_events.c
AgeCommit message (Collapse)AuthorFilesLines
2024-08-06drm/amdkfd: support per-queue reset on gfx9Jonathan Kim1-0/+22
Support per-queue reset for GFX9. The recommendation is for the driver to target reset the HW queue via a SPI MMIO register write. Since this requires pipe and HW queue info and MEC FW is limited to doorbell reports of hung queues after an unmap failure, scan the HW queue slots defined by SET_RESOURCES first to identify the user queue candidates to reset. Only signal reset events to processes that have had a queue reset. If queue reset fails, fall back to GPU reset. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-01-22drm/amdgpu: Add log info for umc_v12_0YiPeng Chai1-1/+5
Add log info for umc_v12_0. v2: Delete redundant logs. Signed-off-by: YiPeng Chai <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-11-29drm/amdkfd: Copy HW exception data to user eventDavid Yat Sin1-0/+4
Fixes issue where user events of type KFD_EVENT_TYPE_HW_EXCEPTION do not have valid data Signed-off-by: David Yat Sin <[email protected]> Reviewed-by: Lijo Lazar <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-08-11drm/amdkfd: drop IOMMUv2 supportAlex Deucher1-82/+0
Now that we use the dGPU path for all APUs, drop the IOMMUv2 support. v2: drop the now unused queue manager functions for gfx7/8 APUs Reviewed-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Tested-by: Mike Lothian <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdkfd: update user space last_event_ageJames Zhu1-8/+15
Update user space last_event_age when event age is enabled. It is only for KFD_EVENT_TYPE_SIGNAL which is checked by user space. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdkfd: set activated flag true when event age unmatchsJames Zhu1-4/+13
Set waiter's activated flag true when event age unmatchs with last_event_age. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-15drm/amdkfd: add event_age tracking when receiving interruptJames Zhu1-0/+6
Add event_age tracking when receiving interrupt. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdkfd: fix vmfault signalling with additional data.Jonathan Kim1-14/+20
Exception handling for vmfaults should be raised with additional data. Reported-by: Mukul Joshi <[email protected]> Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Mukul Joshi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdkfd: add send exception operationJonathan Kim1-1/+2
Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address. For memory violations, normal vmfault signals will be applied to notify runtime instead after passing in the saved exception data when a memory violation was raised to the debugger. For runtime exceptions, this will unblock the runtime enable function which will be explained and implemented in a follow up patch. Signed-off-by: Jonathan Kim <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-06-09drm/amdkfd: Introduce kfd_node struct (v5)Mukul Joshi1-6/+6
Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <[email protected]> Tested-by: Amber Lin <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Hawking Zhang <[email protected]> Signed-off-by: Morris Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-03-02Merge tag 'drm-next-2023-03-03-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds1-6/+3
Pull drm fixes from Dave Airlie: "fbdev: - fix uninit var in error path shmem: - revert unGPLing an export i915: - Don't use stolen memory or BAR mappings for ring buffers with LLC - Add inverted backlight quirk for HP 14-r206nv - Fix GSI offset for MCR lookups - GVT fixes (memleak, debugfs attributes, kconfig, typos) amdgpu: - SMU 13 fixes - Enable TMZ for GC 10.3.6 - Misc display fixes - Buddy allocator fixes - GC 11 fixes - S0ix fix - INFO IOCTL queries for GC 11 - VCN harvest fixes for SR-IOV - UMC 8.10 RAS fixes - Don't restrict bpc to 8 - NBIO 7.5 fix - Allow freesync on PCon for more devices amdkfd: - SDMA fix - Illegal memory access fix" * tag 'drm-next-2023-03-03-1' of git://anongit.freedesktop.org/drm/drm: (45 commits) drm/amdgpu/vcn: fix compilation issue with legacy gcc drm/amd/display: Extend Freesync over PCon support for more devices Revert "drm/amd/display: Do not set DRR on pipe commit" drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes drm/amd/display: Ext displays with dock can't recognized after resume drm/amdgpu: fix ttm_bo calltrace warning in psp_hw_fini drm/amdgpu: remove unused variable ring drm/amd/display: fix dm irq error message in gpu recover drm/amd: Fix initialization for nbio 7.5.1 drm/amd/display: Don't restrict bpc to 8 bpc drm/amdgpu: Make umc_v8_10_convert_error_address static and remove unused variable drm/radeon: Fix eDP for single-display iMac11,2 drm/shmem-helper: Revert accidental non-GPL export drm: omapdrm: Do not use helper unininitialized in omap_fbdev_init() drm/amd/pm: downgrade log level upon SMU IF version mismatch drm/amdgpu: Add ecc info query interface for umc v8_10 drm/amdgpu: Add convert_error_address function for umc v8_10 drm/amdgpu: add bad_page_threshold check in ras_eeprom_check_err drm/amdgpu: change default behavior of bad_page_threshold parameter drm/amdgpu: exclude duplicate pages from UMC RAS UE count ...
2023-02-23drm/amdkfd: Fix an illegal memory accessQu Huang1-6/+3
In the kfd_wait_on_events() function, the kfd_event_waiter structure is allocated by alloc_event_waiters(), but the event field of the waiter structure is not initialized; When copy_from_user() fails in the kfd_wait_on_events() function, it will enter exception handling to release the previously allocated memory of the waiter structure; Due to the event field of the waiters structure being accessed in the free_waiters() function, this results in illegal memory access and system crash, here is the crash log: localhost kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x185/0x1e0 localhost kernel: RSP: 0018:ffffaa53c362bd60 EFLAGS: 00010082 localhost kernel: RAX: ff3d3d6bff4007cb RBX: 0000000000000282 RCX: 00000000002c0000 localhost kernel: RDX: ffff9e855eeacb80 RSI: 000000000000279c RDI: ffffe7088f6a21d0 localhost kernel: RBP: ffffe7088f6a21d0 R08: 00000000002c0000 R09: ffffaa53c362be64 localhost kernel: R10: ffffaa53c362bbd8 R11: 0000000000000001 R12: 0000000000000002 localhost kernel: R13: ffff9e7ead15d600 R14: 0000000000000000 R15: ffff9e7ead15d698 localhost kernel: FS: 0000152a3d111700(0000) GS:ffff9e855ee80000(0000) knlGS:0000000000000000 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 localhost kernel: CR2: 0000152938000010 CR3: 000000044d7a4000 CR4: 00000000003506e0 localhost kernel: Call Trace: localhost kernel: _raw_spin_lock_irqsave+0x30/0x40 localhost kernel: remove_wait_queue+0x12/0x50 localhost kernel: kfd_wait_on_events+0x1b6/0x490 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: kfd_ioctl+0x38c/0x4a0 [hydcu] localhost kernel: ? kfd_ioctl_set_trap_handler+0x70/0x70 [hydcu] localhost kernel: ? kfd_ioctl_create_queue+0x5a0/0x5a0 [hydcu] localhost kernel: ? ftrace_graph_caller+0xa0/0xa0 localhost kernel: __x64_sys_ioctl+0x8e/0xd0 localhost kernel: ? syscall_trace_enter.isra.18+0x143/0x1b0 localhost kernel: do_syscall_64+0x33/0x80 localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9 localhost kernel: RIP: 0033:0x152a4dff68d7 Allocate the structure with kcalloc, and remove redundant 0-initialization and a redundant loop condition check. Signed-off-by: Qu Huang <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2023-02-09mm: replace vma->vm_flags direct modifications with modifier callsSuren Baghdasaryan1-2/+2
Replace direct modifications to vma->vm_flags with calls to modifier functions to be able to track flag changes and to keep vma locking correctness. [[email protected]: fix drivers/misc/open-dice.c, per Hyeonggon Yoo] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Suren Baghdasaryan <[email protected]> Acked-by: Michal Hocko <[email protected]> Acked-by: Mel Gorman <[email protected]> Acked-by: Mike Rapoport (IBM) <[email protected]> Acked-by: Sebastian Reichel <[email protected]> Reviewed-by: Liam R. Howlett <[email protected]> Reviewed-by: Hyeonggon Yoo <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Arjun Roy <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Howells <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David Rientjes <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jann Horn <[email protected]> Cc: Joel Fernandes <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Kent Overstreet <[email protected]> Cc: Laurent Dufour <[email protected]> Cc: Lorenzo Stoakes <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Oskolkov <[email protected]> Cc: Peter Xu <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Punit Agrawal <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Shakeel Butt <[email protected]> Cc: Soheil Hassas Yeganeh <[email protected]> Cc: Song Liu <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Will Deacon <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-11-09drm/amdkfd: Fix error handling in kfd_criu_restore_eventsFelix Kuehling1-2/+1
mutex_unlock before the exit label because all the error code paths that jump there didn't take that lock. This fixes unbalanced locking errors in case of restore errors. Fixes: 40e8a766a761 ("drm/amdkfd: CRIU checkpoint and restore events") Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected]
2022-08-10drm/amdkfd: Handle restart of kfd_ioctl_wait_eventsFelix Kuehling1-12/+12
When kfd_ioctl_wait_events needs to restart due to a signal, we need to update the timeout to account for the time already elapsed. We also need to undo auto_reset of events that have signaled already, so that the restarted ioctl will be able to count those signals again. This fixes infinite hangs when kfd_ioctl_wait_events is interrupted by a signal. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-and-tested-by: Xiaogang Chen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-06-08drm/amdkfd: Document and fix GTT BO kmap APIFelix Kuehling1-3/+2
Removed an unused parameter from two functions and added kernel-doc comments. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-25drm/amdkfd: Ignore bogus signals from MEC efficientlyFelix Kuehling1-4/+18
MEC firmware sometimes sends signal interrupts without a valid context ID on end of pipe events that don't intend to signal any HSA signals. This triggers the slow path in kfd_signal_event_interrupt that scans the entire event page for signaled events. Detect these signals in the top half interrupt handler to stop processing them as early as possible. Because we now always treat event ID 0 as invalid, reserve that ID during process initialization. v2: Update firmware version checks to support more GPUs Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-14drm/amdkfd: fix race condition in kfd_wait_on_eventsFelix Kuehling1-21/+5
Add the waiters to the wait queue during initialization, while holding the event spinlock. Otherwise the waiter will not get activated if the event signals before being added to the wait queue. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang<[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-14drm/amdkfd: potential NULL dereference in kfd_set/reset_event()Dan Carpenter1-2/+12
If lookup_event_by_id() returns a NULL "ev" pointer then the spin_lock(&ev->lock) will crash. This was detected by Smatch: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_events.c:644 kfd_set_event() error: we previously assumed 'ev' could be null (see line 639) Fixes: 5273e82c5f47 ("drm/amdkfd: Improve concurrency of event handling") Signed-off-by: Dan Carpenter <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-12drm/amdkfd: Asynchronously free eventsFelix Kuehling1-2/+1
The synchronize_rcu call in destroy_events can take several ms, which noticeably slows down applications destroying many events. Use kfree_rcu to free the event structure asynchronously and eliminate the synchronize_rcu call in the user thread. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Philip Yang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-04-07drm/amdkfd: Improve concurrency of event handlingFelix Kuehling1-42/+77
Use rcu_read_lock to read p->event_idr concurrently with other readers and writers. Use p->event_mutex only for creating and destroying events and in kfd_wait_on_events. Protect the contents of the kfd_event structure with a per-event spinlock that can be taken inside the rcu_read_lock critical section. This eliminates contention of p->event_mutex in set_event, which tends to be on the critical path for dispatch latency even when busy waiting is used. It also eliminates lock contention in event interrupt handlers. Since the p->event_mutex is now used much less, the impact of requiring it in kfd_wait_on_events should also be much smaller. This should improve event handling latency for processes using multiple GPUs concurrently. v2: Reschedule the worker periodically to avoid soft lockup warnings Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Sean Keely <[email protected]> # v1 Tested-by: Sanjay Tripathi <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-03-25drm/amdkfd: Check for potential null return of kmalloc_array()QintaoShen1-0/+2
As the kmalloc_array() may return null, the 'event_waiters[i].wait' would lead to null-pointer dereference. Therefore, it is better to check the return value of kmalloc_array() to avoid this confusion. Signed-off-by: QintaoShen <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-02-14drm/amdkfd: update SPDX license headerRajneesh Bhardwaj1-1/+2
Update the SPDX License header for all the KFD files. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-02-07drm/amdkfd: CRIU implement gpu_id remappingDavid Yat Sin1-8/+37
When doing a restore on a different node, the gpu_id's on the restore node may be different. But the user space application will still refer use the original gpu_id's in the ioctl calls. Adding code to create a gpu id mapping so that kfd can determine actual gpu_id during the user ioctl's. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: David Yat Sin <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2022-02-07drm/amdkfd: CRIU checkpoint and restore eventsDavid Yat Sin1-28/+244
Add support to existing CRIU ioctl's to save and restore events during criu checkpoint and restore. Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: David Yat Sin <[email protected]> Signed-off-by: Rajneesh Bhardwaj <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-11-17drm/amdkfd: convert misc checks to IP version checkingGraham Sider1-2/+4
Switch to IP version checking instead of asic_type on various KFD version checks. Signed-off-by: Graham Sider <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdkfd: fix a resource leakage issueDennis Li1-0/+2
The function kfd_lookup_process_by_pasid will increase the reference count of kfd_process object, its caller should call kfd_unref_process to decrease the reference count. Otherwise resource leakage will happen. Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Hawking Zhang <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2021-05-19drm/amdkfd: refine the poison data consumption handlingDennis Li1-0/+39
The user applications maybe register the KFD_EVENT_TYPE_HW_EXCEPTION and KFD_EVENT_TYPE_MEMORY events, driver could notify them when poison data consumed. Beside that, some applications maybe register SIGBUS signal hander. These applications will handle poison data by themselves, exit or re-create context to re-dispatch works. Signed-off-by: Dennis Li <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2020-09-17drm, iommu: Change type of pasid to u32Fenghua Yu1-4/+4
PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Fenghua Yu <[email protected]> Signed-off-by: Borislav Petkov <[email protected]> Reviewed-by: Tony Luck <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Acked-by: Felix Kuehling <[email protected]> Acked-by: Joerg Roedel <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2020-06-09mmap locking API: use coccinelle to convert mmap_sem rwsem call sitesMichel Lespinasse1-2/+2
This change converts the existing mmap_sem rwsem calls to use the new mmap locking API instead. The change is generated using coccinelle with the following rule: // spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir . @@ expression mm; @@ ( -init_rwsem +mmap_init_lock | -down_write +mmap_write_lock | -down_write_killable +mmap_write_lock_killable | -down_write_trylock +mmap_write_trylock | -up_write +mmap_write_unlock | -downgrade_write +mmap_write_downgrade | -down_read +mmap_read_lock | -down_read_killable +mmap_read_lock_killable | -down_read_trylock +mmap_read_trylock | -up_read +mmap_read_unlock ) -(&mm->mmap_sem) +(mm) Signed-off-by: Michel Lespinasse <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Daniel Jordan <[email protected]> Reviewed-by: Laurent Dufour <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: David Rientjes <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Liam Howlett <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ying Han <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-03-10drm/amdkfd: Use pr_debug to print the message of reaching event limitYong Zhao1-1/+1
People are inclined to think of the previous pr_warn message as an error, so use pre_debug instead. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-11-13drm/amdkfd: Simplify the mmap offset related bit operationsYong Zhao1-1/+0
The new code uses straightforward bit shifts and thus has better readability. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-10-03drm/amdkfd: Use hex print format for pasidYong Zhao1-6/+6
Since KFD pasid starts from 0x8000 (32768 in decimal), it is better perceived as a hex number. Meanwhile, change the pasid type from unsigned int to uint16_t to be consistent throughout the code. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-09-16drm/amdkfd: add renoir type for the workaround of iommu v2 (v2)Huang Rui1-1/+2
Renoir is the same with Raven, will enable iommu event in future. v2: fix the checking (Thong) Signed-off-by: Huang Rui <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-05-24drm/amdkfd: Cosmetic cleanupKent Russell1-1/+1
Fix some spacing issues, log output, uses of !=NULL/==NULL, unneeded extra lines and clean up a declaration from =1 to =true for clarity Signed-off-by: Kent Russell <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2019-03-19drm/amdkfd: add RAS ECC event support (v3)Eric Huang1-1/+17
RAS ECC event will combine with GPU reset event, due to ECC interrupts are caused by uncorrectable error that triggers GPU reset. v2: Fix misleading-indentation warning v3: fix build with CONFIG_HSA_AMD disabled Signed-off-by: Eric Huang <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2018-07-13drm/amdkfd: Optimize out some duplicated code in kfd_signal_iommu_event()Yong Zhao1-15/+11
memory_exception_data is already initialized for not-present faults. It only needs to be overridden for permission faults. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-13drm/amdkfd: Workaround to accommodate Raven too many PPR issueYong Zhao1-5/+16
On Raven multiple PPRs can be queued up by the hardware. When the first of those requests is handled by the IOMMU driver, the memory access succeeds. After that the application may be done with the memory and unmap it. At that point the page table entries are invalidated, but there are still outstanding duplicate PPRs for those addresses. When the IOMMU driver processes those duplicate requests, it finds invalid page table entries and triggers an invalid PPR fault. As a workaround, don't signal invalid PPR faults on Raven to avoid segfaulting applications that haven't done anything wrong. As a side effect, real GPU memory access faults may go unnoticed by the application. Signed-off-by: Yong Zhao <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Alex Deucher <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-11drm/amdkfd: Implement GPU reset handlers in KFDShaoyun Liu1-0/+27
Lock KFD and evict existing queues on reset. Notify user mode by signaling hw_exception events. Signed-off-by: Shaoyun Liu <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-11drm/amdkfd: Handle VM faults in KFDshaoyunl1-0/+37
1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per per-vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring 2. amdkfd unmaps all queues from the faulting process and create new run-list without the guilty process 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY Signed-off-by: shaoyun liu <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-07-11drm/amdkfd: send SIGSEGV to process upon KFD_EVENT_TYPE_MEMORYMoses Reuben1-0/+7
Signed-off-by: Moses Reuben <[email protected]> Reviewed-by: Felix Kuehling <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-05-01drm/amdkfd: Fix signal handling performance againFelix Kuehling1-1/+1
It turns out that idr_for_each_entry is really slow compared to just iterating over the slots. Based on measurements the difference is estimated to be about a factor 64. That means using idr_for_each_entry is only worth it with very few allocated events. Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-04-10drm/amdkfd: Clean up KFD_MMAP_ offset handlingHarish Kasiviswanathan1-1/+1
Use bit-rotate for better clarity and remove _MASK from the #defines as these represent mmap types. Centralize all the parsing of the mmap offset in kfd_mmap and add device parameter to doorbell and reserved_mem map functions. Encode gpu_id into upper bits of vm_pgoff. This frees up the lower bits for encoding the the doorbell ID on Vega10. Signed-off-by: Harish Kasiviswanathan <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2018-03-15drm/amdkfd: Kmap event page for dGPUsFelix Kuehling1-2/+29
The events page must be accessible in user mode by the GPU and CPU as well as in kernel mode by the CPU. On dGPUs user mode virtual addresses are managed by the Thunk's GPU memory allocation code. Therefore we can't allocate the memory in kernel mode like we do on APUs. But KFD still needs to map the memory for kernel access. To facilitate this, the Thunk provides the buffer handle of the events page to KFD when creating the first event. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-12-08drm/amdkfd: Centralize IOMMUv2 code and make it conditionalFelix Kuehling1-0/+3
dGPUs work without IOMMUv2. Make IOMMUv2 initialization dependent on ASIC information. Also allow building KFD without IOMMUv2 support. This is still useful for dGPUs and prepares for enabling KFD on architectures that don't support AMD IOMMUv2. v2: * Centralize IOMMUv2 code to avoid #ifdefs in too many places v3: * Imply AMD_IOMMU_V2 in Kconfig Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian Konig <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-11-27drm/amdkfd: Use ref count to prevent kfd_process destructionFelix Kuehling1-7/+7
Use a reference counter instead of a lock to prevent process destruction while functions running out of process context are using the kfd_process structure. In many cases these functions don't need the structure to be locked. In the few cases that really do need the process lock, take it explicitly. This helps simplify lock dependencies between the process lock and other locks, particularly amdgpu and mm_struct locks. This will be important when amdgpu calls back to amdkfd for memory evictions. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Christian König <[email protected]> Reviewed-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-10-27drm/amdkfd: Make event limit dependent on user mode mapping sizeFelix Kuehling1-6/+19
This allows increasing the KFD_SIGNAL_EVENT_LIMIT in kfd_ioctl.h without breaking processes built with older kfd_ioctl.h versions. Signed-off-by: Felix Kuehling <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-10-27drm/amdkfd: Use IH context ID for signal lookupFelix Kuehling1-13/+60
This speeds up signal lookup when the IH ring entry includes a valid context ID or partial context ID. Only if the context ID is found to be invalid, fall back to an exhaustive search of all signaled events. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-10-27drm/amdkfd: Simplify event ID and signal slot managementFelix Kuehling1-160/+70
Signal slots are identical to event IDs. Replace the used_slot_bitmap and events hash table with an IDR to allocate and lookup event IDs and signal slots more efficiently. Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>
2017-10-27drm/amdkfd: Simplify events page allocatorFelix Kuehling1-129/+68
The first event page is always big enough to handle all events. Handling of multiple events pages is not supported by user mode, and not necessary. Signed-off-by: Yong Zhao <[email protected]> Signed-off-by: Felix Kuehling <[email protected]> Acked-by: Oded Gabbay <[email protected]> Signed-off-by: Oded Gabbay <[email protected]>