|
Platforms like MTL only have a single tile, but multiple GTs.
Ensure XE_ENGINE_CREATE accepts engine creation on gt1 on such
platforms.
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
On MTL and beyond, the GPU performs non-coherent accesses to the PPGTT
page tables. These page tables should be mapped as CPU:WC.
Removes CAT errors triggered by xe_exec_basic@once-basic on MTL:
xe 0000:00:02.0: [drm:__xe_pt_bind_vma [xe]] Preparing bind, with range [1a0000...1a0fff) engine 0000000000000000.
xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]] 1 entries to update
xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]] 0: Update level 3 at (0 + 1) [0...8000000000) f:0
xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
xe 0000:00:02.0: [drm] Timedout job: seqno=4294967169, guc_id=2, flags=0x4
v2:
- Rename to XE_BO_PAGETABLE to make it more clear that this BO is the
pagetable itself, rather than just being bound in the PPGTT. (Lucas)
Cc: Lucas De Marchi <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The main motivation is with d3cold which will make the suspend and
resume callbacks even more scary, but is useful regardless. We already
have the needed annotation on the acquire side with
xe_device_mem_access_get(), and by adding the annotation on the release
side we should have a lot more confidence that our locking hierarchy is
correct.
v2:
- Move the annotation into both callbacks for better symmetry. Also
don't hold over the entire mem_access_get(); we only need lockdep
to understand what is being held upon entering mem_access_get(), and
how that matches up with locks in the callbacks.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Anshuman Gupta <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We must use migrate engine for page fault binds in order to avoid a
deadlock as the migrate engine has a reserved BCS instance which cannot
be stuck on a fault. To use the migrate engine the engine argument to
xe_migrate_update_pgtables must be NULL; this was incorrectly wired up
so vm->eng[tile_id] was always being used. Fix this.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Only alloc userptr part of xe_vma for userptrs, this will save on space
in the common BO case.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The callback kicks the worker, so the two execute mutually exclusively;
combining them saves a bit of space in xe_vma.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
This will save us a few bytes in the xe_vma structure.
v2: Use hweight8 rather than hweight_long (Rodrigo)
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
This list isn't used again, list_del is the proper call.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Combine the userptr, rebind, and destroy links into a union as
the lists these links belong to are mutually exclusive.
v2: Adjust which lists are combined (Thomas H)
v3: Add kernel doc why this is safe (Thomas H), remove related change
of list_del_init -> list_del (Rodrigo)
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
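
The space saving from sharing link storage can be illustrated with a small userspace sketch. Everything here is a stand-in: `struct list_head` mimics the kernel type, and the two `vma_*` structs and their member names are hypothetical layouts, not the actual xe_vma definition. It is only valid to share storage like this when an object is on at most one of the lists at any time, which is the property the commit message relies on.

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical layout with one dedicated link per list. */
struct vma_separate_links {
	struct list_head userptr_link;
	struct list_head rebind_link;
	struct list_head destroy_link;
};

/*
 * Hypothetical layout where the links share storage. Only safe if the
 * object is never on more than one of these lists at the same time.
 */
struct vma_combined_links {
	union {
		struct list_head userptr_link;
		struct list_head rebind_link;
		struct list_head destroy_link;
	} combined_links;
};

/* Bytes saved per object by sharing the link storage. */
size_t bytes_saved(void)
{
	return sizeof(struct vma_separate_links) -
	       sizeof(struct vma_combined_links);
}
```

With three links folded into one, the sketch saves two `struct list_head` worth of storage per object.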
|
|
If we don't change page sizes we can avoid doing rebinds and instead just
do a partial unbind. The algorithm to determine the page size is greedy, as
we assume all pages in the removed VMA are the largest page used in the
VMA.
v2: Don't exceed 100 lines
v3: struct xe_vma_op_unmap remove in different patch, remove XXX comment
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
xe_vma_op_unmap isn't used, remove it.
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We currently have a race between bind engines which can result in
corrupted page tables leading to faults.
A simple example:
bind A 0x0000-0x1000, engine A, has unsatisfied in-fence
bind B 0x1000-0x2000, engine B, no in-fences
exec A uses 0x1000-0x2000
Bind B will pass bind A and exec A will fault. This occurs as bind A
programs the root of the page table in a bind job which is held up by an
in-fence. Bind B in this case just programs a leaf entry of the
structure.
To fix, use the range-fence utility to track cross bind engine conflicts. In
the above example bind A would insert a dependency into the range-fence
tree with a key of 0x0-0x7fffffffff, bind B would find that dependency
and its bind job would be scheduled behind the unsatisfied in-fence and
bind A's job.
Reviewed-by: Maarten Lankhorst <[email protected]>
Co-developed-by: Thomas Hellström <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
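
The conflict check in the example above can be sketched in userspace as follows. This is a simplified model, not the driver's code: the `ex_` names are hypothetical, ranges are treated as inclusive, and a linear scan stands in for the interval-tree lookup. The point is that the leaf ranges of bind A and bind B do not overlap, but the key bind A inserts covers everything reachable through the root entry it programs, so bind B does find it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range [start, last]; hypothetical helper type. */
struct ex_range {
	uint64_t start;
	uint64_t last;
};

/* Two inclusive ranges conflict when neither ends before the other starts. */
bool ex_range_conflicts(struct ex_range a, struct ex_range b)
{
	return a.start <= b.last && b.start <= a.last;
}

/*
 * Linear scan standing in for the interval-tree lookup: returns true if
 * the new range overlaps any range with a still-pending fence.
 */
bool ex_has_pending_conflict(const struct ex_range *pending, size_t n,
			     struct ex_range r)
{
	for (size_t i = 0; i < n; i++) {
		if (ex_range_conflicts(pending[i], r))
			return true;
	}
	return false;
}
```

In this model, a caller that finds a conflict would add the conflicting entry's fence as a dependency of its own job instead of submitting immediately.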
|
|
Add a generic utility to track range conflicts signaled by a dma-fence.
Tracking is implemented via an interval tree. An example use case is
tracking conflicts for pending (un)binds from multiple bind engines. By
being generic, the idea is that this could be moved to the DRM level and
used in multiple drivers for similar problems.
v2: Make interval tree functions static (CI)
v3: Remove non-static cleanup function (CI)
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Thomas Hellström <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Make explicit in the log that execlist submission is used, to avoid
silently using it instead of GuC submission.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Fix 6 errors and 20 warnings reported by checkpatch.pl.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Those look like leftover debug and are not even being used. If they were
real debug/info, they should be using the drm helpers.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Those messages are unnecessary because a generic message is already
produced in case of allocation failure. Besides, this also removes a
misuse of the XE_IOCTL_DBG macro.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Use FIELD_PREP()/FIELD_GET() to encode the tile id into flags. Besides
protecting for eventual overflow it also makes it easier to see a new
flag can't be added as BIT(7).
Reviewed-by: Matthew Brost <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
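
A userspace sketch of the mask-based encoding is shown below. The `EX_` macros are simplified stand-ins for the kernel's GENMASK()/FIELD_PREP()/FIELD_GET(), and they rely on the GCC/Clang `__builtin_ctz` builtin. Placing the tile id in bits 7:6 is an assumption made for illustration, as is the `EX_FLAG_64K` bit.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel's bitfield helpers (32-bit only). */
#define EX_GENMASK(h, l)         (((~0u) >> (31 - (h))) & ((~0u) << (l)))
#define EX_FIELD_PREP(mask, val) (((uint32_t)(val) << __builtin_ctz(mask)) & (mask))
#define EX_FIELD_GET(mask, reg)  (((uint32_t)(reg) & (mask)) >> __builtin_ctz(mask))

#define EX_FLAG_64K     (1u << 0)        /* hypothetical ordinary flag bit */
#define EX_TILE_ID_MASK EX_GENMASK(7, 6) /* assumed position of the tile id */

/* Store tile_id in the masked field, leaving all other flag bits intact. */
uint32_t ex_set_tile_id(uint32_t flags, unsigned int tile_id)
{
	return (flags & ~EX_TILE_ID_MASK) |
	       EX_FIELD_PREP(EX_TILE_ID_MASK, tile_id);
}

/* Read the tile id back out of the flags word. */
unsigned int ex_get_tile_id(uint32_t flags)
{
	return EX_FIELD_GET(EX_TILE_ID_MASK, flags);
}
```

Because the field is described by a mask rather than an open-coded shift, an oversized id cannot spill into neighbouring bits, and the mask definition makes it visible which bits are already taken.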
|
|
Rename XE_VM_FLAGS_64K to XE_VM_FLAG_64K to follow the other names and
s/GT/TILE/ that got missed in commit 08dea7674533 ("drm/xe: Move
migration from GT to tile").
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
It looks like bulk_move is set during object construction, but is only
removed on object close, however in various places we might not yet have
an actual fd to close, like on the error paths for the gem_create ioctl,
and also one internal user for the evict_test_run_gt() selftest. Try to
handle those cases by manually resetting the bulk_move. This should
prevent triggering:
WARNING: CPU: 7 PID: 8252 at drivers/gpu/drm/ttm/ttm_bo.c:327
ttm_bo_release+0x25e/0x2a0 [ttm]
v2 (Nirmoy):
- It should be safe to just unconditionally call
__xe_bo_unset_bulk_move() in most places.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Test seems to be failing badly after calling xe_bo_restore_kernel().
Taking a snapshot of the CTB and copying back a potentially old version
seems risky, depending on what might have been inflight. Also it seems
snapshotting the ADS object and copying back results in serious
breakage. Normally when calling xe_bo_restore_kernel() we always fully
restart the GT, which re-initializes such things. We could potentially
skip saving and restoring such objects in xe_bo_evict_all() however
seems quite fragile not to also restart the GT. Try to do that here by
triggering a GT reset.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Matthew Brost <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The GPU job will keep the device awake, however assumption here is that
caller of xe_migrate_clear() is also holding mem_access.ref otherwise we
hit the asserts in xe_sa_bo_flush_write() prior to the job construction.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We are calling fairly low level things like xe_bo_restore_kernel() which
expect caller to be holding mem_access.ref. Since we are doing stuff
like evict_all we likely don't want to race with rpm suspend, since that
potentially wants to do the same thing, so just wrap the whole test.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The atomics here might hide potential issues, also rpm core is not
holding any lock when calling our rpm resume callback, so add a dummy lock
with the idea that xe_pm_runtime_resume() is eventually going to be
called when we are holding it. This only needs to happen once and then
lockdep can validate all callers and their locks.
v2: (Thomas Hellström)
- Prefer static lockdep_map instead of full blown mutex.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Thomas Hellström <[email protected]>
Acked-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Lockdep gives the following splat:
[ 594.158863] ffff888140da53f0 (&vm->userptr.notifier_lock){++++}-{3:3}, at: vma_userptr_invalidate+0xeb/0x330 [xe]
[ 594.158921]
but task is already holding lock:
[ 594.158926] ffffffff82761940
(mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: unmap_vmas+0x0/0x1c0
[ 594.158941]
which lock already depends on the new lock.
[ 594.158947]
the existing dependency chain (in reverse order) is:
[ 594.158953]
-> #5 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[ 594.158961] fs_reclaim_acquire+0x68/0xd0
[ 594.158969] __kmem_cache_alloc_node+0x2c/0x1b0
[ 594.158975] kmalloc_node_trace+0x1d/0xb0
[ 594.158983] alloc_worker+0x18/0x50
[ 594.158989] init_rescuer.part.0+0x13/0xa0
[ 594.158995] workqueue_init+0xdf/0x210
[ 594.159001] kernel_init_freeable+0x5c/0x2f0
[ 594.159009] kernel_init+0x11/0x1a0
[ 594.159017] ret_from_fork+0x29/0x50
[ 594.159023]
-> #4 (fs_reclaim){+.+.}-{0:0}:
[ 594.159031] fs_reclaim_acquire+0xa0/0xd0
[ 594.159037] __kmem_cache_alloc_node+0x2c/0x1b0
[ 594.159042] kmalloc_trace+0x20/0xb0
[ 594.159048] acpi_device_add+0x25a/0x3f0
[ 594.159056] acpi_add_single_object+0x387/0x750
[ 594.159063] acpi_bus_check_add+0x108/0x280
[ 594.159069] acpi_bus_scan+0x34/0xf0
[ 594.159075] acpi_scan_init+0xed/0x2b0
[ 594.159082] acpi_init+0x21e/0x520
[ 594.159087] do_one_initcall+0x53/0x260
[ 594.159092] kernel_init_freeable+0x18a/0x2f0
[ 594.159099] kernel_init+0x11/0x1a0
[ 594.159105] ret_from_fork+0x29/0x50
[ 594.159110]
-> #3 (acpi_device_lock){+.+.}-{3:3}:
[ 594.159117] __mutex_lock+0x95/0xd10
[ 594.159122] acpi_enable_wakeup_device_power+0x30/0x120
[ 594.159130] __acpi_device_wakeup_enable+0x34/0x110
[ 594.159138] acpi_pm_set_device_wakeup+0x55/0x140
[ 594.159143] __pci_enable_wake+0x56/0xb0
[ 594.159150] pci_finish_runtime_suspend+0x35/0x80
[ 594.159157] pci_pm_runtime_suspend+0xb5/0x1a0
[ 594.159162] __rpm_callback+0x3c/0x110
[ 594.159170] rpm_callback+0x58/0x70
[ 594.159176] rpm_suspend+0x15c/0x6f0
[ 594.159182] pm_runtime_work+0x9b/0xb0
[ 594.159188] process_one_work+0x263/0x520
[ 594.159195] worker_thread+0x4d/0x3b0
[ 594.159200] kthread+0xeb/0x120
[ 594.159206] ret_from_fork+0x29/0x50
[ 594.159211]
-> #2 (acpi_wakeup_lock){+.+.}-{3:3}:
[ 594.159218] __mutex_lock+0x95/0xd10
[ 594.159223] acpi_pm_set_device_wakeup+0x7a/0x140
[ 594.159228] __pci_enable_wake+0x77/0xb0
[ 594.159234] pci_pm_runtime_resume+0x70/0xd0
[ 594.159240] __rpm_callback+0x3c/0x110
[ 594.159246] rpm_callback+0x58/0x70
[ 594.159252] rpm_resume+0x50d/0x7a0
[ 594.159258] rpm_resume+0x267/0x7a0
[ 594.159264] __pm_runtime_resume+0x45/0x90
[ 594.159270] xe_pm_runtime_resume_and_get+0x12/0x50 [xe]
[ 594.159314] xe_device_mem_access_get+0x97/0xc0 [xe]
[ 594.159346] hw_engines+0x65/0xf0 [xe]
[ 594.159380] seq_read_iter+0x10d/0x4b0
[ 594.159385] seq_read+0x9e/0xd0
[ 594.159390] full_proxy_read+0x4e/0x80
[ 594.159396] vfs_read+0xb6/0x310
[ 594.159401] ksys_read+0x60/0xe0
[ 594.159406] do_syscall_64+0x38/0x90
[ 594.159413] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 594.159419]
-> #1 (&xe->mem_access.lock){+.+.}-{3:3}:
[ 594.159427] xe_device_mem_access_get+0x43/0xc0 [xe]
[ 594.159457] xe_gt_tlb_invalidation_vma+0x53/0x190 [xe]
[ 594.159490] invalidation_fence_init+0x1d2/0x2c0 [xe]
[ 594.159529] __xe_pt_unbind_vma+0x151/0x4e0 [xe]
[ 594.159564] vm_bind_ioctl+0x48a/0xae0 [xe]
[ 594.159602] async_op_work_func+0x20c/0x530 [xe]
[ 594.159634] process_one_work+0x263/0x520
[ 594.159640] worker_thread+0x4d/0x3b0
[ 594.159646] kthread+0xeb/0x120
[ 594.159650] ret_from_fork+0x29/0x50
[ 594.159655]
-> #0 (&vm->userptr.notifier_lock){++++}-{3:3}:
[ 594.159663] __lock_acquire+0x16fa/0x2850
[ 594.159670] lock_acquire+0xd2/0x2e0
[ 594.159676] down_write+0x36/0xd0
[ 594.159681] vma_userptr_invalidate+0xeb/0x330 [xe]
[ 594.159714] __mmu_notifier_invalidate_range_start+0x239/0x2a0
[ 594.159722] unmap_vmas+0x1ac/0x1c0
[ 594.159727] unmap_region+0xb5/0x120
[ 594.159732] do_vmi_align_munmap+0x2be/0x430
[ 594.159739] do_vmi_munmap+0xea/0x120
[ 594.159744] __vm_munmap+0x9c/0x160
[ 594.159750] __x64_sys_munmap+0x12/0x20
[ 594.159756] do_syscall_64+0x38/0x90
[ 594.159761] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 594.159768]
other info that might help us debug this:
[ 594.159773] Chain exists of:
&vm->userptr.notifier_lock --> fs_reclaim -->
mmu_notifier_invalidate_range_start
[ 594.159785] Possible unsafe locking scenario:
[ 594.159790] CPU0 CPU1
[ 594.159794] ---- ----
[ 594.159797] lock(mmu_notifier_invalidate_range_start);
[ 594.159802] lock(fs_reclaim);
[ 594.159808]
lock(mmu_notifier_invalidate_range_start);
[ 594.159814] lock(&vm->userptr.notifier_lock);
[ 594.159819]
The VM should be holding a mem_access.ref so this looks like it should
be a false positive and we can just drop the explicit mem_access in
xe_gt_tlb_invalidation(). The GGTT invalidation path also takes care to
hold mem_access.ref so should be fine there also, and we already assert
that we hold access.ref for the GuC communication underneath.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Thomas Hellström <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Increase the sensitivity of the ggtt->lock by priming it against
FS_RECLAIM, such that allocating memory while holding will result in
lockdep splats.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The callers should already be holding the mem_access reference, before
calling into this.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Only call access_put after dropping the forcewake. In theory the device
could suspend, but really we want to start asserting that we have a
mem_access.ref when touching mmio.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Any kind of device memory access should first ensure the device is not
suspended, mmio included.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The mem_access is meant to cover any kind of device level memory access,
mmio included.
Signed-off-by: Matthew Auld <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Anshuman Gupta <[email protected]>
Reviewed-by: Anshuman Gupta <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We need to keep the device awake when performing any kind of mmio operation.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/279
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Thomas Hellström <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The xe_device_mem_access_get() should be all that's needed here and
should now work as expected, without any strange races. In theory should
be no functional changes here.
Reported-by: Oded Gabbay <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Thomas Hellström <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
It looks like there is at least one race here, given that the
pm_runtime_suspended() check looks to return false if we are in the
process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED). We
later also do xe_pm_runtime_get_if_active(), but since the device is
suspending or has now suspended, this doesn't do anything either.
Following from this we can potentially return from
xe_device_mem_access_get() with the device suspended or about to be,
leading to broken behaviour.
Attempt to fix this by always grabbing the runtime ref when our internal
ref transitions from 0 -> 1. The hard part is then dealing with the
runtime_pm callbacks also calling xe_device_mem_access_get() and
deadlocking, which the pm_runtime_suspended() check prevented.
v2:
- ct->lock looks to be primed with fs_reclaim, so holding that and then
allocating memory will cause lockdep to complain. Now that we
unconditionally grab the mem_access.lock around mem_access_{get,put}, we
need to change the ordering wrt grabbing the ct->lock, since some of
the runtime_pm routines can allocate memory (or at least that's what
lockdep seems to suggest). Hopefully not a big deal. It might be that
there were already issues with this, just that the atomics were
"hiding" the potential issues.
v3:
- Use Thomas Hellström's idea of tracking the active task that is
executing in the resume or suspend callback, in order to avoid
recursive resume/suspend calls deadlocking on itself.
- Split the ct->lock change.
v4:
- Add smp_mb() around accessing the pm_callback_task for extra safety.
(Thomas Hellström)
v5:
- Clarify the kernel-doc for the mem_access.lock, given that it is quite
strange in what it protects (data vs code). The real motivation is to
aid lockdep. (Rodrigo Vivi)
v6:
- Split out the lock change. We still want this as a lockdep aid but
only for the xe_device_mem_access_get() path. Sticking a lock on the
put() looks to be a no-go, also the runtime_put() there is always async.
- Now that the lock is gone move to atomics and rely on the pm code
serialising multiple callers on the 0 -> 1 transition.
- g2h_worker_func() looks to be the next issue, given that
suspend-resume callbacks are using CT, so try to handle that.
v7:
- Add xe_device_mem_access_get_if_ongoing(), and use it in
g2h_worker_func().
v8 (Anshuman):
- Just always grab the rpm, instead of just on the 0 -> 1 transition,
which is a lot clearer and simplifies the code quite a bit.
v9:
- Make sure we also adjust the CT fast-path with if-active.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258
Signed-off-by: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Thomas Hellström <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: Anshuman Gupta <[email protected]>
Acked-by: Anshuman Gupta <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
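
One way to read the "get if ongoing" helper from v7 is as an increment-if-not-zero on the reference count: take a reference only when someone else already holds one, and never trigger a resume. The sketch below models that with C11 atomics. It is an assumption about the semantics, not the driver's implementation; the `ex_` names are hypothetical and the real helper may check additional state.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device's mem_access bookkeeping. */
struct ex_mem_access {
	atomic_int ref;
};

/*
 * Take a reference only if the count is already non-zero. Returns false
 * without touching the count when nobody currently holds a reference.
 */
bool ex_mem_access_get_if_ongoing(struct ex_mem_access *ma)
{
	int old = atomic_load(&ma->ref);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&ma->ref, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference previously taken. */
void ex_mem_access_put(struct ex_mem_access *ma)
{
	atomic_fetch_sub(&ma->ref, 1);
}
```

A worker that may run from within the suspend or resume path could use such a helper to skip its work when the device is not already being kept awake, rather than forcing a wakeup itself.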
|
|
Don't init pcode and restore VRAM objects in vain.
We can rely on primary GT GUC_STATUS to detect whether
card has really lost power even when d3cold is allowed by xe.
Adding d3cold.lost_power flag to avoid pcode init and vram
restoration.
Also cleaning up the TODO code comment.
v2:
- %s/xe_guc_has_lost_power()/xe_guc_in_reset().
- Used existing gt instead of new variable. [Rodrigo]
- Added kernel-doc function comment. [Rodrigo]
- xe_guc_in_reset() return true if failed to get fw.
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Add support for controlling d3cold using the vram_usages metric from
the ttm resource manager.
When the root port is capable of d3cold but xe has disallowed d3cold
because vram_usages is above vram_d3cold_threshold, it is required to
disable d3cold to avoid any resume failure, because the root port can
still transition to d3cold when all of the PCIe endpoints and
{upstream, virtual} switch ports transition to d3hot.
Also cleaning up the TODO code comment.
v2:
- Modify d3cold.allowed in xe_pm_d3cold_allowed_toggle. [Riana]
- Cond changed (total_vram_used_mb < xe->d3cold.vram_threshold)
according to doc comment.
v3:
- Added enum instead of true/false argument in
d3cold_toggle(). [Rodrigo]
- Removed TODO comment. [Rodrigo]
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Reviewed-by: Badal Nilawar <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
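
The threshold condition quoted in the v2 notes can be sketched as a small decision function. This is a minimal userspace model: the struct, its fields, and the function name are hypothetical, and the `capable` check is an assumption about how root-port capability feeds into the decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device's d3cold state. */
struct ex_d3cold {
	bool capable;            /* root port can reach d3cold */
	bool allowed;            /* result of the last decision */
	uint32_t vram_threshold; /* threshold in MiB */
};

/*
 * Allow d3cold only when the device is capable of it and the VRAM in use
 * is strictly below the threshold, mirroring the condition
 * (total_vram_used_mb < vram_threshold) from the changelog.
 */
void ex_d3cold_allowed_toggle(struct ex_d3cold *d3cold,
			      uint32_t total_vram_used_mb)
{
	if (!d3cold->capable) {
		d3cold->allowed = false;
		return;
	}
	d3cold->allowed = total_vram_used_mb < d3cold->vram_threshold;
}
```

Keeping the comparison strict means usage exactly at the threshold already disallows d3cold in this model.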
|
|
Add a per-PCI-device vram_d3cold_threshold sysfs attribute to
control the d3cold allowed knob.
Add a d3cold structure embedded in xe_device to encapsulate
d3cold-related state.
v2:
- Check total vram before initializing default threshold. [Riana]
- Add static scope to vram_d3cold_threshold DEVICE_ATTR. [Riana]
v3:
- Fixed cosmetics review comment. [Riana]
- Fixed CI Hook failures.
- Used drmm_mutex_init().
v4:
- Fixed kernel-doc warnings.
v5:
- Added doc explaining need for the device sysfs. [Rodrigo]
- Removed TODO comment.
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Acked-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Wrap xe_pm_runtime_init inside xe_pm_init.
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Adding pci d3cold_capable check in order to initialize
d3cold_allowed as false statically.
It avoids vram save/restore latency during runtime
suspend/resume.
v2:
- Added else block to xe_pci_runtime_idle. [Rodrigo]
Cc: Rodrigo Vivi <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Currently GuCRC is disabled in suspend path for xe.
Rc6 is a prerequisite to enable s0ix and
should not be disabled for s2idle. There is no requirement
to disable GuCRC for S3+.
Remove it from xe_guc_pc_stop, thus removing from suspend path.
Retain the call in other places where xe_guc_pc_stop is
called.
v2: add description and return statement to kernel-doc (Rodrigo)
v3: update commit message (Rodrigo)
v4: add mem_access_get to the gucrc disable function
Signed-off-by: Riana Tauro <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Reduce the number of warnings reported by checkpatch.pl from 118 to 48 by
addressing these warning types:
LEADING_SPACE
LINE_SPACING
BRACES
TRAILING_SEMICOLON
CONSTANT_COMPARISON
BLOCK_COMMENT_STYLE
RETURN_VOID
ONE_SEMICOLON
SUSPECT_CODE_INDENT
LINE_CONTINUATIONS
UNNECESSARY_ELSE
UNSPECIFIED_INT
UNNECESSARY_INT
MISORDERED_TYPE
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Lower log level of XE_IOCTL_ERR macro to debug in order to prevent flooding
kernel log.
v2: Rename XE_IOCTL_ERR to XE_IOCTL_DBG (Rodrigo Vivi)
v3: Rebase
v4: Fix style, remove unrelated change about __FILE__ and __LINE__
Link: https://lists.freedesktop.org/archives/intel-xe/2023-May/004704.html
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
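
The idea of an argument-check macro that logs at debug level can be sketched in userspace as below. This is a hypothetical model rather than the driver's macro: `EX_IOCTL_DBG`, `ex_debug_enabled`, and the argument names are invented for illustration, and `fprintf` stands in for the kernel's debug logging.

```c
#include <errno.h>
#include <stdio.h>

int ex_debug_enabled;  /* stand-in for the kernel's debug log setting */
int ex_debug_messages; /* number of messages actually emitted */

/* Emit a message only when debug logging is enabled. */
void ex_log_debug(const char *cond)
{
	if (!ex_debug_enabled)
		return;
	ex_debug_messages++;
	fprintf(stderr, "ioctl argument check failed: %s\n", cond);
}

/* Evaluates to true when cond holds, logging the failed check as a side effect. */
#define EX_IOCTL_DBG(cond) ((cond) && (ex_log_debug(#cond), 1))

/* Hypothetical ioctl argument validation using the macro. */
int ex_check_args(unsigned int flags, unsigned int pad)
{
	if (EX_IOCTL_DBG(pad != 0))
		return -EINVAL;
	if (EX_IOCTL_DBG(flags & ~0x3u))
		return -EINVAL;
	return 0;
}
```

Invalid arguments are still rejected with the same error code either way; only the logging becomes opt-in, so a misbehaving userspace client cannot flood the log.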
|
|
Fix minor issues: remove extra ';' and s/Initialise/Initialize/.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove some style issues of type COMPLEX_MACRO reported by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove all existing style issues of type TRAILING_WHITESPACE reported
by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove all existing style issues of type CODE_INDENT reported
by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove all existing style issues of type POINTER_LOCATION reported
by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove almost all existing style issues of type OPEN_BRACE reported
by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Remove almost all existing style issues of type SPACING reported
by checkpatch.
Signed-off-by: Francois Dugast <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
We need to hold vm->lock before the xe_vm_is_closed_or_banned().
Else we get this splat:
[ 802.555227] ------------[ cut here ]------------
[ 802.555234] WARNING: CPU: 33 PID: 3122 at drivers/gpu/drm/xe/xe_vm.h:60
[ 802.555515] CPU: 33 PID: 3122 Comm: xe_exec_fault_m Tainted:
...
[ 802.555709] Call Trace:
[ 802.555714] <TASK>
[ 802.555720] ? __warn+0x81/0x170
[ 802.555737] ? xe_vm_madvise_ioctl+0x2de/0x440 [xe]
Fixes: 9d858b69b0cf ("drm/xe: Ban a VM if rebind worker hits an error")
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Brian Welty <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
It was missed that print_op needs to include DRM_GPUVA_OP_PREFETCH.
Else we hit the impossible BUG_ON:
[ 886.371040] ------------[ cut here ]------------
[ 886.371047] kernel BUG at drivers/gpu/drm/xe/xe_vm.c:2234!
[ 886.371216] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 886.371229] CPU: 1 PID: 3132 Comm: xe_exec_fault_m
[ 886.371257] RIP: 0010:vm_bind_ioctl_ops_create+0x45f/0x470 [xe]
...
[ 886.371517] Call Trace:
[ 886.371525] <TASK>
[ 886.371531] ? __die_body+0x1a/0x60
[ 886.371546] ? die+0x38/0x60
[ 886.371557] ? do_trap+0x10a/0x120
[ 886.371568] ? vm_bind_ioctl_ops_create+0x45f/0x470 [xe]
v2: add debug print for PREFETCH in print_op
Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA")
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Brian Welty <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|