Age | Commit message (Collapse) | Author | Files | Lines |
|
Wait for HW/PSP initiated ASIC reset to complete before
starting the recovery operations.
v2: Remove typo
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
DPC recovery involves ASIC reset just as normal GPU recovery so block
SW GPU schedulers and wait on all concurrent GPU resets.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.
v2: Rename in_dpc to more meaningful name
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality
v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.
Signed-off-by: Andrey Grodzovsky <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.
v2: modify the description
Reviewed-by: Christian König <[email protected]>
Signed-off-by: Liu ChengZhe <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
In the resume stage of GPU recovery, start_cpsch will call pm_init
which set pm->allocated as false, cause the next pm_release_ib has
no chance to release ib memory.
Add pm_release_ib in stop_cpsch which will be called in the suspend
stage of GPU recovery.
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
The crash log as the below:
[Thu Aug 20 23:18:14 2020] general protection fault: 0000 [#1] SMP NOPTI
[Thu Aug 20 23:18:14 2020] CPU: 152 PID: 1837 Comm: kworker/152:1 Tainted: G OE 5.4.0-42-generic #46~18.04.1-Ubuntu
[Thu Aug 20 23:18:14 2020] Hardware name: GIGABYTE G482-Z53-YF/MZ52-G40-00, BIOS R12 05/13/2020
[Thu Aug 20 23:18:14 2020] Workqueue: events amdgpu_ras_do_recovery [amdgpu]
[Thu Aug 20 23:18:14 2020] RIP: 0010:evict_process_queues_cpsch+0xc9/0x130 [amdgpu]
[Thu Aug 20 23:18:14 2020] Code: 49 8d 4d 10 48 39 c8 75 21 eb 44 83 fa 03 74 36 80 78 72 00 74 0c 83 ab 68 01 00 00 01 41 c6 45 41 00 48 8b 00 48 39 c8 74 25 <80> 78 70 00 c6 40 6d 01 74 ee 8b 50 28 c6 40 70 00 83 ab 60 01 00
[Thu Aug 20 23:18:14 2020] RSP: 0018:ffffb29b52f6fc90 EFLAGS: 00010213
[Thu Aug 20 23:18:14 2020] RAX: 1c884edb0a118914 RBX: ffff8a0d45ff3c00 RCX: ffff8a2d83e41038
[Thu Aug 20 23:18:14 2020] RDX: 0000000000000000 RSI: 0000000000000082 RDI: ffff8a0e2e4178c0
[Thu Aug 20 23:18:14 2020] RBP: ffffb29b52f6fcb0 R08: 0000000000001b64 R09: 0000000000000004
[Thu Aug 20 23:18:14 2020] R10: ffffb29b52f6fb78 R11: 0000000000000001 R12: ffff8a0d45ff3d28
[Thu Aug 20 23:18:14 2020] R13: ffff8a2d83e41028 R14: 0000000000000000 R15: 0000000000000000
[Thu Aug 20 23:18:14 2020] FS: 0000000000000000(0000) GS:ffff8a0e2e400000(0000) knlGS:0000000000000000
[Thu Aug 20 23:18:14 2020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[Thu Aug 20 23:18:14 2020] CR2: 000055c783c0e6a8 CR3: 00000034a1284000 CR4: 0000000000340ee0
[Thu Aug 20 23:18:14 2020] Call Trace:
[Thu Aug 20 23:18:14 2020] kfd_process_evict_queues+0x43/0xd0 [amdgpu]
[Thu Aug 20 23:18:14 2020] kfd_suspend_all_processes+0x60/0xf0 [amdgpu]
[Thu Aug 20 23:18:14 2020] kgd2kfd_suspend.part.7+0x43/0x50 [amdgpu]
[Thu Aug 20 23:18:14 2020] kgd2kfd_pre_reset+0x46/0x60 [amdgpu]
[Thu Aug 20 23:18:14 2020] amdgpu_amdkfd_pre_reset+0x1a/0x20 [amdgpu]
[Thu Aug 20 23:18:14 2020] amdgpu_device_gpu_recover+0x377/0xf90 [amdgpu]
[Thu Aug 20 23:18:14 2020] ? amdgpu_ras_error_query+0x1b8/0x2a0 [amdgpu]
[Thu Aug 20 23:18:14 2020] amdgpu_ras_do_recovery+0x159/0x190 [amdgpu]
[Thu Aug 20 23:18:14 2020] process_one_work+0x20f/0x400
[Thu Aug 20 23:18:14 2020] worker_thread+0x34/0x410
When GPU hang, user process will fail to create a compute queue whose
struct object will be freed later, but driver wrongly add this queue to
queue list of the proccess. And then kfd_process_evict_queues will
access a freed memory, which cause a system crash.
v2:
The failure to execute_queues should probably not be reported to
the caller of create_queue, because the queue was already created.
Therefore change to ignore the return value from execute_queues.
Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Dennis Li <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
Paul needs 1a21e5b930e8 ("drm/ingenic: Fix leak of device_node
pointer") and 3b5b005ef7d9 ("drm/ingenic: Fix driver not probing when
IPU port is missing") from -fixes to be able to merge further ingenic
patches into -next.
Signed-off-by: Daniel Vetter <[email protected]>
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
drm-misc-fixes for v5.9-rc5:
- Fix double free in virtio.
- Add missing put_device in sun4i, and other fixes.
- Small ingenic fixes.
- Handle sun4i alpha on lowest plane correctly.
- Remove output->enabled from virtio, as it should use crtc_state.
- Fix tve200 enable/disable.
- Documentation fix.
- Fix virtio unblank.
Signed-off-by: Dave Airlie <[email protected]>
From: Maarten Lankhorst <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
drm/i915 fixes for v5.9-rc5:
- Fix regression leading to audio probe failure
Signed-off-by: Dave Airlie <[email protected]>
From: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
This function should be an int, not a bool.
Presumably because we had the same 2 reverts in a slightly different
way, git got confused.
Thanks to Dan for reporting. :)
The conflict is between the 3 reverts in drm-fixes:
4993a8a37808 ("Revert "drm/i915: Remove i915_gem_object_get_dirty_page()"")
ad5d95e4d538 ("Revert "drm/i915/gem: Async GPU relocations only"")
20561da3a2e1 ("Revert "drm/i915/gem: Delete unused code"")
And the slightly different combined revert in drm-intel-gt-next, but
with the same goal:
102a0a9051f4 ("Revert "drm/i915/gem: Async GPU relocations only"")
In the merge commit 1f4b2aca794f ("Merge tag
'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next") things
went wrong, but the merge commit view now doesn't show any conflict
anymore (as git tends to do when the resolution picks one or the other
branch).
The need to handle other than just true/false error codes in
__reloc_entry_gpu was added in the dma_resv locking changes in
c43ce12328df ("drm/i915: Use per object locking in execbuf, v12.")
Signed-off-by: Maarten Lankhorst <[email protected]>
Reported-by: Dan Carpenter <[email protected]>
Cc: Dave Airlie <[email protected]>
[danvet: Explain this entire saga a lot better, adding tons of commit
references. Also note that this was merged before full intel-gfx-CI
results, only after BAT, since the breakage at the BAT run is already
severe enough to block all pre-merge testing.]
Fixes: 1f4b2aca794f ("Merge tag 'drm-intel-gt-next-2020-09-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next")
Acked-by: Joonas Lahtinen <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://anongit.freedesktop.org/drm/drm-misc into drm-next
UAPI Changes:
None
Cross-subsystem Changes:
* Moves a bunch of miscellaneous DP code from the i915 driver into a set
of shared DRM DP helpers
Core Changes:
* New DRM DP helpers (see above)
Driver Changes:
* Implements usage of the aforementioned DP helpers in the nouveau
driver, along with some other various HPD related cleanup for nouveau
Signed-off-by: Dave Airlie <[email protected]>
From: Lyude Paul <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://linuxtv.org/pinchartl/media into drm-fixes
Kconfig fixes for DRM_ZYNQMP_DPSUB DMA engine dependency
Signed-off-by: Dave Airlie <[email protected]>
From: Laurent Pinchart <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
(Same content as drm-intel-gt-next-2020-09-04-3, S-o-b's added)
UAPI Changes:
(- Potential implicit changes from WW locking refactoring)
Cross-subsystem Changes:
(- WW locking changes should align the i915 locking more with others)
Driver Changes:
- MAJOR: Apply WW locking across the driver (Maarten)
- Reverts for 5 commits to make applying WW locking faster (Maarten)
- Disable preparser around invalidations on Tigerlake for non-RCS engines (Chris)
- Add missing dma_fence_put() for error case of syncobj timeline (Chris)
- Parse command buffer earlier in eb_relocate(slow) to facilitate backoff (Maarten)
- Pin engine before pinning all objects (Maarten)
- Rework intel_context pinning to do everything outside of pin_mutex (Maarten)
- Avoid tracking GEM context until registered (Cc: stable, Chris)
- Provide a fastpath for waiting on vma bindings (Chris)
- Fixes to preempt-to-busy mechanism (Chris)
- Distinguish the virtual breadcrumbs from the irq breadcrumbs (Chris)
- Switch to object allocations for page directories (Chris)
- Hold context/request reference while breadcrumbs are active (Chris)
- Make sure execbuffer always passes ww state to i915_vma_pin (Maarten)
- Code refactoring to facilitate use of WW locking (Maarten)
- Locking refactoring to use more granular locking (Maarten, Chris)
- Support for multiple pinned timelines per engine (Chris)
- Move complication of I915_GEM_THROTTLE to the ioctl from general code (Chris)
- Make active tracking/vma page-directory stash work preallocated (Chris)
- Avoid flushing submission tasklet too often (Chris)
- Reduce context termination list iteration guard to RCU (Chris)
- Reductions to locking contention (Chris)
- Fixes for issues found by CI (Chris)
Signed-off-by: Dave Airlie <[email protected]>
From: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Commit '6f6a73c8b715d595977774d48450a734297ab21f' from Linus' tree
The fixes reverts cause a bit of a conflict pain with intel next,
start fixing it up here.
Signed-off-by: Dave Airlie <[email protected]>
|
|
In commit 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking
to separate function") the order of force_min_cdclk_changed check and
intel_modeset_checks(), was reversed. This broke the mechanism to
immediately force a new CDCLK minimum, and lead to driver probe
errors for display audio on GLK platform with 5.9-rc1 kernel. Fix
the issue by moving intel_modeset_checks() call later.
[vsyrjala: It also broke the ability of planes to bump up the cdclk
and thus could lead to underruns when eg. flipping from 32bpp to
64bpp framebuffer. To be clear, we still compute the new cdclk
correctly but fail to actually program it to the hardware due to
intel_set_cdclk_{pre,post}_plane_update() not getting called on
account of state->modeset==false.]
Fixes: 4f0b4352bd26 ("drm/i915: Extract cdclk requirements checking to separate function")
BugLink: https://github.com/thesofproject/linux/issues/2410
Signed-off-by: Kai Vehmanen <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit cf696856bc54a31f78e6538b84c8f7a006b6108b)
Signed-off-by: Jani Nikula <[email protected]>
|
|
git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:
amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups
amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets
radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix
Scheduler:
- Clean up priority levels
UAPI:
- amdgpu INFO IOCTL query update for TMZ state
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott
From: Alex Deucher <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.
This reverts commit 7ac2d2536dfa71c275a74813345779b1e7522c91.
Reported-by: Harald Arnesen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.
This reverts commit 9e0f9464e2ab36b864359a59b0e9058fdef0ce47.
Reported-by: Harald Arnesen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
These commits caused a regression on Lenovo t520 sandybridge
machine belonging to reporter. We are reverting them for 5.10
for other reasons, so just do it for 5.9 as well.
This reverts commit 763fedd6a216f94c2eb98d2f7ca21be3d3806e69.
Reported-by: Harald Arnesen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
https://gitlab.freedesktop.org/drm/msm into drm-fixes
A few fixes for a potential RPTR corruption issue.
Signed-off-by: Dave Airlie <[email protected]>
From: Rob Clark <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/ <CAF6AEGvnr6Nhz2J0sjv2G+j7iceVtaDiJDT8T88uW6jiBfOGKQ@mail.gmail.com
|
|
Backmerge 5.9-rc4 as there is a nasty qxl conflict
that needs to be resolved.
Signed-off-by: Dave Airlie <[email protected]>
|
|
The hwsp_gtt object is used for sub-allocation and could therefore
be shared by many contexts causing unnecessary contention during
concurrent context pinning.
However since we're currently locking it only for pinning, it remains
resident until we unpin it, and therefore it's safe to drop the
lock early, allowing for concurrent thread access.
Signed-off-by: Thomas Hellström <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
(NOTE: This is the minimal backportable fix, a full fix is being
developed at https://patchwork.freedesktop.org/patch/388048/)
The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef4688497512 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: <[email protected]> # v5.4+
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added a note and link about more complete fix]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.
Previous fix 1d9221e9d395 ("drm/i915: Skip signaling a signaled request")
however did not correctly serialize request retirement with the execution
callbacks.
We were using the i915_request.lock to serialise adding an execution callback
with __i915_request_submit. However, if we use an atomic llist_add to serialise
multiple waiters and then check to see if the request is already executing, we
can remove the irq-spinlock and fix serialization between retirement and
execution callbacks in one go.
v2: Avoid using the irq_work when outside of the irq-spinlocks, where we
can execute the callbacks immediately.
v3: Pay close attention to the order of setting ACTIVE on retirement, we
need to ensure the request is signaled and breadcrumbs detached before
we finish removing the request from the engine.
v4: Expanded commit message.
Fixes: 1d9221e9d395 ("drm/i915: Skip signaling a signaled request")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
To implement preempt-to-busy (and so efficient timeslicing and best utilization
of the hardware submission ports) we let the GPU run asynchronously in respect
to the ELSP submission queue. This created challenges in keeping and accessing
the driver state mirroring the asynchronous GPU execution.
The latest occurence of this was spotted by KCSAN:
[ 1413.563200] BUG: KCSAN: data-race in __await_execution+0x217/0x370 [i915]
[ 1413.563221]
[ 1413.563236] race at unknown origin, with read to 0xffff88885bb6c478 of 8 bytes by task 9654 on cpu 1:
[ 1413.563548] __await_execution+0x217/0x370 [i915]
[ 1413.563891] i915_request_await_dma_fence+0x4eb/0x6a0 [i915]
[ 1413.564235] i915_request_await_object+0x421/0x490 [i915]
[ 1413.564577] i915_gem_do_execbuffer+0x29b7/0x3c40 [i915]
[ 1413.564967] i915_gem_execbuffer2_ioctl+0x22f/0x5c0 [i915]
[ 1413.564998] drm_ioctl_kernel+0x156/0x1b0
[ 1413.565022] drm_ioctl+0x2ff/0x480
[ 1413.565046] __x64_sys_ioctl+0x87/0xd0
[ 1413.565069] do_syscall_64+0x4d/0x80
[ 1413.565094] entry_SYSCALL_64_after_hwframe+0x44/0xa9
To complicate matters, we have to both avoid the read tearing of *active and
avoid any write tearing as perform the pending[] -> inflight[] promotion of the
execlists.
This is because we cannot rely on the memcpy doing u64 aligned copies on all
kernels/platforms and so we opt to open-code it with explicit WRITE_ONCE
annotations to satisfy KCSAN.
v2: When in doubt, write the same comment again.
v3: Expanded commit message.
Fixes: b55230e5e800 ("drm/i915: Check for awaits on still currently executing requests")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
[Joonas: Rebased and reordered into drm-intel-gt-next branch]
[Joonas: Added expanded commit message from Tvrtko and Chris]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Use ww locking for pin_to_display_plane for all the pinning and locking.
With the locking removed from set_cache_level, we need to fix
i915_gem_set_caching_ioctl to take the object reservation lock.
As this is a single lock, we don't need to use the ww dance.
Changes since v1:
- Do not use ww locking in i915_gem_set_caching_ioctl (Thomas).
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We want to start requiring the reservation_lock instead of obj->mm.lock
for pinning objects, take the ww lock inside vm_fault_gtt as a first step
towards the legacy lock removal.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Make sure vma_lock is not used as inner lock when kernel context is used,
and add ww handling where appropriate.
Ensure that execbuf selftests keep passing by using ww handling.
Changes since v2:
- Fix i915_gem_context finally.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We want to get rid of intel_context_pin(), convert
intel_context_create_request() first. :)
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This function does not use intel_context_create_request, so it has
to use the same locking order as normal code. This is required to
shut up lockdep in selftests.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Some i915 selftests still use i915_vma_lock() as inner lock, and
intel_context_create_request() intel_timeline->mutex as outer lock.
Fortunately for selftests this is not an issue, they should be fixed
but we can move ahead and cleanify lockdep now.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We have the ordering of timeline->mutex vs resv_lock wrong,
convert the i915_pin_vma and intel_context_pin as well to
future-proof this.
We may need to do future changes to do this more transaction-like,
and only get down to a single i915_gem_ww_ctx, but for now this
should work.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Instead of using intel_context_create_request(), use intel_context_pin()
and i915_create_request directly.
Now all those calls are gone outside of selftests. :)
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This is the last part outside of selftests that still don't use the
correct lock ordering of timeline->mutex vs resv_lock.
With gem fixed, there are a few places that still get locking wrong:
- gvt/scheduler.c
- i915_perf.c
- Most if not all selftests.
Changes since v1:
- Add intel_engine_pm_get/put() calls to fix use-after-free when using
intel_engine_get_pool().
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
As a preparation step for full object locking and wait/wound handling
during pin and object mapping, ensure that we always pass the ww context
in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this
happens.
This also requires changing the order of eb_parse slightly, to ensure
we pass ww at a point where we could still handle -EDEADLK safely.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Instead of doing everything inside of pin_mutex, we move all pinning
outside. Because i915_active has its own reference counting and
pinning is also having the same issues vs mutexes, we make sure
everything is pinned first, so the pinning in i915_active only needs
to bump refcounts. This allows us to take pin refcounts correctly
all the time.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We want to lock all gem objects, including the engine context objects,
rework the throttling to ensure that we can do this. Now we only throttle
once, but can take eb_pin_engine while acquiring objects. This means we
will have to drop the lock to wait. If we don't have to throttle we can
still take the fastpath, if not we will take the slowpath and wait for
the throttle request while unlocked.
The engine has to be pinned as first step, otherwise gpu relocations
won't work.
Changes since v1:
- Only need to get a throttled request in the fastpath, no need for
a global flag any more.
- Always free the waited request correctly.
Changes since v2:
- Use intel_engine_pm_get()/put() to keeep engine pool alive during
EDEADLK handling.
Changes since v3:
- Fix small rq leak.
Changes since v4:
- Use a single reloc_context, for intel_context_pin_ww().
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Those arguments are already set as eb.file and eb.args, so kill off
the extra arguments. This will allow us to move eb_pin_engine() to
after we reserved all BO's.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This is required if we want to pass a ww context in intel_context_pin
and gen6_ppgtt_pin().
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We want to start using ww locking in intel_context_pin, for this
we need to lock multiple objects, and the single i915_gem_object_lock
is not enough.
Convert to using ww-waiting, and make sure we always pin intel_context_state,
even if we don't have a renderstate object.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Now that we changed execbuf submission slightly to allow us to do all
pinning in one place, we can now simply add ww versions on top of
struct_mutex. All we have to do is a separate path for -EDEADLK
handling, which needs to unpin all gem bo's before dropping the lock,
then starting over.
This finally allows us to do parallel submission, but because not
all of the pinning code uses the ww ctx yet, we cannot completely
drop struct_mutex yet.
Changes since v1:
- Keep struct_mutex for now. :(
Changes since v2:
- Make sure we always lock the ww context in slowpath.
Changes since v3:
- Don't call __eb_unreserve_vma in eb_move_to_gpu now; this can be
done on normal unlock path.
- Unconditionally release vmas and context.
Changes since v4:
- Rebased on top of struct_mutex reduction.
Changes since v5:
- Remove training wheels.
Changes since v6:
- Fix accidentally broken -ENOSPC handling.
Changes since v7:
- Handle gt buffer pool better.
Changes since v8:
- Properly clear variables, to make -EDEADLK handling not BUG.
Change since v9:
- Fix unpinning fence on pnv and below.
Changes since v10:
- Make relocation gpu chaining working again.
Changes since v11:
- Remove relocation chaining, pain to make it work.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
We want to introduce backoff logic, but we need to lock the
pool object as well for command parsing. Because of this, we
will need backoff logic for the engine pool obj, move the batch
validation up slightly to eb_lookup_vmas, and the actual command
parsing in a separate function which can get called from execbuf
relocation fast and slowpath.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Execbuffer submission will perform its own WW locking, and we
cannot rely on the implicit lock there.
This also makes it clear that the GVT code will get a lockdep splat when
multiple batchbuffer shadows need to be performed in the same instance,
fix that up.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
i915_gem_ww_ctx is used to lock all gem bo's for pinning and memory
eviction. We don't use it yet, but lets start adding the definition
first.
To use it, we have to pass a non-NULL ww to gem_object_lock, and don't
unlock directly. It is done in i915_gem_ww_ctx_fini.
Changes since v1:
- Change ww_ctx and obj order in locking functions (Jonas Lahtinen)
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This reverts commit 0f1dd02295f3 ("drm/i915/gem: Split eb_vma into
its own allocation") and also moves all unreserving to a single
place at the end, which is a minor simplification.
With the WW locking, we will drop all references only at the
end when unlocking, so refcounting can now be removed.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This reverts commit 7dc8f1143778 ("drm/i915/gem: Drop relocation
slowpath"). We need the slowpath relocation for taking ww-mutex
inside the page fault handler, and we will take this mutex when
pinning all objects.
We also functionally revert ef398881d27d ("drm/i915/gem: Limit
struct_mutex to eb_reserve"), as we need the struct_mutex in
the slowpath as well, and a tiny part of 003d8b9143a6 ("drm/i915/gem:
Only call eb_lookup_vma once during execbuf ioctl"). Specifically,
we make the -EAGAIN handling part of fallback to slowpath again.
With this, we have a proper working slowpath again, which
will allow us to do fault handling with WW locks held.
[mlankhorst: Adjusted for reloc_gpu_flush() changes]
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
[mlankhorst: Removed extra reloc_gpu_flush()]
Reviewed-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This reverts commit 964a9b0f611ee ("drm/i915/gem: Use chained reloc batches")
and commit 0e97fbb080553 ("drm/i915/gem: Use a single chained reloc batches
for a single execbuf").
When adding ww locking to execbuf, it's hard enough to deal with a
single BO that is part of relocation execution. Chaining is hard to
get right, and with GPU relocation deprecated, it's best to drop this
altogether, instead of trying to fix something we will remove.
This is not a completely 1:1 revert, we reset rq_size to 0 in
reloc_cache_init, this was from e3d291301f99 ("drm/i915/gem: Implement legacy
MI_STORE_DATA_IMM"), because we don't want to break the selftests. (Daniel)
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
This reverts commit 9e0f9464e2ab ("drm/i915/gem: Async GPU relocations only"),
and related commit 7ac2d2536dfa7 ("drm/i915/gem: Delete unused code").
Async GPU relocations are not the path forward, we want to remove
GPU accelerated relocation support eventually when userspace is fixed
to use VM_BIND, and this is the first step towards that. We will keep
async gpu relocations around for now, until userspace is fixed.
Relocation support will be disabled completely on platforms where there
was never any userspace that depends on it, as the hardware doesn't
require it from at least gen9+ onward. For older platforms, the plan
is to use cpu relocations only.
The igt side is fixed in igt commit 39e9aa1032a4e ("tests/i915: Remove
subtests that rely on async relocation behavior").
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
If dma_fence_chain_find_seqno() reports an error, it does so in its
preamble before it disposes of the input fence. On handling the
error, we need to drop the reference to the fence.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2292
Signed-off-by: Chris Wilson <[email protected]>
Cc: Lionel Landwerlin <[email protected]>
Fixes: 13149e8bafc4 ("drm/i915: add syncobj timeline support")
Reviewed-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Joonas Lahtinen <[email protected]>
|