|
Adding a DC3CO counter to the i915_dmc_info debugfs file is
useful for DC3CO validation.
The DMC firmware uses the DMC_DEBUG3 register as the DC3CO counter
register on TGL; as per B.Specs, DMC_DEBUG3 is a general
purpose register.
v1: Modified the comment for DMC_DEBUG3.
Using a GEN >= 12 check instead of IS_TIGERLAKE()
to print the DMC_DEBUG3 counter value.
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
DC3CO is a useful power state when DMC detects PSR2 idle frames
during active video playback; playing 30fps video on a 60Hz panel
is the classic example of this use case.
B.Specs:49196 has a restriction to enable DC3CO only for video playback.
Because the driver doesn't differentiate between video playback and a
normal pageflip, it is worthwhile to enable DC3CO after the completion
of each pageflip and switch back to DC5 when the display is idle.
We will use the frontbuffer flush call tgl_dc3co_flush() to enable the
DC3CO state only for an ORIGIN_FLIP flush call, because the DC3CO state
is primarily targeted at the VPB use case. We are not interested here in
frontbuffer invalidate calls because they trigger a PSR2 exit, which will
explicitly disable DC3CO.
DC5 and DC6 save more power, but can't be entered during video
playback because there are not enough idle frames in a row to meet
most PSR2 panels' deep sleep entry requirement, typically 4 frames.
As the existing PSR2 implementation uses a minimum of 6 idle frames for
deep sleep, it is safer to enable DC5/6 after 6 idle frames
(by scheduling a delayed work of 6 idle frames once DC3CO has been
enabled after a pageflip).
After manually waiting for 6 idle frames, DC5/6 will be enabled and
the PSR2 deep sleep idle frames will be restored to 6; at this
point DMC will trigger DC5/6 once PSR2 enters deep sleep after
6 idle frames.
In the future, when we enable S/W PSR2 tracking, we can change the
PSR2 required deep sleep idle frames to 1 so that DMC can trigger
DC5/6 immediately after the S/W manual wait of 6 idle frames
completes.
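For illustration, the timing described above can be modelled in user-space C.
This is a minimal sketch, not the driver code: frame_time_us(),
dc3co_exit_delay_us() and dc3co_still_allowed() are hypothetical helper names,
and the only value taken from the text is the 6-idle-frame threshold.

```c
#include <stdint.h>

/* Idle frames to wait before allowing DC5/DC6 (the figure quoted in the text). */
#define IDLE_FRAMES_BEFORE_DC5 6u

/* Duration of one frame in microseconds, rounded up. refresh_hz must be non-zero. */
static uint32_t frame_time_us(uint32_t refresh_hz)
{
	return (1000000u + refresh_hz - 1u) / refresh_hz;
}

/* Delay after the last page flip before leaving DC3CO for DC5/DC6. */
static uint32_t dc3co_exit_delay_us(uint32_t refresh_hz)
{
	return IDLE_FRAMES_BEFORE_DC5 * frame_time_us(refresh_hz);
}

/*
 * Returns 1 while DC3CO should stay selected at time now_us, given that the
 * last flip happened at last_flip_us (now_us >= last_flip_us is assumed).
 * Every new flip moves last_flip_us forward, which restarts the window in
 * the same way that re-arming a delayed work would.
 */
static int dc3co_still_allowed(uint64_t now_us, uint64_t last_flip_us,
			       uint32_t refresh_hz)
{
	return now_us - last_flip_us < dc3co_exit_delay_us(refresh_hz);
}
```

With 30fps video on a 60Hz panel a flip arrives every second frame, so the
window in this model never expires during playback and only runs out once
flips stop.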
v2: Calculated the s/w state to switch over to DC3CO when there is an
update. [Imre]
Used cancel_delayed_work_sync() in order to avoid any race
with already scheduled delayed work. [Imre]
v3: cancel_delayed_work_sync() may block the commit work,
hence dropping it; dc5_idle_thread() checks for a valid wakeref before
putting the reference count, which avoids any chance of dropping
a zero wakeref. [Imre (IRC)]
v4: Used frontbuffer flush mechanism. [Imre]
v5: Used psr.pipe to extract frontbuffer busy bits. [Imre]
Used cancel_delayed_work_sync() in encoder disable path. [Imre]
Used mod_delayed_work() instead of cancelling and scheduling a
delayed work. [Imre]
Used psr.lock in tgl_dc5_idle_thread() to enable psr2 deep
sleep. [Imre]
Removed DC5_REQ_IDLE_FRAMES macro. [Imre]
v6: Used dc3co_exitline check instead of TGL and dc3co allowed_dc_mask
checks, used delayed_work_pending with the psr lock and removed the
psr2_deep_slp_disabled flag. [Imre]
v7: Code refactoring, moved most of the functional code to intel_psr.c. [Imre]
Using frontbuffer_bits on psr.pipe check instead of
busy_frontbuffer_bits. [Imre]
Calculating dc3co_exit_delay in intel_psr_enable_locked. [Imre]
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The B.Specs DC3CO enabling sequence requires enabling and configuring
the exit scanlines in the TRANS_EXITLINE register. Programming this register
has to be part of the modeset sequence, as it can't be changed while the
transcoder or port is enabled.
When the system boots with only an eDP panel there may not be a real
modeset, as the BIOS has already programmed the necessary registers;
therefore the driver needs to force a modeset to enable and configure
the DC3CO exitline.
v1: Computing dc3co_exitline crtc state from a DP encoder
compute config. [Imre]
Enabling and disabling DC3CO PSR2 transcoder exitline from
encoder pre_enable and post_disable hooks. [Imre]
Computing dc3co_exitline instead of has_dc3co_exitline bool. [Imre]
v2: Code refactoring for symmetry and to avoid exported function. [Imre]
Removing IS_TIGERLAKE check from compute_config, adding PIPE_A
restriction and clearing dc3co_exitline state if crtc is not active
or it is not PSR2 capable in dc3co exitline compute_config. [Imre]
Using GEN >= 12 check in dc3co exitline get_config. [Imre]
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add target_dc_state, used by the set_target_dc_state API,
in order to enable the DC3CO state alongside the existing DC states.
target_dc_state will enable/disable the desired DC state in the
DC_STATE_EN reg when the "DC Off" power well gets disabled/enabled.
v2: commit log improvement.
v3: Used intel_wait_for_register to wait for DC3CO exit. [Imre]
Used gen9_set_dc_state() to allow/disallow DC3CO. [Imre]
Moved transcoder psr2 exit line enablement from tgl_allow_dc3co()
to an appropriate place, haswell_crtc_enable(). [Imre]
Changed the DC3CO power well enabled call back logic as
recommended in review comments. [Imre]
v4: Used wait_for_us() instead of intel_wait_for_reg(). [Imre (IRC)]
v5: using udelay() instead of waiting for DC3CO exit status.
v6: Fixed minor unwanted change.
v7: Removed DC3CO powerwell and POWER_DOMAIN_VIDEO.
v8: Uniform checks by using only target_dc_state instead of allowed_dc_mask
in "DC off" power well callback. [Imre]
Adding "DC off" power well id to older platforms. [Imre]
Removed psr2_deep_sleep flag from tgl_set_target_dc_state. [Imre]
v9: Used switch case for target DC state in
gen9_dc_off_power_well_disable(), checking DC3CO state against
allowed DC mask, using WARN_ON() in
tgl_set_target_dc_state(). [Imre]
v10: Code refactoring and using sanitize_target_dc_state(). [Imre]
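For illustration, checking a requested DC state against the allowed DC mask
could look roughly like the sketch below. It is a user-space model under
stated assumptions: the bit values and the sanitize_target() name are
hypothetical, and the fallback order (DC6, DC5, DC3CO, disabled) is an
assumption based on DC5/DC6 being the deeper states.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only; not the i915 definitions. */
#define DC_DISABLE   0u
#define DC_UPTO_DC5  (1u << 0)
#define DC_UPTO_DC6  (1u << 1)
#define DC_DC3CO     (1u << 2)

/*
 * Return target if the allowed mask permits it; otherwise fall back to the
 * next shallower state that is allowed, ending at DC_DISABLE.
 */
static uint32_t sanitize_target(uint32_t allowed_mask, uint32_t target)
{
	/* Ordered from deepest to shallowest state. */
	static const uint32_t states[] = {
		DC_UPTO_DC6, DC_UPTO_DC5, DC_DC3CO, DC_DISABLE,
	};
	const size_t n = sizeof(states) / sizeof(states[0]);
	size_t i;

	/* Find the requested state; unknown values end up at DC_DISABLE. */
	for (i = 0; i < n - 1; i++)
		if (states[i] == target)
			break;

	/* Walk towards shallower states until one is allowed. */
	for (; i < n - 1; i++)
		if (states[i] & allowed_mask)
			break;

	return states[i];
}
```

A request for DC6 with only DC5 and DC3CO allowed resolves to DC5 in this
model, and any request resolves to DC_DISABLE when the mask is empty.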
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Enable the dc3co state in the enable_dc module param and add the dc3co
enable mask to allowed_dc_mask and gen9_dc_mask.
v1: Adding enable_dc=3,4 options to enable DC3CO with DC5 and DC6
independently. [Animesh]
v2: Using a switch statement for cleaner code. [Animesh]
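For illustration, a switch-based mapping from the enable_dc value to a mask
might look like the sketch below. The bit values and the allowed_dc_mask()
function name are hypothetical, and the meaning of values 0 to 2 (disabled,
up to DC5, up to DC6) is an assumption; only values 3 and 4 (DC3CO combined
with DC5 or DC6) come from the notes above.

```c
#include <stdint.h>

/* Hypothetical bit values, for illustration only; not the i915 definitions. */
#define DC_UPTO_DC5  (1u << 0)
#define DC_UPTO_DC6  (1u << 1)
#define DC_DC3CO     (1u << 2)

/* Map an enable_dc style parameter to a mask of permitted DC states. */
static uint32_t allowed_dc_mask(int enable_dc)
{
	switch (enable_dc) {
	case 1:		/* up to DC5 */
		return DC_UPTO_DC5;
	case 2:		/* up to DC6, which also permits DC5 */
		return DC_UPTO_DC5 | DC_UPTO_DC6;
	case 3:		/* DC3CO plus up to DC5 */
		return DC_DC3CO | DC_UPTO_DC5;
	case 4:		/* DC3CO plus up to DC6 */
		return DC_DC3CO | DC_UPTO_DC5 | DC_UPTO_DC6;
	default:	/* 0 or anything unexpected: no DC states */
		return 0;
	}
}
```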
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add the following definitions to i915_reg.h:
1. DC_STATE_EN register DC3CO bit fields and masks.
The DC3CO enable bit will be used by the driver to make DC3CO
ready for the DMC f/w, and the status bit will be used as the DC3CO
entry status.
2. Transcoder EXITLINE register and its bit fields and mask.
The transcoder EXITLINE enable bit indicates that the PSR2 idle frame
reset should be applied at the exit line, and the exitlines mask
represents the required number of scanlines at which DC3CO
exit happens.
B.Specs:49196
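For illustration, an enable bit plus a scanline field can be packed and
unpacked as in the sketch below. The REG_BIT()/REG_GENMASK() macros here are
simplified user-space stand-ins for the kernel helpers, and the bit positions
(bit 31 for enable, bits 12:0 for the scanline count) are assumptions made
for the example, not values taken from this commit.

```c
#include <stdint.h>

/* Simplified user-space stand-ins for the kernel's register helpers. */
#define REG_BIT(n)             ((uint32_t)1 << (n))
#define REG_GENMASK(high, low) \
	((((uint32_t)~0u) >> (31 - (high))) & ~(REG_BIT(low) - 1u))

/* Hypothetical bit positions, chosen for illustration only. */
#define EXITLINE_ENABLE  REG_BIT(31)
#define EXITLINE_MASK    REG_GENMASK(12, 0)

/* Build a register value: enable bit plus the exit scanline count. */
static uint32_t exitline_pack(uint32_t scanlines)
{
	return EXITLINE_ENABLE | (scanlines & EXITLINE_MASK);
}

/* Extract the scanline count, or 0 when the enable bit is clear. */
static uint32_t exitline_unpack(uint32_t reg)
{
	if (!(reg & EXITLINE_ENABLE))
		return 0;
	return reg & EXITLINE_MASK;
}
```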
v1: Use of REG_BIT and using extra space for EXITLINE_ macro
definition. [Animesh]
v2: Grouping EXITLINE reg bits with EXITLINE(trans) define,
no functional change. [Ville]
Cc: Jani Nikula <[email protected]>
Cc: Imre Deak <[email protected]>
Cc: Animesh Manna <[email protected]>
Reviewed-by: Animesh Manna <[email protected]>
Reviewed-by: Imre Deak <[email protected]>
Signed-off-by: Anshuman Gupta <[email protected]>
Signed-off-by: Imre Deak <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The BKL struct_mutex is no more, the only serialisation we required for
setting the exclusive stream is already managed by ce->pin_mutex in
gen8_configure_all_contexts(). As such, we can manipulate
i915_perf.exclusive_stream underneath our own (already held) perf->lock.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Cc: Lionel Landwerlin <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Use the local uncore accessors for the GT rather than using the [not-so]
magic global dev_priv mmio routines. In the process, we also teach the
perf stream to use backpointers to the i915_perf rather than digging it
out of dev_priv.
v2: Rebase onto i915_perf_types.h
Signed-off-by: Chris Wilson <[email protected]>
Cc: Umesh Nerlige Ramappa <[email protected]>
Cc: Lionel Landwerlin <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Adjust indentation from spaces to a tab (+optional two spaces) as in
the coding style, with a command like:
$ sed -e 's/^ /\t/' -i */Kconfig
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Never allow userptr into the mappable GGTT (Chris)
No existing users. Avoid anyone from even trying to
spare a deadlock scenario.
Cross-subsystem Changes:
Core Changes:
Driver Changes:
- Eliminate struct_mutex use as BKL! (Chris)
Only used for execbuf serialisation.
- Initialize DDI TC and TBT ports (D-I) on Tigerlake (Lucas)
- Fix DKL link training for 2.7GHz and 1.62GHz (Jose)
- Add Tigerlake DKL PHY programming sequences (Clinton)
- Add Tigerlake Thunderbolt PLL divider values (Imre)
- drm/i915: Use helpers for drm_mm_node booleans (Chris)
- Restrict L3 remapping sysfs interface to dwords (Chris)
- Fix audio power up sequence for gen10+ display (Kai)
- Skip redundant execlist resubmission (Chris)
- Only unwedge if we can reset GPU first (Chris)
- Initialise breadcrumb lists on the virtual engine (Chris)
- Don't rely on kernel context existing during early errors (Matt A)
- Update Icelake+ MG_DP_MODE programming table (Clinton)
- Update DMC firmware for Icelake (Anusha)
- Downgrade DP MST error after unplugging TypeC cable (Srinivasan)
- Limit MST modes based on plane size too (Ville)
- Polish intel_tv_mode_valid() (Ville)
- Fix g4x sprite scaling stride check with GTT remapping (Ville)
- Don't advertise non-existing crtcs (Ville)
- Clean up encoder->crtc_mask setup (Ville)
- Use tc_port instead of port parameter to MG registers (Jose)
- Remove static variable for aux last status (Jani)
- Implement a better i945gm vblank irq vs. C-states workaround (Ville)
- Make the object creation interface consistent (CQ)
- Rename intel_vga_msr_write() to intel_vga_reset_io_mem() (Jani, Ville)
- Eliminate previous drm_dbg/drm_err usage (Jani)
- Move gmbus setup down to intel_modeset_init() (Jani)
- Abstract all vgaarb access to intel_vga.[ch] (Jani)
- Split out i915_switcheroo.[ch] from i915_drv.c (Jani)
- Use intel_gt in has_reset* (Chris)
- Eliminate return value for i915_gem_init_early (Matt A)
- Selftest improvements (Chris)
- Update HuC firmware header version number format (Daniele)
Signed-off-by: Dave Airlie <[email protected]>
From: Joonas Lahtinen <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
If we cannot claim the timeline->mutex while preparing for a wait on it,
we have to skip the timeline. In doing so, treat it as active so that
under an intel_gt_wait_for_idle() loop, we repeat the wait after
scheduling away.
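For illustration, the "skip but count as active" idea can be modelled in
user-space C. This is a toy sketch under stated assumptions: struct timeline,
timeline_trylock() and retire_pass() are hypothetical names, and a plain
boolean stands in for timeline->mutex being held by someone else.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy timeline: 'locked' stands in for the mutex being held elsewhere. */
struct timeline {
	bool locked;
	bool busy;	/* has outstanding work */
};

/* Non-blocking lock attempt; returns false if the lock is already held. */
static bool timeline_trylock(struct timeline *tl)
{
	if (tl->locked)
		return false;
	tl->locked = true;
	return true;
}

static void timeline_unlock(struct timeline *tl)
{
	tl->locked = false;
}

/*
 * One pass over all timelines; returns how many are still considered active.
 * A timeline whose lock cannot be taken is skipped but counted as active,
 * so a caller looping until the count reaches zero will retry it later.
 */
static size_t retire_pass(struct timeline *tls, size_t n)
{
	size_t active = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!timeline_trylock(&tls[i])) {
			active++;	/* contended: assume it is busy */
			continue;
		}
		if (tls[i].busy)
			active++;
		timeline_unlock(&tls[i]);
	}
	return active;
}
```

Counting a contended timeline as idle would let the wait loop exit early
even though work may still be pending on it, which is the case the commit
message guards against.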
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Disable irqs around updating the context image to keep lockdep happy:
<4>[ 673.483340] WARNING: possible irq lock inversion dependency detected
<4>[ 673.483342] 5.4.0-rc1-CI-Trybot_5118+ #1 Tainted: G U
<4>[ 673.483342] --------------------------------------------------------
<4>[ 673.483343] swapper/2/0 just changed the state of lock:
<4>[ 673.483344] ffff88845db885a0 (&i915_request_get(rq)->submit/1){-...}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.483387] but this lock took another, HARDIRQ-unsafe lock in the past:
<4>[ 673.483388] (&ce->pin_mutex/2){+...}
<4>[ 673.483389]
and interrupts could create inverse lock ordering between them.
<4>[ 673.483390]
other info that might help us debug this:
<4>[ 673.483390] Chain exists of:
&i915_request_get(rq)->submit/1 --> &engine->active.lock --> &ce->pin_mutex/2
<4>[ 673.483392] Possible interrupt unsafe locking scenario:
<4>[ 673.483392] CPU0 CPU1
<4>[ 673.483393] ---- ----
<4>[ 673.483393] lock(&ce->pin_mutex/2);
<4>[ 673.483394] local_irq_disable();
<4>[ 673.483395] lock(&i915_request_get(rq)->submit/1);
<4>[ 673.483396] lock(&engine->active.lock);
<4>[ 673.483396] <Interrupt>
<4>[ 673.483397] lock(&i915_request_get(rq)->submit/1);
<4>[ 673.483398]
*** DEADLOCK ***
<4>[ 673.483398] 2 locks held by swapper/2/0:
<4>[ 673.483399] #0: ffff8883f61ac9b0 (&(&gt->irq_lock)->rlock){-.-.}, at: gen11_gt_irq_handler+0x42/0x280 [i915]
<4>[ 673.483433] #1: ffff88845db8c418 (&(&rq->lock)->rlock){-.-.}, at: intel_engine_breadcrumbs_irq+0x34a/0x5a0 [i915]
<4>[ 673.483463]
the shortest dependencies between 2nd lock and 1st lock:
<4>[ 673.483466] -> (&ce->pin_mutex/2){+...} ops: 614520 {
<4>[ 673.483468] HARDIRQ-ON-W at:
<4>[ 673.483471] lock_acquire+0xa7/0x1c0
<4>[ 673.483501] live_unlite_restore+0x1d8/0x6c0 [i915]
<4>[ 673.483543] __i915_subtests+0xb8/0x210 [i915]
<4>[ 673.483581] __run_selftests+0x112/0x170 [i915]
<4>[ 673.483615] i915_live_selftests+0x2c/0x60 [i915]
<4>[ 673.483644] i915_pci_probe+0x93/0x1b0 [i915]
<4>[ 673.483646] pci_device_probe+0x9e/0x120
<4>[ 673.483648] really_probe+0xea/0x420
<4>[ 673.483649] driver_probe_device+0x10b/0x120
<4>[ 673.483651] device_driver_attach+0x4a/0x50
<4>[ 673.483652] __driver_attach+0x97/0x130
<4>[ 673.483653] bus_for_each_dev+0x74/0xc0
<4>[ 673.483654] bus_add_driver+0x142/0x220
<4>[ 673.483655] driver_register+0x56/0xf0
<4>[ 673.483657] do_one_initcall+0x58/0x2ff
<4>[ 673.483659] do_init_module+0x56/0x1f8
<4>[ 673.483660] load_module+0x243e/0x29f0
<4>[ 673.483661] __do_sys_finit_module+0xe9/0x110
<4>[ 673.483662] do_syscall_64+0x4f/0x210
<4>[ 673.483665] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 673.483665] INITIAL USE at:
<4>[ 673.483667] lock_acquire+0xa7/0x1c0
<4>[ 673.483698] live_unlite_restore+0x1d8/0x6c0 [i915]
<4>[ 673.483733] __i915_subtests+0xb8/0x210 [i915]
<4>[ 673.483764] __run_selftests+0x112/0x170 [i915]
<4>[ 673.483793] i915_live_selftests+0x2c/0x60 [i915]
<4>[ 673.483821] i915_pci_probe+0x93/0x1b0 [i915]
<4>[ 673.483822] pci_device_probe+0x9e/0x120
<4>[ 673.483824] really_probe+0xea/0x420
<4>[ 673.483825] driver_probe_device+0x10b/0x120
<4>[ 673.483826] device_driver_attach+0x4a/0x50
<4>[ 673.483827] __driver_attach+0x97/0x130
<4>[ 673.483828] bus_for_each_dev+0x74/0xc0
<4>[ 673.483829] bus_add_driver+0x142/0x220
<4>[ 673.483830] driver_register+0x56/0xf0
<4>[ 673.483831] do_one_initcall+0x58/0x2ff
<4>[ 673.483833] do_init_module+0x56/0x1f8
<4>[ 673.483834] load_module+0x243e/0x29f0
<4>[ 673.483835] __do_sys_finit_module+0xe9/0x110
<4>[ 673.483836] do_syscall_64+0x4f/0x210
<4>[ 673.483837] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 673.483838] }
<4>[ 673.483868] ... key at: [<ffffffffa0a8f132>] __key.70113+0x2/0xffffffffffef2ed0 [i915]
<4>[ 673.483869] ... acquired at:
<4>[ 673.483935] __execlists_reset+0xfb/0xc20 [i915]
<4>[ 673.483965] execlists_reset+0x3d/0x50 [i915]
<4>[ 673.483995] intel_engine_reset+0xdf/0x230 [i915]
<4>[ 673.484022] live_preempt_hang+0x1d7/0x2e0 [i915]
<4>[ 673.484064] __i915_subtests+0xb8/0x210 [i915]
<4>[ 673.484130] __run_selftests+0x112/0x170 [i915]
<4>[ 673.484163] i915_live_selftests+0x2c/0x60 [i915]
<4>[ 673.484193] i915_pci_probe+0x93/0x1b0 [i915]
<4>[ 673.484194] pci_device_probe+0x9e/0x120
<4>[ 673.484195] really_probe+0xea/0x420
<4>[ 673.484196] driver_probe_device+0x10b/0x120
<4>[ 673.484197] device_driver_attach+0x4a/0x50
<4>[ 673.484198] __driver_attach+0x97/0x130
<4>[ 673.484199] bus_for_each_dev+0x74/0xc0
<4>[ 673.484200] bus_add_driver+0x142/0x220
<4>[ 673.484202] driver_register+0x56/0xf0
<4>[ 673.484203] do_one_initcall+0x58/0x2ff
<4>[ 673.484204] do_init_module+0x56/0x1f8
<4>[ 673.484205] load_module+0x243e/0x29f0
<4>[ 673.484206] __do_sys_finit_module+0xe9/0x110
<4>[ 673.484207] do_syscall_64+0x4f/0x210
<4>[ 673.484208] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 673.484209] -> (&engine->active.lock){..-.} ops: 972791 {
<4>[ 673.484211] IN-SOFTIRQ-W at:
<4>[ 673.484213] lock_acquire+0xa7/0x1c0
<4>[ 673.484214] _raw_spin_lock_irqsave+0x33/0x50
<4>[ 673.484244] execlists_submission_tasklet+0xaf/0x100 [i915]
<4>[ 673.484246] tasklet_action_common.isra.18+0x6c/0x1c0
<4>[ 673.484247] __do_softirq+0xdf/0x47f
<4>[ 673.484248] irq_exit+0xba/0xc0
<4>[ 673.484249] do_IRQ+0x83/0x160
<4>[ 673.484250] ret_from_intr+0x0/0x1d
<4>[ 673.484252] cpuidle_enter_state+0xb2/0x450
<4>[ 673.484253] cpuidle_enter+0x24/0x40
<4>[ 673.484254] do_idle+0x1e7/0x250
<4>[ 673.484256] cpu_startup_entry+0x14/0x20
<4>[ 673.484257] start_secondary+0x15f/0x1b0
<4>[ 673.484258] secondary_startup_64+0xa4/0xb0
<4>[ 673.484259] INITIAL USE at:
<4>[ 673.484261] lock_acquire+0xa7/0x1c0
<4>[ 673.484290] intel_engine_init_active+0x7e/0xb0 [i915]
<4>[ 673.484305] intel_engines_setup+0x1cd/0x3b0 [i915]
<4>[ 673.484305] i915_gem_init+0x12d/0x900 [i915]
<4>[ 673.484305] i915_driver_probe+0xb70/0x15d0 [i915]
<4>[ 673.484305] i915_pci_probe+0x43/0x1b0 [i915]
<4>[ 673.484305] pci_device_probe+0x9e/0x120
<4>[ 673.484305] really_probe+0xea/0x420
<4>[ 673.484305] driver_probe_device+0x10b/0x120
<4>[ 673.484305] device_driver_attach+0x4a/0x50
<4>[ 673.484305] __driver_attach+0x97/0x130
<4>[ 673.484305] bus_for_each_dev+0x74/0xc0
<4>[ 673.484305] bus_add_driver+0x142/0x220
<4>[ 673.484305] driver_register+0x56/0xf0
<4>[ 673.484305] do_one_initcall+0x58/0x2ff
<4>[ 673.484305] do_init_module+0x56/0x1f8
<4>[ 673.484305] load_module+0x243e/0x29f0
<4>[ 673.484305] __do_sys_finit_module+0xe9/0x110
<4>[ 673.484305] do_syscall_64+0x4f/0x210
<4>[ 673.484305] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 673.484305] }
<4>[ 673.484305] ... key at: [<ffffffffa0a8f160>] __key.70307+0x0/0xffffffffffef2ea0 [i915]
<4>[ 673.484305] ... acquired at:
<4>[ 673.484305] _raw_spin_lock_irqsave+0x33/0x50
<4>[ 673.484305] execlists_submit_request+0x2b/0x1e0 [i915]
<4>[ 673.484305] submit_notify+0xa8/0x13c [i915]
<4>[ 673.484305] __i915_sw_fence_complete+0x81/0x250 [i915]
<4>[ 673.484305] i915_sw_fence_wake+0x51/0x70 [i915]
<4>[ 673.484305] __i915_sw_fence_complete+0x1ee/0x250 [i915]
<4>[ 673.484305] dma_i915_sw_fence_wake+0x1b/0x30 [i915]
<4>[ 673.484305] dma_fence_signal_locked+0x9e/0x1b0
<4>[ 673.484305] dma_fence_signal+0x1f/0x40
<4>[ 673.484305] fence_work+0x28/0x80 [i915]
<4>[ 673.484305] process_one_work+0x26a/0x620
<4>[ 673.484305] worker_thread+0x37/0x380
<4>[ 673.484305] kthread+0x119/0x130
<4>[ 673.484305] ret_from_fork+0x24/0x50
<4>[ 673.484305] -> (&i915_request_get(rq)->submit/1){-...} ops: 857694 {
<4>[ 673.484305] IN-HARDIRQ-W at:
<4>[ 673.484305] lock_acquire+0xa7/0x1c0
<4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915]
<4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915]
<4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0
<4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70
<4>[ 673.484305] handle_irq_event+0x2f/0x50
<4>[ 673.484305] handle_edge_irq+0x99/0x1b0
<4>[ 673.484305] do_IRQ+0x7e/0x160
<4>[ 673.484305] ret_from_intr+0x0/0x1d
<4>[ 673.484305] cpuidle_enter_state+0xb2/0x450
<4>[ 673.484305] cpuidle_enter+0x24/0x40
<4>[ 673.484305] do_idle+0x1e7/0x250
<4>[ 673.484305] cpu_startup_entry+0x14/0x20
<4>[ 673.484305] start_secondary+0x15f/0x1b0
<4>[ 673.484305] secondary_startup_64+0xa4/0xb0
<4>[ 673.484305] INITIAL USE at:
<4>[ 673.484305] lock_acquire+0xa7/0x1c0
<4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] __engine_park+0x233/0x420 [i915]
<4>[ 673.484305] ____intel_wakeref_put_last+0x1c/0x70 [i915]
<4>[ 673.484305] intel_gt_resume+0x202/0x2c0 [i915]
<4>[ 673.484305] i915_gem_init+0x36e/0x900 [i915]
<4>[ 673.484305] i915_driver_probe+0xb70/0x15d0 [i915]
<4>[ 673.484305] i915_pci_probe+0x43/0x1b0 [i915]
<4>[ 673.484305] pci_device_probe+0x9e/0x120
<4>[ 673.484305] really_probe+0xea/0x420
<4>[ 673.484305] driver_probe_device+0x10b/0x120
<4>[ 673.484305] device_driver_attach+0x4a/0x50
<4>[ 673.484305] __driver_attach+0x97/0x130
<4>[ 673.484305] bus_for_each_dev+0x74/0xc0
<4>[ 673.484305] bus_add_driver+0x142/0x220
<4>[ 673.484305] driver_register+0x56/0xf0
<4>[ 673.484305] do_one_initcall+0x58/0x2ff
<4>[ 673.484305] do_init_module+0x56/0x1f8
<4>[ 673.484305] load_module+0x243e/0x29f0
<4>[ 673.484305] __do_sys_finit_module+0xe9/0x110
<4>[ 673.484305] do_syscall_64+0x4f/0x210
<4>[ 673.484305] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 673.484305] }
<4>[ 673.484305] ... key at: [<ffffffffa0a8f6a1>] __key.80173+0x1/0xffffffffffef2960 [i915]
<4>[ 673.484305] ... acquired at:
<4>[ 673.484305] mark_lock+0x382/0x500
<4>[ 673.484305] __lock_acquire+0x7e1/0x15d0
<4>[ 673.484305] lock_acquire+0xa7/0x1c0
<4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915]
<4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915]
<4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0
<4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70
<4>[ 673.484305] handle_irq_event+0x2f/0x50
<4>[ 673.484305] handle_edge_irq+0x99/0x1b0
<4>[ 673.484305] do_IRQ+0x7e/0x160
<4>[ 673.484305] ret_from_intr+0x0/0x1d
<4>[ 673.484305] cpuidle_enter_state+0xb2/0x450
<4>[ 673.484305] cpuidle_enter+0x24/0x40
<4>[ 673.484305] do_idle+0x1e7/0x250
<4>[ 673.484305] cpu_startup_entry+0x14/0x20
<4>[ 673.484305] start_secondary+0x15f/0x1b0
<4>[ 673.484305] secondary_startup_64+0xa4/0xb0
<4>[ 673.484305]
stack backtrace:
<4>[ 673.484305] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G U 5.4.0-rc1-CI-Trybot_5118+ #1
<4>[ 673.484305] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4>[ 673.484305] Call Trace:
<4>[ 673.484305] <IRQ>
<4>[ 673.484305] dump_stack+0x67/0x9b
<4>[ 673.484305] check_usage_forwards+0x13c/0x150
<4>[ 673.484305] ? mark_lock+0x382/0x500
<4>[ 673.484305] mark_lock+0x382/0x500
<4>[ 673.484305] ? check_usage_backwards+0x140/0x140
<4>[ 673.484305] __lock_acquire+0x7e1/0x15d0
<4>[ 673.484305] ? debug_object_deactivate+0x17e/0x190
<4>[ 673.484305] lock_acquire+0xa7/0x1c0
<4>[ 673.484305] ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[ 673.484305] ? __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 673.484305] intel_engine_breadcrumbs_irq+0x3d0/0x5a0 [i915]
<4>[ 673.484305] cs_irq_handler+0x39/0x50 [i915]
<4>[ 673.484305] gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[ 673.484305] gen11_irq_handler+0x54/0xf0 [i915]
<4>[ 673.484305] __handle_irq_event_percpu+0x41/0x2c0
<4>[ 673.484305] handle_irq_event_percpu+0x2b/0x70
<4>[ 673.484305] handle_irq_event+0x2f/0x50
<4>[ 673.484305] handle_edge_irq+0x99/0x1b0
<4>[ 673.484305] do_IRQ+0x7e/0x160
<4>[ 673.484305] common_interrupt+0xf/0xf
<4>[ 673.484305] </IRQ>
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
As we may signal a request and take the engine->active.lock within the
signaler, the engine submission paths have to use a nested annotation on
their requests -- but we guarantee that we can never submit on the same
engine as the signaling fence.
<4>[ 723.763281] WARNING: possible circular locking dependency detected
<4>[ 723.763285] 5.3.0-g80fa0e042cdb-drmtip_379+ #1 Tainted: G U
<4>[ 723.763288] ------------------------------------------------------
<4>[ 723.763291] gem_exec_await/1388 is trying to acquire lock:
<4>[ 723.763294] ffff93a7b53221d8 (&engine->active.lock){..-.}, at: execlists_submit_request+0x2b/0x1e0 [i915]
<4>[ 723.763378]
but task is already holding lock:
<4>[ 723.763381] ffff93a7c25f6d20 (&i915_request_get(rq)->submit/1){-.-.}, at: __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 723.763420]
which lock already depends on the new lock.
<4>[ 723.763423]
the existing dependency chain (in reverse order) is:
<4>[ 723.763427]
-> #2 (&i915_request_get(rq)->submit/1){-.-.}:
<4>[ 723.763434] _raw_spin_lock_irqsave_nested+0x39/0x50
<4>[ 723.763478] __i915_sw_fence_complete+0x1b2/0x250 [i915]
<4>[ 723.763513] intel_engine_breadcrumbs_irq+0x3aa/0x5e0 [i915]
<4>[ 723.763600] cs_irq_handler+0x49/0x50 [i915]
<4>[ 723.763659] gen11_gt_irq_handler+0x17b/0x280 [i915]
<4>[ 723.763690] gen11_irq_handler+0x54/0xf0 [i915]
<4>[ 723.763695] __handle_irq_event_percpu+0x41/0x2d0
<4>[ 723.763699] handle_irq_event_percpu+0x2b/0x70
<4>[ 723.763702] handle_irq_event+0x2f/0x50
<4>[ 723.763706] handle_edge_irq+0xee/0x1a0
<4>[ 723.763709] do_IRQ+0x7e/0x160
<4>[ 723.763712] ret_from_intr+0x0/0x1d
<4>[ 723.763717] __slab_alloc.isra.28.constprop.33+0x4f/0x70
<4>[ 723.763720] kmem_cache_alloc+0x28d/0x2f0
<4>[ 723.763724] vm_area_dup+0x15/0x40
<4>[ 723.763727] dup_mm+0x2dd/0x550
<4>[ 723.763730] copy_process+0xf21/0x1ef0
<4>[ 723.763734] _do_fork+0x71/0x670
<4>[ 723.763737] __se_sys_clone+0x6e/0xa0
<4>[ 723.763741] do_syscall_64+0x4f/0x210
<4>[ 723.763744] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 723.763747]
-> #1 (&(&rq->lock)->rlock#2){-.-.}:
<4>[ 723.763752] _raw_spin_lock+0x2a/0x40
<4>[ 723.763789] __unwind_incomplete_requests+0x3eb/0x450 [i915]
<4>[ 723.763825] __execlists_submission_tasklet+0x9ec/0x1d60 [i915]
<4>[ 723.763864] execlists_submission_tasklet+0x34/0x50 [i915]
<4>[ 723.763874] tasklet_action_common.isra.5+0x47/0xb0
<4>[ 723.763878] __do_softirq+0xd8/0x4ae
<4>[ 723.763881] irq_exit+0xa9/0xc0
<4>[ 723.763883] smp_apic_timer_interrupt+0xb7/0x280
<4>[ 723.763887] apic_timer_interrupt+0xf/0x20
<4>[ 723.763892] cpuidle_enter_state+0xae/0x450
<4>[ 723.763895] cpuidle_enter+0x24/0x40
<4>[ 723.763899] do_idle+0x1e7/0x250
<4>[ 723.763902] cpu_startup_entry+0x14/0x20
<4>[ 723.763905] start_secondary+0x15f/0x1b0
<4>[ 723.763908] secondary_startup_64+0xa4/0xb0
<4>[ 723.763911]
-> #0 (&engine->active.lock){..-.}:
<4>[ 723.763916] __lock_acquire+0x15d8/0x1ea0
<4>[ 723.763919] lock_acquire+0xa6/0x1c0
<4>[ 723.763922] _raw_spin_lock_irqsave+0x33/0x50
<4>[ 723.763956] execlists_submit_request+0x2b/0x1e0 [i915]
<4>[ 723.764002] submit_notify+0xa8/0x13c [i915]
<4>[ 723.764035] __i915_sw_fence_complete+0x81/0x250 [i915]
<4>[ 723.764054] i915_sw_fence_wake+0x51/0x64 [i915]
<4>[ 723.764054] __i915_sw_fence_complete+0x1ee/0x250 [i915]
<4>[ 723.764054] dma_i915_sw_fence_wake_timer+0x14/0x20 [i915]
<4>[ 723.764054] dma_fence_signal_locked+0x9e/0x1c0
<4>[ 723.764054] dma_fence_signal+0x1f/0x40
<4>[ 723.764054] vgem_fence_signal_ioctl+0x67/0xc0 [vgem]
<4>[ 723.764054] drm_ioctl_kernel+0x83/0xf0
<4>[ 723.764054] drm_ioctl+0x2f3/0x3b0
<4>[ 723.764054] do_vfs_ioctl+0xa0/0x6f0
<4>[ 723.764054] ksys_ioctl+0x35/0x60
<4>[ 723.764054] __x64_sys_ioctl+0x11/0x20
<4>[ 723.764054] do_syscall_64+0x4f/0x210
<4>[ 723.764054] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 723.764054]
other info that might help us debug this:
<4>[ 723.764054] Chain exists of:
&engine->active.lock --> &(&rq->lock)->rlock#2 --> &i915_request_get(rq)->submit/1
<4>[ 723.764054] Possible unsafe locking scenario:
<4>[ 723.764054] CPU0 CPU1
<4>[ 723.764054] ---- ----
<4>[ 723.764054] lock(&i915_request_get(rq)->submit/1);
<4>[ 723.764054] lock(&(&rq->lock)->rlock#2);
<4>[ 723.764054] lock(&i915_request_get(rq)->submit/1);
<4>[ 723.764054] lock(&engine->active.lock);
<4>[ 723.764054]
*** DEADLOCK ***
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111862
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Avoid going to the base i915 device when we already have a path from gt
to the runtime power management interface. The benefit is that it looks a
bit more self-consistent to always be acquiring the gt->uncore->rpm for
use with the gt->uncore.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Daniele Ceraolo Spurio <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Don't populate the array hw_engine_mask on the stack but instead make it
static. Makes the object code smaller by 316 bytes.
Before:
text data bss dec hex filename
34004 4388 320 38712 9738 gpu/drm/i915/gt/intel_reset.o
After:
text data bss dec hex filename
33528 4548 320 38396 95fc gpu/drm/i915/gt/intel_reset.o
(gcc version 9.2.1, amd64)
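For illustration, the pattern looks like the sketch below. The engine ids,
bit values and the engines_to_hw_mask() name are hypothetical; only the idea
of making the lookup table static (and const) comes from the text.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical engine ids, for illustration only. */
enum engine_id { ENGINE_A, ENGINE_B, ENGINE_C, NUM_ENGINES };

/* Translate a mask of engine ids into a mask of hardware reset bits. */
static uint32_t engines_to_hw_mask(uint32_t engine_mask)
{
	/*
	 * 'static const' lets the table live in read-only data, so the
	 * compiler does not need to emit code that fills a stack array
	 * on each call.
	 */
	static const uint32_t hw_engine_mask[NUM_ENGINES] = {
		[ENGINE_A] = 1u << 1,
		[ENGINE_B] = 1u << 2,
		[ENGINE_C] = 1u << 5,
	};
	uint32_t hw_mask = 0;
	size_t i;

	for (i = 0; i < NUM_ENGINES; i++)
		if (engine_mask & (1u << i))
			hw_mask |= hw_engine_mask[i];

	return hw_mask;
}
```

Whether this saves space depends on the compiler and optimisation level; the
figures quoted above are for the gcc version named there.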
Signed-off-by: Colin Ian King <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The latest documented version of the VBT is 229, but no further data has
been added to the child device definition in block 2. Update the child
device version test to eliminate the "Expected child device config size
for VBT version XXX not known; assuming 39" debug messages from the
logs.
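For illustration, a version-to-size check of this kind can be written as a
small table lookup. This is a sketch with loud assumptions: the struct, the
function name and every row except the last are hypothetical placeholders;
only the pairing of version 229 with a size of 39 is taken from the text.

```c
#include <stddef.h>

struct size_rule {
	unsigned int max_version;	/* inclusive upper bound */
	unsigned int size;		/* expected child device config size */
};

/* Rows other than the last are hypothetical placeholders. */
static const struct size_rule rules[] = {
	{ 199, 33 },	/* hypothetical */
	{ 215, 38 },	/* hypothetical */
	{ 229, 39 },	/* latest documented version per the text */
};

/* Returns the expected size, or 0 if the version is newer than any rule. */
static unsigned int expected_child_size(unsigned int version)
{
	size_t i;

	for (i = 0; i < sizeof(rules) / sizeof(rules[0]); i++)
		if (version <= rules[i].max_version)
			return rules[i].size;

	return 0;	/* unknown: the caller would log and pick a default */
}
```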
Bspec: 20124
Bspec: 20157
Signed-off-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Jani Nikula <[email protected]>
|
|
Following a pattern used throughout the driver.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Umesh Nerlige Ramappa <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Lost in the rebasing was Tvrtko's reminder that we need to keep an
uninterruptible wait around for the Ironlake VT-d w/a.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Signed-off-by: Joonas Lahtinen <[email protected]>
|
|
Pair the gmbus setup and teardown in the same layer. This also fixes the
double gmbus teardown on the i915_driver_modeset_probe() error path.
Move the gmbus setup a bit later in the sequence to make the follow-up
refactoring easier, and to pinpoint any unexpected consequences of this
change right here, instead of the later refactoring.
Reviewed-by: Ville Syrjälä <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Split out code related to vga switcheroo register/unregister and state
handling from i915_drv.c into new i915_switcheroo.[ch] files.
It's a bit difficult to draw the line how much to move to the new file
from i915_drv.c, but it seemed to me keeping i915_suspend_switcheroo()
and i915_resume_switcheroo() in place was the cleanest.
No functional changes.
Cc: Ville Syrjälä <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Rename the function per Ville's suggestion. No functional changes.
Cc: Ville Syrjälä <[email protected]>
Suggested-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ville Syrjälä <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Our other backends return an actual error value upon failure. Do the
same for stolen objects, which currently just return NULL on failure.
Signed-off-by: CQ Tang <[email protected]>
Signed-off-by: Matthew Auld <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
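Note: the error-value convention referred to above is the kernel's
ERR_PTR()/IS_ERR()/PTR_ERR() idiom. The sketch below re-implements those
helpers for userspace so that it stands alone; fake_create_stolen(), its
parameters and the errno values it picks are hypothetical and are not
taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Userspace re-implementations of the kernel helpers, for illustration. */
static inline void *ERR_PTR(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
    /* The top MAX_ERRNO addresses are reserved for encoded errors. */
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_object {
    size_t size;
};

/* Hypothetical constructor: reports why it failed instead of NULL. */
static struct fake_object *fake_create_stolen(size_t size, size_t available)
{
    struct fake_object *obj;

    if (size == 0)
        return ERR_PTR(-EINVAL);
    if (size > available)
        return ERR_PTR(-ENOSPC);

    obj = malloc(sizeof(*obj));
    if (!obj)
        return ERR_PTR(-ENOMEM);

    obj->size = size;
    return obj;
}
```

Callers then test the result with IS_ERR() and propagate PTR_ERR(),
rather than mapping a bare NULL onto a guessed error code.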
|
|
The current "disable C3+" workaround for the delayed vblank
irqs on i945gm no longer works. I'm not sure what changed, but
now I need to also disable C2. I also got my hands on an i915gm
machine that suffers from the same issue.
After some furious poking of registers I managed to find a
better workaround: The "Do not Turn off Core Render Clock in C
states" bit. With that I no longer have to disable any C-states,
and as a nice bonus the power cost is only ~1/4 of the
"disable C3+" method (which mind you doesn't even work anymore,
and so would have an even higher power cost if we made it work
by also disabling C2).
So let's throw out all the cpuidle/qos crap and just toggle
the magic bit as needed. And we extend the workaround to cover
i915gm as well.
Cc: Chris Wilson <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Acked-by: Chris Wilson <[email protected]>
|
|
We no longer need to placate lockdep by holding struct_mutex for our
initialisation, so don't.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We no longer need struct_mutex to serialise request emission, so remove
it from the gt selftests.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
struct_mutex provides no serialisation of the registers and data
structures being saved and restored across suspend/resume. It is
completely superfluous here.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Having a struct_mutex around the read of a BIOS blob serves no purpose.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
It protects nothing being accessed for the intel_framebuffer, so it's
its own locking had better be sufficient.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The overlay uses the modeset mutex to control itself and only required
the struct_mutex for requests, which is now obsolete.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Keep track of the GEM contexts underneath i915->gem.contexts and assign
them their own lock for the purposes of list management.
v2: Focus on lock tracking; ctx->vm is protected by ctx->mutex
v3: Correct split with removal of logical HW ID
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
With the introduction of ctx->engines[] we allow multiple logical
contexts to be used on the same engine (e.g. with virtual engines).
According to bspec, each logical context requires a unique tag in order
for context-switching to occur correctly between them. [Simple
experiments show that it is not so easy to trick the HW into performing
a lite-restore with matching logical IDs, though my memory from early
Broadwell experiments does suggest that it should be generating
lite-restores.]
We only need to keep a unique tag for the active lifetime of the
context, and for as long as we need to identify that context. The HW
uses the tag to determine if it should use a lite-restore (why not the
LRCA?) and passes the tag back for various status identifiers. The only
status we need to track is for OA, so when using perf, we assign the
specific context a unique tag.
v2: Calculate required number of tags to fill ELSP.
Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895
Signed-off-by: Chris Wilson <[email protected]>
Acked-by: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
As our global unpark/park keep track of the number of active users, we
can simply move the accounting from the GEM layer to the base GT layer.
It was placed originally inside GEM to benefit from the 100ms extra
delay on idleness, but that has been eliminated and now there is no
substantive difference between the layers. In moving it, we move another
piece of the puzzle out from underneath struct_mutex.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
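Note: the user-counting scheme mentioned above can be modelled as a
counter where only the first user unparks and only the last user parks.
This is a single-threaded sketch under that assumption; struct
fake_wakeref and the fake_* helpers are hypothetical names, and the real
code also has to deal with concurrency, which is omitted here.

```c
#include <assert.h>

/* Hypothetical model of a park/unpark user count. */
struct fake_wakeref {
    unsigned int count; /* number of active users */
    int unparks;        /* times the hardware was woken */
    int parks;          /* times the hardware was put to sleep */
};

static void fake_wakeref_get(struct fake_wakeref *wf)
{
    /* Only the transition 0 -> 1 wakes the hardware. */
    if (wf->count++ == 0)
        wf->unparks++;
}

static void fake_wakeref_put(struct fake_wakeref *wf)
{
    assert(wf->count > 0);

    /* Only the transition 1 -> 0 lets the hardware sleep. */
    if (--wf->count == 0)
        wf->parks++;
}
```

Because the counter already tracks every user, it does not matter which
layer owns it, which is what allows the accounting to move.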
|
|
Requests are run from the gt and are tied into the gt runtime power
management, so pull the runtime request management under gt/
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that we can retire without taking struct_mutex, we can do so to
handle shrinking the mmap-offset space after an allocation failure.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Matthew Auld <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
wait_for_timelines is essentially the same loop as retiring requests
(with an extra timeout), so merge the two into one routine.
v2: i915_retire_requests_timeout and keep VT-d w/a as !interruptible
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Nothing inside the idle worker now requires struct_mutex, so we can
remove the indirection of using our own worker.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We don't need to hold struct_mutex now for retiring requests, so drop it
from i915_retire_requests() and i915_gem_wait_for_idle(), finally
removing I915_WAIT_LOCKED for good.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that we no longer need to guarantee that the active callback is
under the struct_mutex, we can lift it out of the i915_gem_park() and
into the engine parking itself.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Forgo the struct_mutex serialisation for i915_active, and interpose its
own mutex handling for active/retire.
This is a multi-layered sleight-of-hand. First, we had to ensure that no
active/retire callbacks accidentally inverted the mutex ordering rules,
nor assumed that they were themselves serialised by struct_mutex. More
challenging though, is the rule over updating elements of the active
rbtree. Instead of the whole i915_active now being serialised by
struct_mutex, allocations/rotations of the tree are serialised by the
i915_active.mutex and individual nodes are serialised by the caller
using the i915_timeline.mutex (we need to use nested spinlocks to
interact with the dma_fence callback lists).
The pain point here is that instead of a single mutex around execbuf, we
now have to take a mutex for active tracker (one for each vma, context,
etc) and a couple of spinlocks for each fence update. The improvement in
fine grained locking allowing for multiple concurrent clients
(eventually!) should be worth it in typical loads.
v2: Add some comments that barely elucidate anything :(
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
As we need to use a mutex to serialise i915_active activation
(because we want to allow the callback to sleep), we need to push the
i915_active.retire into a worker callback in case we need to retire
from an atomic context.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
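Note: the deferral described above can be illustrated with a
single-threaded model. When the caller may not sleep, the retire is only
marked as pending and a later worker pass performs it. All names here
(struct fake_active, fake_active_retire(), fake_run_worker()) are
hypothetical; the kernel version uses a real workqueue, which this sketch
does not attempt to reproduce.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an activity tracker with a deferred retire. */
struct fake_active {
    int retired;       /* how many times the retire callback ran */
    bool work_pending; /* a retire is queued for the worker */
};

/* Stands in for a retire callback that is allowed to sleep. */
static void fake_do_retire(struct fake_active *ref)
{
    ref->retired++;
}

/* Retire now if sleeping is allowed, otherwise hand off to the worker. */
static void fake_active_retire(struct fake_active *ref, bool can_sleep)
{
    assert(ref);

    if (!can_sleep) {
        ref->work_pending = true;
        return;
    }

    fake_do_retire(ref);
}

/* Stands in for the worker running later in process context. */
static void fake_run_worker(struct fake_active *ref)
{
    if (ref->work_pending) {
        ref->work_pending = false;
        fake_do_retire(ref);
    }
}
```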
|
|
Replace the struct_mutex requirement for pinning the i915_vma with the
local vm->mutex instead. Note that the vm->mutex is tainted by the
shrinker (we require unbinding from inside fs-reclaim) and so we cannot
allocate while holding that mutex. Instead we have to preallocate
workers to allocate and apply the PTE updates after we have
reserved their slot in the drm_mm (using fences to order the PTE writes
with the GPU work and with later unbind).
In adding the asynchronous vma binding, one subtle requirement is to
avoid coupling the binding fence into the backing object->resv. That is,
the asynchronous binding only applies to the vma timeline itself and not
to the pages as that is a more global timeline (the binding of one vma
does not need to be ordered with another vma, nor does the implicit GEM
fencing depend on a vma, only on writes to the backing store). Keeping
the vma binding distinct from the backing store timelines is verified by
a number of async gem_exec_fence and gem_exec_schedule tests. The way we
do this is quite simple: we keep the fence for the vma binding separate
and only wait on it as required, and never add it to the obj->resv
itself.
Another consequence of reducing the locking around the vma is the
destruction of the vma is no longer globally serialised by struct_mutex.
A natural solution would be to add a kref to i915_vma, but that requires
decoupling the reference cycles, possibly by introducing a new
i915_mm_pages object that is owned by both obj->mm and vma->pages.
However, we have not taken that route due to the overshadowing lmem/ttm
discussions, and instead play a series of complicated games with
trylocks to (hopefully) ensure that only one destruction path is called!
v2: Add some commentary, and some helpers to reduce patch churn.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Since we cannot allocate underneath the vm->mutex (it is used in the
direct-reclaim paths), we need to shift the allocations off into a
mutexless worker with fence recursion prevention. To know when we need
this protection, we mark up the address spaces that do allocate before
insertion. In the future, we may wish to extend the async bind scheme to
more than just allocations.
v2: s/vm->bind_alloc/vm->bind_async_flags/
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The premise here is to simply avoid having to acquire the vm->mutex
inside vma create/destroy to update the vm->unbound_lists, to avoid some
nasty lock recursions later.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
A subset of 71724f708997 ("drm/mm: Use helpers for drm_mm_node booleans")
in order to prepare drm-intel-next-queued for subsequent patches before
we can backmerge 71724f708997 itself.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The L3 cache remapping is stored as u32 elements, and we should ensure
that the user only supplies complete slice information (u32).
Signed-off-by: Chris Wilson <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
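Note: a check of the kind described above could look like the sketch
below, which rejects any write whose offset or length is not a whole
number of u32 elements. The table size FAKE_L3_WORDS, the function name
and the bounds check are assumptions made for the example, not details
from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FAKE_L3_WORDS 32 /* hypothetical table size, in u32 elements */

/*
 * Copy user data into the remap table, accepting only complete u32s.
 * Returns the number of bytes written, or -EINVAL.
 */
static int fake_l3_write(uint32_t *remap, const void *buf,
                         size_t offset, size_t count)
{
    const size_t size = FAKE_L3_WORDS * sizeof(uint32_t);

    assert(remap && buf);

    /* Partial elements are rejected rather than silently truncated. */
    if (offset % sizeof(uint32_t) || count % sizeof(uint32_t))
        return -EINVAL;

    /* Illustrative bounds check for this standalone sketch. */
    if (offset > size || count > size - offset)
        return -EINVAL;

    memcpy((char *)remap + offset, buf, count);
    return (int)count;
}
```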
|
|
The CDCLK>=2*BCLK constraint applies to all generations since gen10.
Extend the constraint logic in audio get/put_power().
Signed-off-by: Kai Vehmanen <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
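Note: the CDCLK >= 2 * BCLK constraint above amounts to raising the
minimum CDCLK while audio holds a power reference. The sketch below
shows only that arithmetic; the function names are hypothetical and the
BCLK frequency is passed in as a parameter because the commit message
does not state it.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum CDCLK that satisfies CDCLK >= 2 * BCLK, in kHz. */
static unsigned int fake_audio_min_cdclk_khz(unsigned int bclk_khz)
{
    assert(bclk_khz > 0);
    return 2 * bclk_khz;
}

/*
 * Combine the display's own minimum with the audio constraint.
 * The audio floor applies only while audio holds a power reference.
 */
static unsigned int fake_required_cdclk_khz(unsigned int display_min_khz,
                                            bool audio_powered,
                                            unsigned int bclk_khz)
{
    unsigned int audio_min = 0;

    if (audio_powered)
        audio_min = fake_audio_min_cdclk_khz(bclk_khz);

    return display_min_khz > audio_min ? display_min_khz : audio_min;
}
```

For example, with an assumed BCLK of 96000 kHz and a display minimum of
172800 kHz, the required CDCLK is 192000 kHz while audio is powered and
172800 kHz otherwise.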
|
|
A straightforward conversion of assignment and checking of the boolean
state flags (allocated, scanned) into non-atomic bitops. The caller
remains responsible for all locking around the drm_mm and its nodes.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
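Note: the conversion described above replaces separate boolean members
with bits in a single flags word, read through helpers. The sketch below
is a userspace model: the fake_* bit helpers mimic non-atomic bitops,
and the struct and bit names are hypothetical rather than copied from
drm_mm.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical bit indices replacing two boolean members. */
enum {
    FAKE_NODE_ALLOCATED_BIT,
    FAKE_NODE_SCANNED_BIT,
};

struct fake_mm_node {
    unsigned long flags;
};

/* Non-atomic helpers: the caller is responsible for all locking. */
static void fake_set_bit(unsigned int nr, unsigned long *addr)
{
    assert(nr < sizeof(*addr) * CHAR_BIT);
    *addr |= 1UL << nr;
}

static void fake_clear_bit(unsigned int nr, unsigned long *addr)
{
    assert(nr < sizeof(*addr) * CHAR_BIT);
    *addr &= ~(1UL << nr);
}

static bool fake_test_bit(unsigned int nr, const unsigned long *addr)
{
    assert(nr < sizeof(*addr) * CHAR_BIT);
    return (*addr >> nr) & 1UL;
}

/* Accessors keep callers independent of how the state is stored. */
static bool fake_node_allocated(const struct fake_mm_node *node)
{
    return fake_test_bit(FAKE_NODE_ALLOCATED_BIT, &node->flags);
}

static bool fake_node_scanned(const struct fake_mm_node *node)
{
    return fake_test_bit(FAKE_NODE_SCANNED_BIT, &node->flags);
}
```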
|
|
In preparation for rearranging the booleans into a flags field, ensure
all the current users are using the inline helpers and not directly
accessing the members.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
On platforms with gen10+ display, the driver must set the enable bit of
AUDIO_PIN_BUF_CTL register before transactions with the HDA controller
can proceed. Add setting this bit to the audio power up sequence.
Failing to do this resulted in errors during display audio codec probe,
and failures during resume from suspend.
Note: We may also need to disable the bit afterwards, but there are
still unresolved issues with that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111214
Signed-off-by: Kai Vehmanen <[email protected]>
Signed-off-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
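Note: the power-up step described above is a read-modify-write that sets
one enable bit and leaves the rest of the register alone. The sketch
below models the register as a plain variable; the bit position in
FAKE_AUDIO_PIN_BUF_ENABLE, the struct and the function name are
assumptions for illustration and should be checked against the register
definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define FAKE_AUDIO_PIN_BUF_ENABLE (UINT32_C(1) << 31)

/* The register is modelled as memory instead of real MMIO. */
struct fake_regs {
    uint32_t audio_pin_buf_ctl;
};

/* Set the enable bit on gen10+ display, preserving the other bits. */
static void fake_audio_power_up(struct fake_regs *regs, int display_gen)
{
    assert(regs);

    if (display_gen >= 10)
        regs->audio_pin_buf_ctl |= FAKE_AUDIO_PIN_BUF_ENABLE;
}
```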
|