aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-04-27drm/i915/gt: Fix up clock frequencyChris Wilson4-6/+147
The bspec lists both the clock frequency and the effective interval. The interval corresponds to observed behaviour, so adjust the frequency to match. v2: Mika rightfully asked if we could measure the clock frequency from a selftest. Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Acked-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-27drm/i915/gt: Sanitize GT firstChris Wilson2-1/+5
We see that if the HW doesn't actually sleep, the HW may eat the poison we set in its write-only HWSP during sanitize: intel_gt_resume.part.8: 0000:00:02.0 __gt_unpark: 0000:00:02.0 gt_sanitize: 0000:00:02.0 force:yes process_csb: 0000:00:02.0 vcs0: cs-irq head=5, tail=90 process_csb: 0000:00:02.0 vcs0: csb[0]: status=0x5a5a5a5a:0x5a5a5a5a assert_pending_valid: Nothing pending for promotion! The CS TAIL pointer should have been reset by reset_csb_pointers(), so in this case it is likely that we have read back from the CPU cache and so we must clflush our control over that page. In doing so, push the sanitisation to the start of the GT sequence so that our poisoning is assuredly before we start talking to the HW. References: https://gitlab.freedesktop.org/drm/intel/-/issues/1794 Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-27drm/i915/gt: Check cacheline is valid before acquiringChris Wilson1-0/+2
The hwsp_cacheline pointer from i915_request is very, very flimsy. The i915_request.timeline (and the hwsp_cacheline) are lost upon retiring (after an RCU grace). Therefore we need to confirm that once we have the right pointer for the cacheline, it is not in the process of being retired and disposed of before we attempt to acquire a reference to the cacheline. <3>[ 547.208237] BUG: KASAN: use-after-free in active_debug_hint+0x6a/0x70 [i915] <3>[ 547.208366] Read of size 8 at addr ffff88822a0d2710 by task gem_exec_parall/2536 <4>[ 547.208547] CPU: 3 PID: 2536 Comm: gem_exec_parall Tainted: G U 5.7.0-rc2-ged7a286b5d02d-kasan_117+ #1 <4>[ 547.208556] Hardware name: Dell Inc. XPS 13 9350/, BIOS 1.4.12 11/30/2016 <4>[ 547.208564] Call Trace: <4>[ 547.208579] dump_stack+0x96/0xdb <4>[ 547.208707] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208719] print_address_description.constprop.6+0x16/0x310 <4>[ 547.208841] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208963] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.208975] __kasan_report+0x137/0x190 <4>[ 547.209106] ? active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209127] kasan_report+0x32/0x50 <4>[ 547.209257] ? i915_gemfs_fini+0x40/0x40 [i915] <4>[ 547.209376] active_debug_hint+0x6a/0x70 [i915] <4>[ 547.209389] debug_print_object+0xa7/0x220 <4>[ 547.209405] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.209426] debug_object_assert_init+0x297/0x430 <4>[ 547.209449] ? debug_object_free+0x360/0x360 <4>[ 547.209472] ? lock_acquire+0x1ac/0x8a0 <4>[ 547.209592] ? intel_timeline_read_hwsp+0x4f/0x840 [i915] <4>[ 547.209737] ? i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209861] i915_active_acquire_if_busy+0x66/0x120 [i915] <4>[ 547.209990] ? __live_alloc.isra.15+0xc0/0xc0 [i915] <4>[ 547.210005] ? rcu_read_lock_sched_held+0xd0/0xd0 <4>[ 547.210017] ? print_usage_bug+0x580/0x580 <4>[ 547.210153] intel_timeline_read_hwsp+0xbc/0x840 [i915] <4>[ 547.210284] __emit_semaphore_wait+0xd5/0x480 [i915] <4>[ 547.210415] ? i915_fence_get_timeline_name+0x110/0x110 [i915] <4>[ 547.210428] ? lockdep_hardirqs_on+0x348/0x5f0 <4>[ 547.210442] ? _raw_spin_unlock_irq+0x2a/0x40 <4>[ 547.210567] ? __await_execution.constprop.51+0x2e0/0x570 [i915] <4>[ 547.210706] i915_request_await_dma_fence+0x8f7/0xc70 [i915] Fixes: 85bedbf191e8 ("drm/i915/gt: Eliminate the trylock for reading a timeline's hwsp") Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: <[email protected]> # v5.6+ Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-27drm/i915/execlists: Check preempt-timeout target before submit_portsChris Wilson1-1/+1
We evaluate *active, which is a pointer into execlists->inflight[] during dequeue to decide how long a preempt-timeout we need to apply. However, as soon as we do the submit_ports, the HW may send its ACK interrupt causing us to promote execlists->pending[] tp execlists->inflight[], overwriting the value of *active. We know *active is only stable until we submit (as we only submit when there is no pending promotion). [ 16.102328] BUG: KCSAN: data-race in execlists_dequeue+0x1449/0x1600 [i915] [ 16.102356] [ 16.102375] race at unknown origin, with read to 0xffff8881e9500488 of 8 bytes by task 429 on cpu 1: [ 16.102780] execlists_dequeue+0x1449/0x1600 [i915] [ 16.103160] __execlists_submission_tasklet+0x48/0x60 [i915] [ 16.103540] execlists_submit_request+0x38e/0x3c0 [i915] [ 16.103940] submit_notify+0x8f/0xc0 [i915] [ 16.104308] __i915_sw_fence_complete+0x61/0x420 [i915] [ 16.104683] i915_sw_fence_complete+0x58/0x80 [i915] [ 16.105054] i915_sw_fence_commit+0x16/0x20 [i915] [ 16.105457] __i915_request_queue+0x60/0x70 [i915] [ 16.105843] i915_gem_do_execbuffer+0x2d6b/0x4230 [i915] [ 16.106227] i915_gem_execbuffer2_ioctl+0x2b0/0x580 [i915] [ 16.106257] drm_ioctl_kernel+0xe9/0x130 [ 16.106279] drm_ioctl+0x27d/0x45e [ 16.106311] ksys_ioctl+0x89/0xb0 [ 16.106336] __x64_sys_ioctl+0x42/0x60 [ 16.106370] do_syscall_64+0x6e/0x2c0 [ 16.106397] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-27drm/i915: re-disable -Wframe-addressNick Desaulniers1-0/+1
The top level Makefile disables this warning. When building an i386_defconfig with Clang, this warning is triggered a whole bunch via includes of headers from perf. Link: https://github.com/ClangBuiltLinux/continuous-integration/pull/182 Signed-off-by: Nick Desaulniers <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-25drm/i915: Use indirect ctx bb to mend CMD_BUF_CCTLMika Kuoppala3-3/+115
Use indirect ctx bb to load cmd buffer control value from context image to avoid corruption. v2: add to lrc layout (Chris) v3: end to a cacheline (Chris) v4: add to lrc fixed (Chris) v5: value in offset+1 Testcase: igt/i915_selftest/gt_lrc Signed-off-by: Mika Kuoppala <[email protected]> Acked-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-25drm/i915: Add live selftests for indirect ctx batchbuffersMika Kuoppala1-0/+150
Indirect ctx batchbuffers are a hw feature of which batch can be run, by hardware, during context restoration stage. Driver can setup a batchbuffer and also an offset into the context image. When context image is marshalled from memory to registers, and when the offset from the start of context register state is equal of what driver pre-determined, batch will run. So one can manipulate context restoration process at cacheline granularity, given some limitations, as you need to have rudimentaries in place before you can run a batch. Add selftest which will write the ring start register to a canary spot. This will test that hardware will run a batchbuffer for the context in question. v2: request wait fix, naming (Chris) v3: test order (Chris) v4: rebase Signed-off-by: Mika Kuoppala <[email protected]> Acked-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-25drm/i915: Add per ctx batchbuffer wa for timestampMika Kuoppala3-13/+125
Restoration of a previous timestamp can collide with updating the timestamp, causing a value corruption. Combat this issue by using indirect ctx bb to modify the context image during restoring process. We can preload value into scratch register. From which we then do the actual write with LRR. LRR is faster and thus less error prone as probability of race drops. v2: tidying (Chris) v3: lrr for all engines v4: grp v5: reg bit v6: wa_bb_offset, virtual engines (Chris) References: HSDES#16010904313 Testcase: igt/i915_selftest/gt_lrc Suggested-by: Joseph Koston <[email protected]> Cc: Chris Wilson <[email protected]> Signed-off-by: Mika Kuoppala <[email protected]> Acked-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-25drm/i915: Add engine scratch register to live_lrc_fixedMika Kuoppala2-0/+17
General purpose registers are per engine and in a fixed location. Add to live_lrc_fixed. Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915: Drop rq->ring->vma peeking from error captureChris Wilson2-4/+2
We only hold the active spinlock while dumping the error state, and this does not prevent another thread from retiring the request -- as it is quite possible that despite us capturing the current state, the GPU has completed the request. As such, it is dangerous to dereference state below the request as it may already be freed, and the simplest way to avoid the danger is not include it in the error state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1788 Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Cc: Andi Shyti <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915/gt: Use the RPM config register to determine clk frequenciesChris Wilson11-75/+229
For many configuration details within RC6 and RPS we are programming intervals for the internal clocks. From gen11, these clocks are configuration via the RPM_CONFIG and so for convenience, we would like to convert to/from more natural units (ns). Signed-off-by: Chris Wilson <[email protected]> Cc: Andi Shyti <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915/gt: Trace RPS eventsChris Wilson1-4/+44
Add tracek to the RPS events (interrupts, worker, enabling, threshold selection, frequency setting), so that if we have to debug reticent HW we have some traces to start from. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915/gt: Prefer soft-rc6 over RPS DOWN_TIMEOUTChris Wilson1-23/+18
The RPS DOWN_TIMEOUT interrupt is signaled after a period of rc6, and upon receipt of that interrupt we reprogram the GPU clocks down to the next idle notch [to help convserve power during rc6]. However, on execlists, we benefit from soft-rc6 immediately parking the GPU and setting idle frequencies upon idling [within a jiffie], and here the interrupt prevents us from restarting from our last frequency. In the process, we can simply opt for a static pm_events mask and rely on the enable/disable interrupts to flush the worker on parking. This will reduce the amount of oscillation observed during steady workloads with microsleeps, as each time the rc6 timeout occurs we immediately follow with a waitboost for a dropped frame. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915: Split some long linesVille Syrjälä1-2/+8
Split some overly long lines. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: José Roberto de Souza <[email protected]>
2020-04-24drm/i915: Introduce .set_idle_link_train() vfuncVille Syrjälä3-27/+32
Relocate a bunch of DDI specific code from intel_dp.c to intel_ddi.c by introducing a .set_idle_link_train() vfunc. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: José Roberto de Souza <[email protected]>
2020-04-24drm/i915: Introduce .set_signal_levels() vfuncVille Syrjälä3-77/+149
Sort out some of the mess between intel_ddi.c intel_dp.c by introducing a .set_signal_levels() vfunc. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: José Roberto de Souza <[email protected]>
2020-04-24drm/i915: Introduce .set_link_train() vfuncVille Syrjälä3-81/+104
Sort out some of the mess between intel_ddi.c intel_dp.c by introducing a .set_link_train() vfunc. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: José Roberto de Souza <[email protected]>
2020-04-24drm/i915: Have pfit calculations return an error codeVille Syrjälä7-38/+49
Change intel_{gmch,pch}_panel_fitting() to return a normal error vs. success int. We'll need this later to validate that the margin properties aren't misconfigured. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Manasi Navare <[email protected]>
2020-04-24drm/i915: Pass connector state to pfit calculationsVille Syrjälä7-33/+31
Pass the entire connector state to intel_{gmch,pch}_panel_fitting(). For now we just need to get at .scaling_mode but in the future we'll want access to the margin properties as well. v2: Deal with intel_dp_ycbcr420_config() Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Manasi Navare <[email protected]>
2020-04-24drm/i915: s/pipe_config/crtc_state/ in pfit functionsVille Syrjälä7-65/+58
Follow the new naming convention and call the crtc state "crtc_state", and while at it drop the redundant crtc argument. Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Manasi Navare <[email protected]>
2020-04-24drm/i915: Use drm_rect to store the pfit window pos/sizeVille Syrjälä3-50/+67
Make things a bit more abstract by replacing the pch_pfit.pos/size raw register values with a drm_rect. Makes it slighly more convenient to eg. compute the scaling factors. v2: Use drm_rect_init() Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Manasi Navare <[email protected]>
2020-04-24drm/i915: Flatten a bunch of the pfit functionsVille Syrjälä1-120/+117
Most of the pfit functions are of the form: func() { if (pfit_enabled) { ... } } Flip the pfit_enabled check around to flatten the functions. And while we're touching all this let's do the usual s/pipe_config/crtc_state/ replacement. Reviewed-by: Manasi Navare <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915: Fix skl+ non-scaled pfit modesVille Syrjälä4-46/+19
Fix skl_update_scaler_crtc() to deal with different scaling modes correctly. The current implementation assumes DRM_MODE_SCALE_FULLSCREEN. Fortunately we don't expose any border properties currently so the code does actually end up doing the right thing (assigning a scaler for pfit). The code does need to be fixed before any borders are exposed. Also we have redundant calls to skl_update_scaler_crtc() in dp/hdmi .compute_config() which can be nuked. They were anyway called before we had even computed the pfit state so were basically nonsense. The real call we need to keep is in intel_crtc_atomic_check(). v2: Deal witrh skl_update_scaler_crtc() in intel_dp_ycbcr420_config() Reviewed-by: Manasi Navare <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915: Only close vma we openChris Wilson9-107/+56
The history of i915_vma_close() is confusing, as is its use. As the lifetime of the i915_vma is currently bounded by the object it is attached to, we needed a means of identify when a vma was no longer in use by userspace (via the user's fd). This is further complicated by that only ppgtt vma should be closed at the user's behest, as the ggtt were always shared. Now that we attach the vma to a lut on the user's context, the open count does indicate how many unique and open context/vm are referencing this vma from the user. As such, we can and should just use the open_count to track when the vma is still in use by userspace. It's a poor man's replacement for reference counting. Closes: https://gitlab.freedesktop.org/drm/intel/issues/1193 Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915: Make define for lrc state offsetMika Kuoppala5-14/+14
More often than not, we need a byte offset into lrc register state from the start of the hw state. Make it so. Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-24drm/i915/selftests: Add context batchbuffers registers to live_lrc_fixedMika Kuoppala3-60/+89
Add per ctx bb and indirect ctx bb register locations to live_lrc_fixed for verification. Signed-off-by: Mika Kuoppala <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-23drm/i915/selftests: Add request throughput measurement to perfChris Wilson2-1/+580
Under ideal circumstances, the driver should be able to keep the GPU fully saturated with work. Measure how close to ideal we get under the harshest of conditions with no user payload. v2: Also measure throughput using only one thread. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-23drm/i915/gt: Check carefully for an idle engine in wait-for-idleChris Wilson1-3/+12
intel_gt_wait_for_idle() tries to wait until all the outstanding requests are retired and the GPU is idle. As a side effect of retiring requests, we may submit more work to flush any pm barriers, and so the wait-for-idle tries to flush the background pm work and catch the new requests. However, if the work completed in the background before we were able to flush, it would queue the extra barrier request without us noticing -- and so we would return from wait-for-idle with one request remaining. (This breaks e.g. record_default_state where we need to wait until that barrier is retired, and it may slow suspend down by causing us to wait on the background retirement worker as opposed to immediately retiring the barrier.) However, since we track if there has been a submission since the engine pm barrier, we can very quickly detect if the idle barrier is still outstanding. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1763 Signed-off-by: Chris Wilson <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-23drm/i915/gt: Carefully order virtual_submission_taskletChris Wilson1-2/+5
During the virtual engine's submission tasklet, we take the request and insert into the submission queue on each of our siblings. This seems quite simply, and so no problems with ordering. However, the sibling execlists' submission tasklets may run concurrently with the virtual engine's tasklet, submitting the request to HW before the virtual finishes its task of telling all the siblings. If this happens, the sibling tasklet may *reorder* the ve->sibling[] array that the virtual engine tasklet is processing. This can *only* reorder within the elements already processed by the virtual engine, nevertheless the race is detected by KCSAN: [ 185.580014] BUG: KCSAN: data-race in execlists_dequeue [i915] / virtual_submission_tasklet [i915] [ 185.580054] [ 185.580076] write to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 2: [ 185.580553] execlists_dequeue+0x6ad/0x1600 [i915] [ 185.581044] __execlists_submission_tasklet+0x48/0x60 [i915] [ 185.581517] execlists_submission_tasklet+0xd3/0x170 [i915] [ 185.581554] tasklet_action_common.isra.0+0x42/0x90 [ 185.581585] __do_softirq+0xc8/0x206 [ 185.581613] run_ksoftirqd+0x15/0x20 [ 185.581641] smpboot_thread_fn+0x15a/0x270 [ 185.581669] kthread+0x19a/0x1e0 [ 185.581695] ret_from_fork+0x1f/0x30 [ 185.581717] [ 185.581736] read to 0xffff8881f1919860 of 8 bytes by interrupt on cpu 0: [ 185.582231] virtual_submission_tasklet+0x10e/0x5c0 [i915] [ 185.582265] tasklet_action_common.isra.0+0x42/0x90 [ 185.582291] __do_softirq+0xc8/0x206 [ 185.582315] run_ksoftirqd+0x15/0x20 [ 185.582340] smpboot_thread_fn+0x15a/0x270 [ 185.582368] kthread+0x19a/0x1e0 [ 185.582395] ret_from_fork+0x1f/0x30 [ 185.582417] We can prevent this race by checking for the ve->request after looking up the sibling array. Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-23drm/i915/icl: Fix timeout handling during TypeC AUX power well enablingImre Deak1-49/+25
Fix the check for when an AUX power well enabling timeout is expected on a legacy TypeC port. Fixes: 89e01caac641 ("drm/i915: Use single set of AUX powerwell ops for gen11+") Cc: Matt Roper <[email protected]> Cc: José Roberto de Souza <[email protected]> Signed-off-by: Imre Deak <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-22drm/i915/execlists: Drop request-before-CS assertionChris Wilson1-26/+7
When we migrated to execlists, one of the conditions we wanted to test for was whether the breadcrumb seqno was being written before the breadcumb interrupt was delivered. This was following on from issues observed on previous generations which were not so strongly ordered. With the removal of the missed interrupt detection, we have not reliable means of detecting the out-of-order seqno/interrupt but instead tried to assert that the relationship between the CS event interrupt and the breadwrite should be strongly ordered. However, Icelake proves it is possible for the HW implementation to forget about minor little details such as write ordering and so the order between *processing* the CS event and the breadcrumb is unreliable. Remove the unreliable assertion, but leave a debug telltale in case we have reason to suspect. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658 Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-22drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma()Chris Wilson2-6/+24
While the ggtt vma are protected by their object lifetime, the list continues until it hits a non-ggtt vma, and that vma is not protected and may be freed as we inspect it. Hence, we require the obj->vma.lock to protect the list as we iterate. An example of forgetting to hold the obj->vma.lock is [1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI [1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1 [1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017 [1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915] [1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10 [1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282 [1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000 [1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578 [1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000 [1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8 [1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00 [1642834.465034] FS: 00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000 [1642834.465036] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0 [1642834.465038] Call Trace: [1642834.465083] i915_gem_set_tiling_ioctl+0x122/0x230 [i915] [1642834.465121] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465151] drm_ioctl_kernel+0x86/0xd0 [drm] [1642834.465156] ? avc_has_perm+0x3b/0x160 [1642834.465178] drm_ioctl+0x206/0x390 [drm] [1642834.465216] ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915] [1642834.465221] ? selinux_file_ioctl+0x122/0x1c0 [1642834.465226] ? __do_munmap+0x24b/0x4d0 [1642834.465231] ksys_ioctl+0x82/0xc0 [1642834.465235] __x64_sys_ioctl+0x16/0x20 [1642834.465238] do_syscall_64+0x5b/0xf0 [1642834.465243] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [1642834.465245] RIP: 0033:0x7f6a3d7b047b [1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48 [1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b [1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e [1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002 [1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080 [1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30 Now to take the spinlock during the list iteration, we need to break it down into two phases. In the first phase under the lock, we cannot sleep and so must defer the actual work to a second list, protected by the ggtt->mutex. We also need to hold the spinlock during creation of a new vma to serialise with updates of the tiling on the object. Reported-by: Dave Airlie <[email protected]> Fixes: 2850748ef876 ("drm/i915: Pull i915_vma_pin under the vm->mutex") Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Dave Airlie <[email protected]> Cc: <[email protected]> # v5.5+ Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-22drm/i915/selftests: Try to detect rollback during batchbuffer preemptionChris Wilson1-1/+328
Since batch buffers dominant execution time, most preemption requests should naturally occur during execution of a batch buffer. We wish to verify that should a preemption occur within a batch buffer, when we come to restart that batch buffer, it occurs at the interrupted instruction and most importantly does not rollback to an earlier point. v2: Do not clear the GPR at the start of the batch, but rely on them being clear for new contexts. Suggested-by: Mika Kuoppala <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-22drm/i915/selftests: Disable heartbeat around RPS interrupt testingChris Wilson1-4/+55
For verifying reciving the EI interrupts, we need to hold the GPU in very precise conditions (in terms of C0 cycles during the EI). If we preempt the busy load to handle the heartbeat, this may perturb the busy load causing us to miss the interrupt. The other tests, while not as time sensitive, may also be slightly perturbed, so apply the heartbeat protection across all the measurements. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/selftests: Unroll the CS frequency loopChris Wilson1-13/+20
Having noticed that MI_BB_START is incurring a memory stall (see the correlation with uncore frequency), we have to unroll the loop in order to diminish the impact of the MI_BB_START on the instruction throughput. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/gt: Poison residual state [HWSP] across resume.Chris Wilson3-2/+40
Since we may lose the content of any buffer when we relinquish control of the system (e.g. suspend/resume), we have to be careful not to rely on regaining control. A good method to detect when we might be using garbage is by always injecting that garbage prior to first use on load/resume/etc. v2: Drop sanitize callback on cleanup v3: Move seqno reset to timeline enter, so we reset all timelines. However, this is done on every activation during runtime and not reset. The similar level of paranoia we apply to correcting context state after a period of inactivity. Suggested-by: Tvrtko Ursulin <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Venkata Ramana Nayana <[email protected]> Cc: Daniele Ceraolo Spurio <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/selftests: Disable C-states when measuring RPS frequency responseChris Wilson1-0/+18
Let's isolate the impact of cpu frequency selection on determing the GPU throughput in response to selection of RPS frequencies. For real systems, we do have to be concerned with the impact of integrating c-states, p-states and rp-states, but for the sake of proving whether or not RPS works, one baby step at a time. For the record, as one would hope, it does not seem to impact on the measured performance, but we do it anyway to reduce the number of variables. Later, we can extend the testing to encourage the the cpu/pkg to try and sleep while the GPU is busy. Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/selftests: Show the full scaling curve on failureChris Wilson1-0/+40
If we detect that the RPS end points do not scale perfectly, take the time to measure all the in between values as well. We are aborting the test, so we might as well spend the available time gathering critical debug information instead. Signed-off-by: Chris Wilson <[email protected]> Cc: Mika Kuoppala <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/selftests: Show the pstate limits on any failure to reset minChris Wilson1-0/+2
We want to see the pstate limits whenever we fail to set the minimum frequency as that may help for debugging. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Mika Kuoppala <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/vlv_dsi: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-1/+1
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/overlay: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-3/+3
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/global_state: Prefer drm_WARN* over WARN*Pankaj Bharadiya1-2/+2
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* calls. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/frontbuffer: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-2/+4
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/dpll_mgr: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-4/+4
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/display: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-10/+14
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON at places where struct drm_device pointer can be extracted. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/ddi: Prefer drm_WARN* over WARN*Pankaj Bharadiya1-6/+8
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN* over WARN* calls. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/atomic_plane: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-2/+2
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/display/icl_dsi: Prefer drm_WARN_ON over WARN_ONPankaj Bharadiya1-1/+1
struct drm_device specific drm_WARN* macros include device information in the backtrace, so we know what device the warnings originate from. Prefer drm_WARN_ON over WARN_ON. Signed-off-by: Pankaj Bharadiya <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915/audio: fix compressed_bpp checkJani Nikula1-10/+5
The early check for compressed_bpp being zero is too early, as it is hit also when DSC is not enabled. Move the checks down to where the values are actually needed. This is a paranoid check for a situation that should not happen, so we don't really care about handling it gracefully apart from not oopsing. Fixes: 48b8b04c791d ("drm/i915/display: Enable DP Display Audio WA") Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1750 Cc: Anshuman Gupta <[email protected]> Cc: Uma Shankar <[email protected]> Reviewed-by: Uma Shankar <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2020-04-21drm/i915: drop a bunch of superfluous inlinesJani Nikula14-67/+58
Remove a number of inlines from .c files, and let the compiler decide what's best. There's more to do, but need to start somewhere, and need to start setting the example. Acked-by: Ville Syrjälä <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]