Age | Commit message (Collapse) | Author | Files | Lines |
|
This should be done before the soft min/max frequencies are restored.
When we disable the "Ignore efficient frequency" flag, GuC does not
actually bring the requested freq down to RPn.
Specifically, this scenario-
- ignore efficient freq set to true
- reduce min to RPn (from efficient)
- suspend
- resume (includes GuC load, restore soft min/max, restore efficient freq)
- validate min freq has been resored to RPn
This will fail if we didn't first restore(disable, in this case) efficient
freq flag before setting the soft min frequency.
v2: Bring the min freq down to RPn when we disable efficient freq (Rodrigo)
Also made the change to set the min softlimit to RPn at init. Otherwise, we
were storing RPe there.
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8736
Fixes: 55f9720dbf23 ("drm/i915/guc/slpc: Provide sysfs for efficient freq")
Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")
Signed-off-by: Vinay Belgaumkar <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The inclusion of intel_gt_defines.h was initially added to
i915_drv.h to provide the definition of I915_MAX_GT, where it was
originally defined.
However, since I915_MAX_GT is now included in
i915_gem_object_types.h, it sis no longer required in i915_drv.h.
Signed-off-by: Andi Shyti <[email protected]>
Cc: Chris Wilson <[email protected]>
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
With multi-GT devices, the object may have been bound on each GT.
Invalidate the TLBs across all GT before releasing the pages
back to the system.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Fei Yang <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Create a new intel_gt_defines.h inside the gt/ directory as a
placeholder for all the generic GT based defines.
As of now place only I915_MAX_GT.
Co-developed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Reviewed-by: Mauro Carvalho Chehab <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Prepare for supporting more TLB invalidation scenarios by moving
the current MMIO invalidation to its own file.
Signed-off-by: Chris Wilson <[email protected]>
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Fix the following make htmldocs errors/warnings:
./drivers/gpu/drm/i915/gt/uc/intel_huc.c:29: ERROR: Unexpected indentation.
./drivers/gpu/drm/i915/gt/uc/intel_huc.c:30: WARNING: Block quote ends without a blank line; unexpected unindent.
./drivers/gpu/drm/i915/gt/uc/intel_huc.c:35: WARNING: Bullet list ends without a blank line; unexpected unindent.
This output is a bit misleading. The real issue here is we need a blank
line before and after the bulleted list.
Link: https://www.kernel.org/doc/html/latest/gpu/i915.html#huc
Link: https://lore.kernel.org/dri-devel/[email protected]/
Signed-off-by: David Reaver <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Infinite waits for completion of GPU activity have been observed in CI,
mostly inside __i915_active_wait(), triggered by igt@gem_barrier_race or
igt@perf@stress-open-close. Root cause analysis, based of ftrace dumps
generated with a lot of extra trace_printk() calls added to the code,
revealed loops of request dependencies being accidentally built,
preventing the requests from being processed, each waiting for completion
of another one's activity.
After we substitute a new request for a last active one tracked on a
timeline, we set up a dependency of our new request to wait on completion
of current activity of that previous one. While doing that, we must take
care of keeping the old request still in memory until we use its
attributes for setting up that await dependency, or we can happen to set
up the await dependency on an unrelated request that already reuses the
memory previously allocated to the old one, already released. Combined
with perf adding consecutive kernel context remote requests to different
user context timelines, unresolvable loops of await dependencies can be
built, leading do infinite waits.
We obtain a pointer to the previous request to wait upon when we
substitute it with a pointer to our new request in an active tracker,
e.g. in intel_timeline.last_request. In some processing paths we protect
that old request from being freed before we use it by getting a reference
to it under RCU protection, but in others, e.g. __i915_request_commit()
-> __i915_request_add_to_timeline() -> __i915_request_ensure_ordering(),
we don't. But anyway, since the requests' memory is SLAB_FAILSAFE_BY_RCU,
that RCU protection is not sufficient against reuse of memory.
We could protect i915_request's memory from being prematurely reused by
calling its release function via call_rcu() and using rcu_read_lock()
consequently, as proposed in v1. However, that approach leads to
significant (up to 10 times) increase of SLAB utilization by i915_request
SLAB cache. Another potential approach is to take a reference to the
previous active fence.
When updating an active fence tracker, we first lock the new fence,
substitute a pointer of the current active fence with the new one, then we
lock the substituted fence. With this approach, there is a time window
after the substitution and before the lock when the request can be
concurrently released by an interrupt handler and its memory reused, then
we may happen to lock and return a new, unrelated request.
Always get a reference to the current active fence first, before
replacing it with a new one. Having it protected from premature release
and reuse, lock it and then replace with the new one but only if not
yet signalled via a potential concurrent interrupt nor replaced with
another one by a potential concurrent thread, otherwise retry, starting
from getting a reference to the new current one. Adjust users to not
get a reference to the previous active fence themselves and always put the
reference got by __i915_active_fence_set() when no longer needed.
v3: Fix lockdep splat reports and other issues caused by incorrect use of
try_cmpxchg() (use (cmpxchg() != prev) instead)
v2: Protect request's memory by getting a reference to it in favor of
delegating its release to call_rcu() (Chris)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8211
Fixes: df9f85d8582e ("drm/i915: Serialise i915_active_fence_set() with itself")
Suggested-by: Chris Wilson <[email protected]>
Signed-off-by: Janusz Krzysztofik <[email protected]>
Cc: <[email protected]> # v5.6+
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
On MTL, if the GSC Proxy init flows haven't completed, submissions to the
GSC engine will fail. Those init flows are dependent on the mei's
gsc_proxy component that is loaded in parallel with i915 and a
worker that could potentially start after i915 driver init is done.
That said, all subsytems that access the GSC engine today does check
for such init flow completion before using the GSC engine. However,
selftests currently don't wait on anything before starting.
To fix this, add a waiter function at the start of __run_selftests
that waits for gsc-proxy init flows to complete. Selftests shouldn't
care if the proxy-init failed as that should be flagged elsewhere.
Difference from prior versions:
v7: - Change the fw status to INTEL_UC_FIRMWARE_LOAD_FAIL if the
proxy-init fails so that intel_gsc_uc_fw_proxy_get_status
catches it. (Daniele)
v6: - Add a helper that returns something more than a boolean
so we selftest can stop waiting if proxy-init hadn't
completed but failed (Daniele).
v5: - Move the call to __wait_gsc_proxy_completed from common
__run_selftests dispatcher to the group-level selftest
function (Trvtko).
- change the pr_info to pr_warn if we hit the timeout.
v4: - Remove generalized waiters function table framework (Tvrtko).
- Remove mention of CI-framework-timeout from comments (Tvrtko).
v3: - Rebase to latest drm-tip.
v2: - Based on internal testing, increase the timeout for gsc-proxy
specific case to 8 seconds.
Signed-off-by: Alan Previn <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Perform some refactoring with the purpose of keeping in one
single place all the operations around the aux table
invalidation.
With this refactoring add more engines where the invalidation
should be performed.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Andi Shyti <[email protected]>
Cc: Jonathan Cavitt <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
For platforms that use Aux CCS, wait for aux invalidation to
complete by checking the aux invalidation register bit is
cleared.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Signed-off-by: Jonathan Cavitt <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Nirmoy Das <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Enable the CCS_FLUSH bit 13 in the control pipe for render and
compute engines in platforms starting from Meteor Lake (BSPEC
43904 and 47112).
For the copy engine add MI_FLUSH_DW_CCS (bit 16) in the command
streamer.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Requires: 8da173db894a ("drm/i915/gt: Rename flags with bit_group_X according to the datasheet")
Signed-off-by: Andi Shyti <[email protected]>
Cc: Jonathan Cavitt <[email protected]>
Cc: Nirmoy Das <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Matt Roper <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
In preparation of the next patch align with the datasheet (BSPEC
47112) with the naming of the pipe control set of flag values.
The variable "flags" in gen12_emit_flush_rcs() is applied as a
set of flags called Bit Group 1.
Define also the Bit Group 0 as bit_group_0 where currently only
PIPE_CONTROL0_HDC_PIPELINE_FLUSH bit is set.
Signed-off-by: Andi Shyti <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Matt Roper <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
All memory traffic must be quiesced before requesting
an aux invalidation on platforms that use Aux CCS.
Fixes: 972282c4cf24 ("drm/i915/gen12: Add aux table invalidate for all engines")
Requires: a2a4aa0eef3b ("drm/i915: Add the gen12_needs_ccs_aux_inv helper")
Signed-off-by: Jonathan Cavitt <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Nirmoy Das <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We always assumed that a device might either have AUX or FLAT
CCS, but this is an approximation that is not always true, e.g.
PVC represents an exception.
Set the basis for future finer selection by implementing a
boolean gen12_needs_ccs_aux_inv() function that tells whether aux
invalidation is needed or not.
Currently PVC is the only exception to the above mentioned rule.
Requires: 059ae7ae2a1c ("drm/i915/gt: Cleanup aux invalidation registers")
Signed-off-by: Andi Shyti <[email protected]>
Cc: Matt Roper <[email protected]>
Cc: Jonathan Cavitt <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Matt Roper <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Fix the 'NV' definition postfix that is supposed to be INV.
Take the chance to also order properly the registers based on
their address and call the GEN12_GFX_CCS_AUX_INV address as
GEN12_CCS_AUX_INV like all the other similar registers.
Remove also VD1, VD3 and VE1 registers that don't exist and add
BCS0 and CCS0.
Signed-off-by: Andi Shyti <[email protected]>
Cc: <[email protected]> # v5.8+
Reviewed-by: Nirmoy Das <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
We can use the existing helper in flush_write_domain() and save some lines
of code.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Quite surprised to see that around i915 there are still i915->gt0
references. Replace them with the to_gt() helper.
Signed-off-by: Andi Shyti <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
All error handling paths go to 'out', except this one. Be consistent and
also branch to 'out' here.
Fixes: c10a652e239e ("drm/i915/selftests: Rework context handling in hugepages selftests")
Signed-off-by: Christophe JAILLET <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/7a036b88671312ee9adc01c74ef5b3376f690b76.1689619758.git.christophe.jaillet@wanadoo.fr
|
|
i915_request contains direct alias to i915, there is no point to go via
rq->engine->i915.
v2: added missing rq.i915 initialization in measure_breadcrumb_dw.
Signed-off-by: Andrzej Hajda <[email protected]>
Reviewed-by: Tvrtko Ursulin <[email protected]>
Acked-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Due to a change in the auth flow on MTL, GuC 70.7.0 and newer will only
be able to authenticate HuC 8.5.1 and newer. The plan is to update the 2
binaries synchronously in linux-firmware so that the fw repo always has
a matching pair that works; still, it's better to check in the kernel so
we can print an error message and abort HuC loading if the binaries are
out of sync instead of failing the authentication.
v2: Add clarification comment, fix typo in commit msg, clean up variable
declaration (John)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]> #v1
Reviewed-by: John Harrison <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
User feedback indicates significant performance gains are possible in
specific games with non default RPS up/down thresholds.
Expose these tunables via sysfs which will allow users to achieve best
performance when running games and best power efficiency elsewhere.
Note this patch supports non GuC based platforms only.
v2:
* Make checkpatch happy.
Signed-off-by: Tvrtko Ursulin <[email protected]>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/8389
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
In preparation for exposing via sysfs add helpers for managing rps
thresholds.
v2:
* Force sw and hw re-programming on threshold change.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Record the default values as preparation for exposing the sysfs controls.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Since 36d516be867c ("drm/i915/gt: Switch to manual evaluation of RPS")
thresholds are invariant so lets move their setting to init time.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Use smem on MTL due to a HW bug in MTL that prevents
reading from stolen memory using LMEM BAR.
Cc: Oak Zeng <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: Andrzej Hajda <[email protected]>
Signed-off-by: Nirmoy Das <[email protected]>
Reviewed-by: Oak Zeng <[email protected]>
Reviewed-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
As context structure is shared memory for CPU/GPU, Wa_22016122933 is
needed for this memory block as well.
Signed-off-by: Zhanjun Dong <[email protected]>
CC: Fei Yang <[email protected]>
Reviewed-by: Fei Yang <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Arrays passed to reg_in_range_table should end with empty record.
The patch solves KASAN detected bug with signature:
BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
Read of size 4 at addr ffffffffa1555d90 by task perf/1518
CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1
Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023
Call Trace:
<TASK>
...
xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
Fixes: 0fa9349dda03 ("drm/i915/perf: complete programming whitelisting for XEHPSDV")
Signed-off-by: Andrzej Hajda <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Commit a4d86249c773 ("drm/i915/gt: Provide a utility to create a scratch
buffer") mistakenly passed in uapi I915_CACHING_CACHED as argument to
i915_gem_object_set_cache_coherency(), which actually takes internal
enum i915_cache_level.
No functional issue since the value matches I915_CACHE_LLC (1 == 1), which
is the intended caching mode, but lets clean it up nevertheless.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Fixes: a4d86249c773 ("drm/i915/gt: Provide a utility to create a scratch buffer")
Cc: Daniele Ceraolo Spurio <[email protected]>
Reviewed-by: Tejas Upadhyay <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Commit 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
added a dedicated gen12_pte_encode but forgot to remove the Gen12 specific
bit from gen8_pte_encode.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Fixes: 9275277d5324 ("drm/i915: use pat_index instead of cache_level")
Cc: Fei Yang <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: Matt Roper <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Fei Yang <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
For reports that are not powers of 2, reports at the end of the OA
buffer may get split across the buffer boundary. When zeroing out such
reports, take the split into consideration.
v2: Use OA_BUFFER_SIZE (Ashutosh)
Fixes: 09a36015d9a0 ("drm/i915/perf: Clear out entire reports after reading if not power of 2 size")
Signed-off-by: Umesh Nerlige Ramappa <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Ashutosh Dixit <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
GuC load takes longer sometimes due to GT frequency not ramping up.
Add perf_limit_reasons to the existing warn print to see if frequency
is being throttled.
v2: Review comments (Ashutosh)
Signed-off-by: Vinay Belgaumkar <[email protected]>
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Commit 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence
registers across reset") removed the temporary implementation of a reset
under stop machine but forgot to remove this one commented out define.
Signed-off-by: Tvrtko Ursulin <[email protected]>
Reviewed-by: Jani Nikula <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
When checking if the workarounds were applied successfully, the
read-back mask should also contain the bits being set: it's possible
that in a call to wa_write_clr_set(), the cleared bits are not a
superset of the set bits.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The comment on the parameter being 0 to avoid the read back doesn't
apply as this is not a call to wa_add(), but rather to
wa_write_clr_set(). So, this register is actually checked and it's
according to the Bspec that the register is RW, not RO.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Contrary to GEN12_FF_MODE2, platforms using XEHP_FF_MODE2 are not
affected by Wa_1608008084, hence read back can be enabled.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Now that non-masked registers are already read before programming the
context reads, the additional read became redudant, so remove it.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Most of the context workarounds tweak masked registers, but not all. For
masked registers, when writing the value it's sufficient to just write
the wa->set_bits since that will take care of both the clr and set bits
as well as not overwriting other bits.
However there are some workarounds, the registers are non-masked. Up
until now the driver was simply emitting a MI_LOAD_REGISTER_IMM with the
set_bits to program the register via the GPU in the WA bb. This has the
side effect of overwriting the content of the register outside of bits
that should be set and also doesn't handle the bits that should be
cleared.
Kenneth reported that on DG2, mesa was seeing a weird behavior due to
the kernel programming of L3SQCREG5 in dg2_ctx_gt_tuning_init(). With
the GPU idle, that register could be read via intel_reg as 0x00e001ff,
but during a 3D workload it would change to 0x0000007f. So the
programming of that tuning was affecting more than the bits in
L3_PWM_TIMER_INIT_VAL_MASK. Matt Roper noticed the lack of rmw for the
context workarounds due to the use of MI_LOAD_REGISTER_IMM.
So, for registers that are not masked, read its value via mmio, modify
and then set it in the buffer to be written by the GPU. This should take
care in a simple way of programming just the bits required by the
tuning/workaround. If in future there are registers that involved that
can't be read by the CPU, a more complex approach may be required like
a) issuing additional instructions to read and modify; or b) scan the
golden context and patch it in place before saving it; or something
else. But for now this should suffice.
Scanning the context workarounds for all platforms, these are the
impacted ones with the respective registers
mtl: DRAW_WATERMARK
mtl/dg2: XEHP_L3SQCREG5, XEHP_FF_MODE2
ICL has some non-masked registers in the context workarounds:
GEN8_L3CNTLREG, IVB_FBC_RT_BASE and VB_FBC_RT_BASE_UPPER, but there
shouldn't be an impact. The first is already being manually read and the
other 2 are intentionally overwriting the entire register. Same
reasoning applies to GEN12_FF_MODE2: the WA is intentionally
overwriting all the bits to avoid a read-modify-write.
v2: Reword commit message wrt GEN12_FF_MODE2 and the changed behavior
on preparatory patches.
v3: Also skip reading if clear|set bits covers everything
Cc: Kenneth Graunke <[email protected]>
Cc: Matt Roper <[email protected]>
Link: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23783#note_1968971
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Right now context workarounds don't do a rmw and instead only write to
the register. Since 2 separate programmings to the same register are
coalesced into a single write, this is not problematic for
GEN12_FF_MODE2 since both TDS and GS timer are going to be written
together and the other remaining bits be zeroed.
However in order to fix other workarounds that may want to preserve the
unrelated bits in the same register, context workarounds need to
be changed to a rmw. To prepare for that, move the programming of
GEN12_FF_MODE2 to a single place so the value passed for "clear" can
be all the bits. Otherwise the second workaround would be dropped as
it'd be detected as overwriting a previously programmed workaround.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Move helper function to get all the forcewakes required by the wa list
to the top, so it can be re-used by other functions.
Signed-off-by: Lucas De Marchi <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
kmap() has been deprecated in favor of the kmap_local_page()
due to high cost, restricted mapping space, the overhead of a
global lock for synchronization, and making the process sleep
in the absence of free slots.
kmap_local_page() is faster than kmap() and offers thread-local
and CPU-local mappings, can take pagefaults in a local kmap region
and preserves preemption by saving the mappings of outgoing tasks
and restoring those of the incoming one during a context switch.
The mapping is kept thread local in the function
“i915_vma_coredump_create” in i915_gpu_error.c
Therefore, replace kmap() with kmap_local_page().
Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Sumitra Sharma <[email protected]>
Reviewed-by: Thomas Hellström <[email protected]>
Reviewed-by: Fabio M. De Francesco <[email protected]>
Signed-off-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
[tursulin: Removed blank line within tags. Fixup commit text.]
|
|
This workaround was already implemented for DG2, PVC, and some steppings
of MTL, but the workaround database has now been updated to extend this
workaround to TGL, RKL, DG1, and ADL.
v2:
- Skip readback verification for these extra gen12lp platforms. On
some of the platforms, the firmware locks this register, preventing
the driver from making any modifications. We should still try to
apply the workaround, but if the register is locked and the value
doesn't stick, that's semi-expected and not something we want to flag
as a driver error on debug builds.
Signed-off-by: Matt Roper <[email protected]>
Reviewed-by: Haridhar Kalvala <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
WA 14018778641 needs an update after recent
performance data on MTL, aligning driver here with
HW WA update.
Signed-off-by: Tejas Upadhyay <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Andi Shyti <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The scenario being fixed here is depicted in the following sequence-
modprobe i915
echo 1 > /sys/class/drm/card0/gt/gt0/slpc_ignore_eff_freq
echo 300 > /sys/class/drm/card0/gt_min_freq_mhz (RPn)
cat /sys/class/drm/card0/gt_cur_freq_mhz --> cur == RPn as expected
echo 1 > /sys/kernel/debug/dri/0/gt0/reset --> reset
cat /sys/class/drm/card0/gt_min_freq_mhz --> cached freq is RPn
cat /sys/class/drm/card0/gt_cur_freq_mhz --> it's not RPn, but RPe!!
When SLPC reinitializes, it sets SLPC min freq to efficient frequency.
Even if we disable efficient freq post that, we should restore the cached
min freq (via H2G) for it to take effect.
v2: Clarify commit message (Ashutosh)
Fixes: 95ccf312a1e4 ("drm/i915/guc/slpc: Allow SLPC to use efficient frequency")
Reviewed-by: Ashutosh Dixit <[email protected]>
Signed-off-by: Vinay Belgaumkar <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
intel_gsc_uc_fw_proxy_init_done is used by a few code paths
and usages. However, certain paths need a wakeref while others
can't take a wakeref such as from the runtime_pm_resume callstack.
Add a param into this helper to allow callers to direct whether
to take the wakeref or not. This resolves the following bug:
INFO: task sh:2607 blocked for more than 61 seconds.
Not tainted 6.3.0-pxp-gsc-final-jun14+ #297
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:sh state:D stack:13016 pid:2607 ppid:2602 flags:0x00004000
Call Trace:
<TASK>
__schedule+0x47b/0xe10
schedule+0x58/0xd0
rpm_resume+0x1cc/0x800
? __pfx_autoremove_wake_function+0x10/0x10
__pm_runtime_resume+0x42/0x80
__intel_runtime_pm_get+0x19/0x80 [i915]
gsc_uc_get_fw_status+0x10/0x50 [i915]
intel_gsc_uc_fw_init_done+0x9/0x20 [i915]
intel_gsc_uc_load_start+0x5b/0x130 [i915]
__uc_resume+0xa5/0x280 [i915]
intel_runtime_resume+0xd4/0x250 [i915]
? __pfx_pci_pm_runtime_resume+0x10/0x10
__rpm_callback+0x3c/0x160
Fixes: 8c33c3755b75 ("drm/i915/gsc: take a wakeref for the proxy-init-completion check")
Signed-off-by: Alan Previn <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The mmap_offset_attach() function returns error pointers, it doesn't
return NULL.
Fixes: eaee1c085863 ("drm/i915: Add a function to mmap framebuffer obj")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Nirmoy Das <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Signed-off-by: Nirmoy Das <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/ZH7tHLRZ9oBjedjN@moroto
|
|
The function is only defined if CONFIG_PROC_FS is enabled:
ld.lld: error: undefined symbol: i915_drm_client_fdinfo
>>> referenced by i915_driver.c
>>> drivers/gpu/drm/i915/i915_driver.o:(i915_drm_driver) in archive vmlinux.a
Use the PTR_IF() helper to make the reference NULL otherwise.
Fixes: e894b724c316d ("drm/i915: Use the fdinfo helper")
Signed-off-by: Arnd Bergmann <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Acked-by: Randy Dunlap <[email protected]>
Tested-by: Randy Dunlap <[email protected]> # build-tested
Signed-off-by: Tvrtko Ursulin <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Smatch warns:
drivers/gpu/drm/i915/gt/uc/intel_huc.c:388
intel_huc_init() warn: missing error code 'err'
When the allocation of VMAs fail: The value of err is zero at this
point and it is passed to PTR_ERR and also finally returning zero which
is success instead of failure.
Fix this by adding the missing error code when VMA allocation fails.
Fixes: 08872cb13a71 ("drm/i915/mtl/huc: auth HuC via GSC")
Signed-off-by: Harshit Mogalapalli <[email protected]>
Reviewed-by: Daniele Ceraolo Spurio <[email protected]>
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
Add a new debugfs to dump information about the GSC. This includes:
- the FW path and SW tracking status;
- the release, security and compatibility versions;
- the HECI1 status registers.
Note that those are the same registers that the mei driver dumps in
their own status sysfs on DG2 (where mei owns the GSC).
To make it simpler to loop through the status register, the code has
been update to use a PICK macro and the existing code using the regs had
been adapted to match.
v2: fix includes and copyright dates (Alan)
v3: actually fix the includes
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Alan Previn <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The compatibility version is queried via an MKHI command. Right now, the
only existing interface is 1.0
This is basically the interface version for the GSC FW, so the plan is
to use it as the main tracked version, including for the binary naming
in the fetch code.
v2: use define for the object size (Alan)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Alan Previn <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|
|
The release and security versions of the GSC binary are not used at
runtime to decide interface compatibility (there is a separate version
for that), but they're still useful for debug, so it is still worth
extracting them and printing them out in dmesg.
To get to these version, we need to navigate through various headers in
the binary. See in-code comment for details.
v2: fix and improve size checks when crawling the binary header, add
comment about the different version, wrap the partition base/offset
pairs in the GSC header in a struct (Alan)
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Alan Previn <[email protected]>
Reviewed-by: Alan Previn <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
|