aboutsummaryrefslogtreecommitdiff
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorFilesLines
2023-12-21drm/xe: clear the serviced bits on INTR_IDENTITY_REGJonathan Cavitt1-1/+1
The spec for this register, like many other interrupt related ones, asks software to write back '1' to clear the serviced bits. Let's respect the spec. v2: - Update commit message - Add missing CC Signed-off-by: Jonathan Cavitt <[email protected]> CC: Daniele Spurio Ceraolo <[email protected]> CC: Lucas De Marchi <[email protected]> CC: Rodrigo Vivi <[email protected]> CC: Paulo Zanoni <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: move the lmem verification code into a separate functionKoby Elbaz1-11/+21
If lmem (VRAM) is not fully initialized, the punit will power down the GT, which will prevent register access from the driver side. That code moved into a corresponding function (xe_verify_lmem_ready) to make the code clearer. Signed-off-by: Koby Elbaz <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Fix unbind of unaccessed VMA (fault mode)Brian Welty1-4/+4
In fault mode, page table binding is deferred until fault handler. Thus vma->tile_present will be unset unless the VMA is accessed by GPU. During a later unbind, the logic doesn't account for the fact that local fence variable will be NULL in this case, leading to pass NULL into dma_fence_add_callback() and causing few WARN_ONs to print to console. The fix is already present in the code, just hoist the fence variable computation to be done earlier. Resolves warnings seen with igt@xe_exec_fault_mode@once-invalid-fault Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xelpmp: Add Wa_16021867713Gustavo Sousa2-0/+9
This workaround applies to all steppings of Xe_LPM+. Implement the KMD part. Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Gustavo Sousa <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Update SPDX deprecated license identifierThomas Hellström5-5/+5
The "GPL-2.0" SPDX license identifier is deprecated. Update the code to use "GPL-2.0-only" instead. Choose this identifier over "GPL-2.0-or-later" since it's the most restrictive of the two and it's not fully clear that "GPL-2.0" also allows "GPL-2.0-or-later". Cc: Francois Dugast <[email protected]> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Thomas Hellström <[email protected]> Reviewed-by: Francois Dugast <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Fix pagefault and access counter worker functionsBrian Welty1-34/+48
When processing G2H messages for pagefault or access counters, we queue a work item and call queue_work(). This fails if the worker thread is already queued to run. The expectation is that the worker function will do more than process a single item and return. It needs to either process all pending items or requeue itself if items are pending. But requeuing will add latency and potential context switch can occur. We don't want to add unnecessary latency and so the worker should process as many faults as it can within a reasonable duration of time. We also do not want to hog the cpu core, so here we execute in a loop and requeue if still running after more than 20 ms. This seems reasonable framework and easy to tune this futher if needed. This resolves issues seen with several igt@xe_exec_fault_mode subtests where the GPU will hang when KMD ignores a pending pagefault. v2: requeue the worker instead of having an internal processing loop. v3: implement hybrid model of v1 and v2 now, run for 20 msec before we will requeue if still running v4: only requeue in worker if queue is non-empty (Matt B) Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Reviewed-by: Stuart Summers <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: implement driver initiated function-resetAndrzej Hajda4-0/+90
Driver initiated function-reset (FLR) is the highest level of reset that we can trigger from within the driver. In contrast to PCI FLR it doesn't require re-enumeration of PCI BAR. It can be useful in case GT fails to reset. It is also the only way to trigger GSC reset from the driver and can be used in future addition of GSC support. v2: - use regs from xe_regs.h - move the flag to xe.mmio - call flr only on root gt - use BIOS protection check - copy/paste comments from i915 v3: - flr code moved to xe_device.c v4: - needs_flr_on_fini moved to xe_device Signed-off-by: Andrzej Hajda <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: stringify the argument to avoid potential vulnerabilityCarlos Santa1-1/+1
This error gets printed inside a sandbox with warnings turned on. /mnt/host/source/src/third_party/kernel/v5.15/drivers/ gpu/drm/xe/xe_gt_idle_sysfs.c:87:26: error: format string is not a string literal (potentially insecure) [-Werror,-Wformat-security] return sysfs_emit(buff, gtidle->name); ^~~~~~~~~~~~ /mnt/host/source/src/third_party/kernel/v5.15/drivers/ gpu/drm/xe/xe_gt_idle_sysfs.c:87:26: note: treat the string as an argument to avoid this return sysfs_emit(buff, gtidle->name); ^ "%s", 1 error generated. CC: Rodrigo Vivi <[email protected]> Signed-off-by: Carlos Santa <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Add Wa_14019821291Matt Roper4-1/+29
This workaround is primarily implemented by the BIOS. However if the BIOS applies the workaround it will reserve a small piece of our DSM (which should be at the top, right below the WOPCM); we just need to keep that region reserved so that nothing else attempts to re-use it. v2 (Gustavo): - Check for NULL media_gt - Mask bits [5:0] to avoid potential issues in future platforms Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Gustavo Sousa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/mtl: Use 16.67 Mhz freq scale factor to get rpXBadal Nilawar1-5/+5
For mtl and above 16.67 Mhz is the scale factor to calculate rpX frequencies. v2: Fix review comment (Ashutosh) Signed-off-by: Badal Nilawar <[email protected]> Reviewed-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xe2: Program correct MOCS registersMatt Roper1-11/+13
The LNCFCMOCS registers no longer exist on Xe2 so there's no need to attempt to program them. Since GLOB_MOCS is the only set of MOCS registers now, it's expected to be used for all platforms (both igpu and dgpu) going forward, so adjust the MOCS programming flags accordingly. v2: - Fix typo (global mocs condition is >=, not >) Bspec: 71582 Reviewed-by: Pallavi Mishra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Fix dequeue of access counter work itemBrian Welty1-2/+5
The access counters worker function is fixed to advance the head pointer when dequeuing from the acc_queue. This now matches the similar logic in get_pagefault(). Signed-off-by: Bruce Chang <[email protected]> Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Stuart Summers <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Adjust tile_present mask when skipping rebindsMatthew Brost1-2/+13
If a rebind is skipped the tile_present mask needs to be updated for the newly created vma to properly reflect the state of the vma. Reported-by: <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Dump CTB during TLB timeoutPallavi Mishra1-0/+2
Print CTB info during TLB invalidation timeout event. Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Pallavi Mishra <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Replace usage of mem_type_to_tileBrian Welty1-25/+38
Currently mem_type_to_tile() is being used to access the tile's underlying tile.mem.vram. However, this function makes the assumption that a mem_type will only ever map to a single tile. Now that the TTM vram manager contains a pointer to the memory_region, make use of this in xe_bo.c. As such, introduce a helper function res_to_mem_region() to get the ttm_vram_mgr->vram from the BO's resource, and use this to replace usage of mem_type_to_tile(). xe_tile is still needed to choose the migration context, so this part is unchanged. But as this is only renaming usage, function is renamed now to mem_type_to_migrate(). Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Remove unused xe_bo_to_tileBrian Welty2-15/+0
Unused and would like to remove the memtype_to_tile() which it calls. Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Replace xe_ttm_vram_mgr.tile with xe_mem_regionBrian Welty2-7/+6
Replace the xe_ttm_vram_mgr.tile pointer with a xe_mem_region pointer instead. The former is currently unused. TTM VRAM regions are exposing device vram and is better to store a pointer directly to the xe_mem_region instead of the tile. This allows to cleanup unnecessary usage of xe_tile from xe_bo.c in later patch. Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/hwmon: Expose power1_max_intervalBadal Nilawar2-1/+160
Expose power1_max_interval, that is the tau corresponding to PL1, as a custom hwmon attribute. Some bit manipulation is needed because of the format of PKG_PWR_LIM_1_TIME in PACKAGE_RAPL_LIMIT register (1.x * power(2,y)) v2: Get rpm wake ref while accessing power1_max_interval v3: %s/hwmon/xe_hwmon/ v4: - As power1_max_interval is rw attr take lock in read function as well - Refine comment about val to fix point conversion (Andi) - Update kernel version and date in doc v5: Fix review comments (Anshuman) Acked-by: Rodrigo Vivi <[email protected]> Reviewed-by: Himal Prasad Ghimiray <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Badal Nilawar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/hwmon: Protect hwmon rw attributes with hwmon_lockBadal Nilawar1-11/+24
Take hwmon_lock while accessing hwmon rw attributes. For readonly attributes its not required to take lock as reads are protected by sysfs layer and therefore sequential. Cc: Ashutosh Dixit <[email protected]> Cc: Anshuman Gupta <[email protected]> Signed-off-by: Badal Nilawar <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/hwmon: Add kernel doc and refactor xe hwmonBadal Nilawar1-110/+91
Add kernel doc and refactor some of the hwmon functions, there is no functionality change. Cc: Anshuman Gupta <[email protected]> Cc: Ashutosh Dixit <[email protected]> Signed-off-by: Badal Nilawar <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/bo: sync kernel fences for KMD buffersMatthew Auld1-0/+31
With things like pipelined evictions, VRAM pages can be marked as free and yet still have some active kernel fences, with the idea that the next caller to allocate the memory will respect them. However it looks like we are missing synchronisation for KMD internal buffers, like page-tables, lrc etc. For userspace objects we should already have the required synchronisation for CPU access via the fault handler, and likewise for GPU access when vm_binding them. To fix this synchronise against any kernel fences for all KMD objects at creation. This should resolve some severe corruption seen during evictions. v2 (Matt B): - Revamp the comment explaining this. Also mention why USAGE_KERNEL is correct here. v3 (Thomas): - Make sure to use ctx.interruptible for the wait. Testcase: igt@xe-evict-ccs Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/853 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/855 Reported-by: Zbigniew Kempczyński <[email protected]> Signed-off-by: Matthew Auld <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Tested-by: Zbigniew Kempczyński <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/bo: consider dma-resv fences for clear jobMatthew Auld1-2/+14
There could be active fences already in the dma-resv for the object prior to clearing. Make sure to input them as dependencies for the clear job. v2 (Matt B): - We can use USAGE_KERNEL here, since it's only the move fences we care about here. Also add a comment. Signed-off-by: Matthew Auld <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/migrate: fix MI_ARB_ON_OFF usageMatthew Auld2-16/+2
Spec says: "This is a privileged command; it will not be effective (will be converted to a no-op) if executed from within a non-privileged batch buffer." However here it looks like we are just emitting it inside some bb which was jumped to via the ppGTT, which should be considered a non-privileged address space. It looks like we just need some way of preventing things like the emit_pte() and later copy/clear being preempted in-between so rather just emit directly in the ring for migration jobs. Bspec: 45716 Signed-off-by: Matthew Auld <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matt Roper <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xelpmp: Extend Wa_22016670082 to Xe_LPM+Shekhar Chauhan2-0/+8
Add Xe_LPM+ support to an existing workaround. BSpec: 51762 Signed-off-by: Shekhar Chauhan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xe_exec_queue: Add check for access counter granularityPriyanka Dandamudi1-0/+3
Add conditional check for access counter granularity. This check will return -EINVAL if granularity is beyond 64M which is a hardware limitation. v2: Defined XE_ACC_GRANULARITY_128K 0 XE_ACC_GRANULARITY_2M 1 XE_ACC_GRANULARITY_16M 2 XE_ACC_GRANULARITY_64M 3 as part of uAPI. So, that user can also use it.(Oak) v3: Move uAPI to proper location and give proper documentation.(Brian, Oak) Cc: Oak Zeng <[email protected]> Cc: Janga Rahul Kumar <[email protected]> Cc: Brian Welty <[email protected]> Signed-off-by: Priyanka Dandamudi <[email protected]> Reviewed-by: Oak Zeng <[email protected]> Reviewed-by: Oak Zeng <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Fix WA 14010918519 write to wrong registerLucas De Marchi1-2/+2
FORCE_SLM_FENCE_SCOPE_TO_TILE and FORCE_UGM_FENCE_SCOPE_TO_TILE are in the up dword of LSC_CHICKEN_BIT_0 register. Also, the 14010918519 workaround only applies to early steppings, A*. Eventually those should be dropped, like they were in commit eaeb4b361452 ("drm/i915/dg2: Drop pre-production GT workarounds"), so let's make sure they are annotated appropriately. Reviewed-by: Gustavo Sousa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/huc: Define HuC for MTLDaniele Ceraolo Spurio1-6/+7
MTL uses a versionless GSC-enabled binary. v2: don't use the filename to identify the header type (Lucas) v3: fix commit msg (Lucas) Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Alan Previn <[email protected]> Cc: John Harrison <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/huc: Don't re-auth HuC if it's already authenticatedDaniele Ceraolo Spurio1-0/+6
On newer platforms the HuC survives reset and stays authenticated, so no need to re-authenticate it. Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Alan Previn <[email protected]> Cc: John Harrison <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/huc: HuC is not supported on GTs that don't have video enginesDaniele Ceraolo Spurio1-1/+10
On MTL-style multi-gt platforms, the HuC is only available on the media GT, so we need to consider it as not supported on the render GT. Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Alan Previn <[email protected]> Cc: John Harrison <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/huc: Extract version and binary offset from new HuC headersDaniele Ceraolo Spurio4-3/+241
The GSC-enabled HuC binary starts with a GSC header, which is followed by the legacy-style CSS header and the binary itself. We can parse the GSC headers to find the HuC version and the location of the binary to be used for the DMA transfer. The parsing function has been designed to be re-used for the GSC binary, so the entry names are external parameters (because the GSC uses different ones) and the CSS entry is optional (because the GSC doesn't have it). v2: move new code to uc_fw.c, better comments and error checking, split old code move to separate patch (Lucas), move headers and documentation to uc_fw_abi.h. v3: use 2 separate loops, rework marker check (Lucas) Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Alan Previn <[email protected]> Cc: John Harrison <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/uc: Prepare for parsing of different header typesDaniele Ceraolo Spurio2-53/+70
GSC binaries and newer HuC ones use GSC-style headers instead of the CSS. In preparation for adding support for such parsing, split out the current parsing code to its own function, to make it cleaner to add the new paths. The existing doc section has also been renamed to narrow it to CSS-based binaries. v2: new patch in series, split out from next patch for easier reviewing v3: drop unneeded include (Lucas) Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Alan Previn <[email protected]> Cc: John Harrison <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Extend rpX values extraction for future platformsBadal Nilawar1-3/+3
In existing code flow for future platforms i.e. >1270, the rpX (rp0,rpn and rpe) fused values are read from gen 6 registers. Which is not correct. Unless specified gen 1270 regs should be valid for gen 1270+ platforms as well. Signed-off-by: Badal Nilawar <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/mocs: MOCS registers are multicast on Xe_HP and beyondMatt Roper3-11/+20
The MOCS registers should be written in an MCR-specific manner on Xe_HP and beyond to prevent any other driver threads or external firmware from putting the hardware into unicast mode while we initialize the MOCS table. Bspec: 66534, 67609, 71185 Cc: Ruthuvikas Ravikumar <[email protected]> Reviewed-by: Gustavo Sousa <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xe2: Update SVG state handlingMatt Roper2-2/+14
As with DG2/MTL, Xe2 also fails to emit instruction headers for SVG state instructions if no explicit state has been set. The SVG part of the LRC is nearly identical to DG2/MTL; the only change is that 3DSTATE_DRAWING_RECTANGLE has been replaced by 3DSTATE_DRAWING_RECTANGLE_FAST, so we can just re-use the same state table and handle that single instruction when we encounter it. Bspec: 65182 Reviewed-by: Balasubramani Vivekanandan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Emit SVG state on RCS during driver load on DG2 and MTLMatt Roper3-1/+162
When recording the default LRC, the expectation is that the hardware's original state settings (both register and instruction) will be written out to the LRC upon first context switch. For many 3DSTATE_* state instructions that don't truly have "default" values, this translates to a simple instruction header (opcodes + dword length) being written to the LRC, followed by an appropriate number of blank dwords as a place holder. When userspace creates a context (which starts as a copy of the default LRC), they'll generally emit real 3DSTATE_* as part of their initialization to select the settings they desire. If they don't emit one of the 3DSTATE instructions, then the zeroed dwords that remain in their LRC image generally translate to various state remaining disabled. This will either be what userspace wants or will lead to very reproducible and easily-debugged problems (rendering glitches, engine hangs). It turns out that a subset of the 3DSTATE instructions, specifically those belonging to the SVG (State Variable - Global) unit, are not only emitting 0's for the instruction's "body" dwords, but also for the instruction header dword if no specific state has been explicitly set before context switch. This means that when the hardware switches to a context that hasn't explicitly provided an appropriate state setting, the hardware will just see a sequence of NOOPs in the spot reserved for that 3DSTATE instruction while executing the LRC, and the actual hardware state setting will unintentionally inherit the configuration used by the previously running context. Now when userspace makes a mistake and forgets to emit an important state instruction they no longer get consistent, easily-reproducible corruption/hangs, but rather erratic behavior where the presence/absence of a problem depends on what other workloads are running on the system and what order the contexts are scheduled on the engine. A specific example of this that came up recently related to mesh shading The OpenGL driver was not specifically emitting a 3DSTATE_MESH_CONTROL to disable mesh shading at context init, so on context switch, mesh shading would either be on or off depending on what the previous context had been doing. Vulkan apps _were_ enabling mesh shading, so running a Vulkan app and then context switching to an OpenGL app resulted in mesh shading still unexpectedly being enabled during OpenGL operation, and since other Mesh-related state was not properly initialized for that context a GPU hang was seen. Due to the specific ordering requirements (Vulkan app runs first, followed by OpenGL app), it took additional debug effort to track down the cause of the problem. There are various workarounds related to this behavior, with current implementations handled in the userspace drivers. E.g., Wa_14019789679 and Wa_22018402687. However it's been suggested that the kernel driver can help simplify things here by emitting zeroed SVG state with proper instruction headers as part of our default context creation (i.e., at the same point we apply LRC workarounds). This will help ensure that any future cases where a userspace driver does not emit an important state setting will result in consistent behavior. Bspec: 46261 Reviewed-by: Balasubramani Vivekanandan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Prepare to emit non-register state while recording default LRCMatt Roper3-1/+57
On some platforms we need to emit some non-register state while recording an engine class' default LRC. Add the infrastructure to support this; actual per-platform tables will be added in future patches. v2: - Checkpatch whitespace fix - Add extra assertion to ensure num_dw != 0. (Bala) Reviewed-by: Balasubramani Vivekanandan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/trace: Optimize trace definitionBalasubramani Vivekanandan1-47/+36
Make use of EVENT_CLASS to group similar trace events Signed-off-by: Balasubramani Vivekanandan <[email protected]> Reviewed-by: Haridhar Kalvala <[email protected]> Link: https://lore.kernel.org/intel-xe/[email protected]/ Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Add event tracing for CTBBalasubramani Vivekanandan2-2/+52
Event tracing enabled for CTB submissions. Additional minor refactor - Removed a unnecessary ct_to_xe() call. v2: Remove a unwanted comment (Hari) Add missing change to commit message Signed-off-by: Balasubramani Vivekanandan <[email protected]> Reviewed-by: Haridhar Kalvala <[email protected]> Link: https://lore.kernel.org/intel-xe/[email protected]/ Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Add performance tuning settings for MTL and Xe2Shekhar Chauhan2-0/+25
Add L3SQCREG5 as part of HW recommended settings. The recommended value in Bspec is 00e0007f. For Xe2-LPG, bits 23:21 don't exist anymore, but it's confirmed with HW engineers that setting them doesn't do anything. They still exist on the media GT, Xe2-LPM, but they are already they are already set as per HW default value. So for Xe2 platform, the only bits that need to be set are 9:0 since HW's default is 0x1ff and the recommended value is 0x7f. Unlike most registers, which have the same relative offset on both the primary and media GT, this register has a different base offset on the media GT. On MTL the register only exists for the primary (graphics) GT, so there's no need to program it on the media gt. Also, it's part of the RCS engine's context, so it needs to be added as a LRC workaround. Bspec: 72161 Signed-off-by: Shekhar Chauhan <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/xe2: Add initial workaroundsDnyaneshwar Bhadane2-0/+98
Add the initial collection of gt/engine/lrc workarounds. While at it, add some newlines around the platform/IP comments to make them consistent across all workarounds. v2: - FF_MODE is an MCR register (Matt Roper) - Group 18032247524 with other Xe2 workarounds (Matt Roper) - Move WA changing PSS_CHICKEN to lrc_was[] as for Xe2 that register is part of the render context image (Matt Roper) - Apply WA 16020518922 only on render engine (Matt Roper) Signed-off-by: Dnyaneshwar Bhadane <[email protected]> Signed-off-by: Shekhar Chauhan <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Simplify xe_res_get_buddy()Brian Welty1-6/+1
We can remove the unnecessary indirection thru xe->tiles[] to get the TTM VRAM manager. This code can be common for VRAM and STOLEN. Signed-off-by: Brian Welty <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: fix pat[2] programming with 2M/1G pagesMatthew Auld2-6/+12
Bit 7 in the leaf node is normally programmed with pat[2], however with 2M/1G pages that same bit in the PDE/PDPE also toggles 2M/1G pages. For 2M/1G entries the pat[2] is rather moved to bit 12, which is now free given that the address must be aligned to 2M or 1G, leaving bit 7 for toggling 2M/1G pages. Bspec: 59510, 45038 Signed-off-by: Matthew Auld <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: Matt Roper <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/guc: Bump PVC GuC version to 70.9.1Daniele Ceraolo Spurio1-1/+1
The PVC GuC version that we're currently using (70.6.4) has a known issue that leads to dropping the disabling of contexts that have pending page faults. This is fixed in newer blobs, so we need to update to a more recent release. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/696 Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: John Harrison <[email protected]> Reviewed-by: John Harrison <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/uapi: Fix naming of XE_QUERY_CONFIG_MAX_EXEC_QUEUE_PRIORITYFrancois Dugast1-1/+1
This is used for the priority of an exec queue (not an engine) and should be named accordingly. Signed-off-by: Francois Dugast <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: Matthew Brost <[email protected]>
2023-12-21drm/xe/uapi: Rename gts to gt_listRodrigo Vivi1-20/+20
During the uapi review it was identified a possible confusion with the plural of acronym with a new acronym. So the recommendation is to go with gt_list instead. Suggested-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Francois Dugast <[email protected]>
2023-12-21drm/xe/uapi: Replace useless 'instance' per unique gt_idRodrigo Vivi3-6/+2
Let's have a single GT ID per GT within the PCI Device Card. Signed-off-by: Rodrigo Vivi <[email protected]> Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]>
2023-12-21drm/xe: Fix VM bind out-sync signaling orderingMatthew Brost4-8/+125
A case existed where an out-sync of a later VM bind operation could signal before a previous one if the later operation results in a NOP (e.g. a unbind or prefetch to a VA range without any mappings). This breaks the ordering rules, fix this. This patch also lays the groundwork for users to pass in num_binds == 0 and out-syncs. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Remove async worker and rework sync bindsMatthew Brost8-491/+121
Async worker is gone. All jobs and memory allocations done in IOCTL to align with dma fencing rules. Async vs. sync now means when do bind operations complete relative to the IOCTL. Async completes when out-syncs signal while sync completes when the IOCTL returns. In-syncs and out-syncs are only allowed in async mode. If memory allocations fail in the job creation step the VM is killed. This is temporary, eventually a proper unwind will be done and VM will be usable. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe/uapi: Kill DRM_XE_UFENCE_WAIT_VM_ERRORMatthew Brost3-52/+5
This is not used nor does it align VM async document, kill this. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Thomas Hellström <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21drm/xe: Kill XE_VM_PROPERTY_BIND_OP_ERROR_CAPTURE_ADDRESS extensionRodrigo Vivi1-126/+3
This extension is currently not used and it is not aligned with the error handling on async VM_BIND. Let's remove it and along with that, since it was the only extension for the vm_create, remove VM extension entirely. v2: rebase on top of the removal of drm_xe_ext_exec_queue_set_property Cc: Thomas Hellström <[email protected]> Signed-off-by: Francois Dugast <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: Matthew Brost <[email protected]>