aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2023-12-19drm/xe/guc: Remove i915_regs.h includeLucas De Marchi1-1/+1
i915_regs.h is not needed, particularly in a header file. What is needed is i915_reg_defs.h for use of _MMIO() and similar macros. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove outdated build workaroundLucas De Marchi1-8/+2
Use the more common "call cc-disable-warning" way to disable warnings. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove duplicate media_verLucas De Marchi3-4/+1
media_verx100 supersedes the info from media_ver. Leave media_ver in the struct xe_device_desc, used in xe_pci.c since it's easier to define common parts of the platforms like that. However all the rest of the driver should be using media_verx100 that is more future proof. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/216 Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Add missing include xe_wait_user_fence.hLucas De Marchi1-0/+2
Make xe_wait_user_fence.c include xe_wait_user_fence.h so it doesn't rely on indirect includes and also doesn't fail the build due to missing prototype for xe_wait_user_fence_ioctl(). Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Add missing doc for xe parameterLucas De Marchi1-0/+1
Fix the following warning: ../drivers/gpu/drm/xe/xe_ttm_stolen_mgr.c:55: warning: Function parameter or member 'xe' not described in 'xe_ttm_stolen_cpu_inaccessible' Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove unused functionsLucas De Marchi1-25/+0
xe_gt_topology_dss_group_mask and xe_gt_topology_count_dss are probably leftover from initial implementation - they are not called from anywhere. Remove those functions. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Fix application of LRC tuningsLucas De Marchi3-1/+17
LRC tunings were added after the gt ones and didn't add the call in xe_gt_record_default_lrcs() to process them like is done for workarounds. Add such a function and call it from xe_gt_record_default_lrcs(). Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Make local functions staticLucas De Marchi4-9/+9
A few static functions not being declared like that break the build with W=1, like e.g. cc1: all warnings being treated as errors make[2]: *** [../scripts/Makefile.build:250: drivers/gpu/drm/xe/xe_gt.o] Error 1 ../drivers/gpu/drm/xe/xe_guc.c:240:6: error: no previous prototype for ‘guc_write_params’ [-Werror=missing-prototypes] 240 | void guc_write_params(struct xe_guc *guc) | ^~~~~~~~~~~~~~~~ Make them static. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/query: zero the region infoMatthew Auld1-5/+1
There are also some reserved fields in here which are not currently cleared when handing back to userspace. Otherwise we might run into issues if we later wish to use them. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/stolen: don't map stolen on small-barMatthew Auld3-14/+37
The driver should still be functional with small-bar, just that the vram size is clamped to the BAR size (until we add proper support for tiered vram). For stolen vram we shouldn't iomap anything if the BAR size doesn't also contain the stolen portion, since on discrete the stolen portion is always at the end of normal vram. Stolen should still be functional, just that allocating CPU visible io memory will always return an error. v2 (Lucas) - Mention in the commit message that stolen vram is always as the end of normal vram, which is why stolen in not mappable on small-bar systems. - Just make xe_ttm_stolen_inaccessible() return true for such cases. Also rename to xe_ttm_stolen_cpu_inaccessible to better describe that we are talking about direct CPU access. Plus add some kernel-doc. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/209 Reported-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/mmio: fix forcewake ref leak in xe_mmio_ioctlMatthew Auld1-14/+14
Make sure we properly release the forcewake ref on all error paths. v2(Lucas): - Make it less verbose and just fold the unimplemented options into the default. The exact return value doesn't seem to matter for the corresponding IGT. - Replace the user triggerable WARN() with drm_dbg(). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove unseless xe_force_wake_prune.Rodrigo Vivi3-19/+0
(!(gt->info.engine_mask & BIT(i))) cases are already handled in the init function. And these masks are not modified between the init and the prune. Suggested-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
2023-12-19drm/xe: Update the list of devices to add even more TGL devicesCarlos Santa2-3/+3
The list of GTs got splitted a while back between GT1 and GT2 on TGL. References: https://patchwork.freedesktop.org/patch/388414/ CC: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Carlos Santa <carlos.santa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Initialize ret in mcr_lock()José Roberto de Souza1-1/+1
ret is not initialized in mcr_lock() when running in platforms with graphics IP version < 1270, this could cause drm_WARN_ON_ONCE() to hit eventually(what just happened to me). Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/rtp: Support multiple actions per entryLucas De Marchi6-135/+179
Just like there is support for multiple rules per entry in an rtp table, also support multiple actions. This makes it easier to add support for workarounds that need to change multiple registers. It also makes it slightly more readable as now the action part resembles the rule part. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/rtp: Split action and entry flagsLucas De Marchi5-46/+80
Entry flags is meant for the whole entry, including the rule evaluation. Action flags are for flags applied to the register or action being taken. Since there's only one action per entry, the distinction was not important and a u8 was spared. However more and more workarounds are needing multiple actions. This prepares for multiple action support. Right now there are these action flags: - XE_RTP_ACTION_FLAG_MASKED_REG: register in the action is a masked register - XE_RTP_ACTION_FLAG_ENGINE_BASE: the engine base should be added to the register in order to form the real address And this entry flag: - XE_RTP_ENTRY_FLAG_FOREACH_ENGINE: the rules should be evaluated for each engine on the gt. It also automatically implies XE_RTP_ACTION_FLAG_ENGINE_BASE. Since there are likely not that many rules, reduce n_rules to u8 so the overall entry size doesn't increase more than needed. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Rename xe_rtp_regval to xe_rtp_actionLucas De Marchi3-24/+27
It's true that the struct records the register and the value (in form of 2 masks) to restore, but it also records more fields important to the application of workarounds/tuning, etc. One important part is what is the macro used to record these fields: SET/CLR/WR/FIELD_SET/etc. Thinking of the table as a set of rules + actions is more intuitive than rules + regval. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/mcr: Add SQIDI steering for DG2Lucas De Marchi4-32/+59
Like detailed in commit 927dfdd09d8c ("drm/i915/dg2: Add SQIDI steering"), some registers are expected to have the selector initialized just once and never set to anything else. For xe, the registers with SQIDI replication type (SF and MCFG) were missing, resulting in warnings like: [ 410.685565] xe 0000:03:00.0: Did not find MCR register 0x8724 in any MCR steering table While adding these registers, abstract the handling for "dg2_gam_ranges", moving them together with SF/MCFG to a dedicated table. This also avoids that range to be checked for platforms other than DG2. For DG2, this is the new steering output: # cat /sys/kernel/debug/dri/0/gt0/steering ... IMPLICIT steering: group=0x0, instance=0x0 0x000b00 - 0x000bff 0x001000 - 0x001fff 0x004000 - 0x004aff 0x008700 - 0x0087ff 0x00c800 - 0x00cfff 0x00f000 - 0x00ffff Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/mcr: Use designated init for xe_steering_typesLucas De Marchi1-6/+6
There is already a BUILD_BUG_ON() check to make sure the size follow the number of steering types. Also make sure the right index is being used for each steering type. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove TODO from workaround documentationLucas De Marchi1-3/+0
LRC workarounds are already implemented: remove leftover TODO. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Remove TODO from rtp infraLucas De Marchi1-4/+0
The function pointer is already present as match_func, inside struct xe_rtp_rule and handled as so instead of inside rtp_regval as originally thought out when this was written. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Fix xe_tuning includeLucas De Marchi1-1/+1
xe_tuning.c should include xe_tuning.h, not xe_wa.h Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Fix typo in MCR documentationLucas De Marchi1-5/+5
Add missing "multicast" word and adapt/wrap the rest of the sentence. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Add debugfs for dumping GGTT mappingsMaarten Lankhorst3-0/+27
Adding a debugfs dump of GGTT was useful for some debugging I did, and easy to add. Might be useful for others too. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Drop TLB invalidation from ring operationsMatthew Brost1-39/+1
Now that we issue TLB invalidations on unbinds and rebind from execs we no longer need to issue TLB invalidations from the ring operations. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add TLB invalidation fence after rebinds issued from execsMatthew Brost1-90/+110
If we add an TLB invalidation fence for rebinds issued from execs we should be able to drop the TLB invalidation from the ring operations. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add has_asid to device infoMatthew Brost4-4/+9
Rather than alias supports_usm to ASIS support, add an explicit variable to indicate ASID support. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Signal invalidation fence immediately if CT send failsMatthew Brost1-9/+14
This means we are in the middle of a GT reset and no need to do TLB invalidation so just signal invalidation fence immediately. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Propagate VM unbind error to invalidation fenceMatthew Brost1-1/+7
If a VM unbind hits an error, do not issue a TLB invalidation and propagate the error the invalidation fence. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Lock GGTT on when restoring kernel BOsMatthew Brost1-1/+4
Make lockdep happy as we required to hold the GGTT when calling xe_ggtt_map_bo. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Use GuC to do GGTT invalidations for the GuC firmwareMatthew Brost13-21/+89
Only the GuC should be issuing TLB invalidations if it is enabled. Part of this patch is sanitize the device on driver unload to ensure we do not send GuC based TLB invalidations during driver unload. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Propagate error from bind operations to async fenceMatthew Brost1-1/+8
If an bind operation fails we need to report it via the async fence. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add range based TLB invalidationsMatthew Brost5-25/+84
If the platform supports range based TLB invalidations use them. Hide these details in the xe_gt_tlb_invalidation layer. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add has_range_tlb_invalidation device attributeMatthew Brost2-0/+6
This will help implementing range based TLB invalidations. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Delete debugfs entry to issue TLB invalidationMatthew Brost1-24/+0
Not used, let's remove this. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Only set VM->asid for platforms that support a ASIDMatthew Brost1-13/+17
This will help with TLB invalidation as the ASID in TLB invalidate should be zero for platforms that do not support a ASID. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add TDR for invalidation fence timeout cleanupMatthew Brost4-5/+65
Endless fences are not good, add a TDR to cleanup any invalidation fences which have not received an invalidation message within a timeout period. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
2023-12-19drm/xe: Add TLB invalidation fence ftraceMatthew Brost3-0/+60
This will help debug issues with TLB invalidation fences. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Kernel doc GT TLB invalidationsMatthew Brost1-1/+51
Document all exported functions. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Invalidate TLB after unbind is completeMatthew Brost3-0/+107
This gets tricky as we can't do the TLB invalidation until the unbind operation is done on the hardware and we can't signal the unbind as complete until the TLB invalidation is done. To work around this we create an unbind fence which does a TLB invalidation after unbind is done on the hardware, signals on TLB invalidation completion, and this fence is installed in the BO dma-resv slot and installed in out-syncs for the unbind operation. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Add TLB invalidation fenceMatthew Brost8-7/+80
Fence will be signaled when TLB invalidation completion. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Move TLB invalidation variable to own sub-structure in GTMatthew Brost3-23/+25
TLB invalidations no longer just restricted to USM, move the variables to own sub-structure. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Break of TLB invalidation into its own fileMatthew Brost9-99/+146
TLB invalidation is used by more than USM (page faults) so break this code out into its own file. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe: Don't process TLB invalidation done in CT fast-pathMatthew Brost1-1/+8
We can't currently do this due to TLB invalidation done handler expecting the seqno being received in-order, with the fast-path a TLB invalidation done could pass one being processed in the slow-path in an extreme corner case. Remove TLB invalidation done from the fast-path for now and in a follow up reenable this once the TLB invalidation done handler can deal with out of order seqno. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/migrate: Update emit_pte to cope with a size level than 4kMatthew Brost1-11/+6
emit_pte assumes the size argument is 4k aligned, this may not be true for the PTEs emitted for CSS as seen by below call stack: [ 56.734228] xe_migrate_copy:585: size=327680, ccs_start=327680, css_size=1280,4096 [ 56.734250] xe_migrate_copy:643: size=262144 [ 56.734252] emit_pte:404: ptes=64 [ 56.734255] emit_pte:418: chunk=64 [ 56.734257] xe_migrate_copy:650: size=1024 @ CCS emit PTE [ 56.734259] emit_pte:404: ptes=1 [ 56.734261] emit_pte:418: chunk=1 [ 56.734339] xe_migrate_copy:643: size=65536 [ 56.734342] emit_pte:404: ptes=16 [ 56.734344] emit_pte:418: chunk=16 [ 56.734346] xe_migrate_copy:650: size=256 # CCS emit PTE [ 56.734348] emit_pte:404: ptes=1 [ 56.734350] emit_pte:418: chunk=1 [ 56.734352] xe_res_next:174: size=4096, remaining=0 Update emit_pte to handle sizes less than 4k. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/ggtt: fix GGTT scratch usage for DG2Matthew Auld1-10/+18
Scratch page is in VRAM, and therefore requires 64K GTT layout. In GGTT world this just means having 16 consecutive entries, with 64K GTT alignment for the GTT address of the first entry (also matching physical alignment). However to keep things simple just dump it into system memory, like we already do for ppGTT. While we are here, also give it known default value. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/ggtt: fix alignment usage for DG2Matthew Auld2-4/+24
Spec says we need to use 64K VRAM pages for GGTT on platforms like DG2. In GGTT this just means aligning the GTT address to 64K and ensuring that we have 16 consecutive entries each pointing to the respective 4K entry. We already ensure we have 64K pages underneath, so it's just a case of forcing the GTT alignment. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/ppgtt: fix scratch page usage on DG2Matthew Auld1-5/+13
On DG2 when running the xe_vm IGT, the kernel generates loads of CAT errors and GT resets (sometimes at least). On small-bar systems seems to trigger a lot more easily (maybe due to difference in allocation strategy). Appears to be related to scratch, since we seem to use the 64K TLB hint on scratch entries, even though the scratch page is a 4K vram page. Bumping the scratch page size and physical alignment seems to fix it. Or at least we no longer hit: [ 148.872683] xe 0000:03:00.0: [drm] Engine memory cat error: guc_id=0 [ 148.872701] xe 0000:03:00.0: [drm] Engine memory cat error: guc_id=0 [ 148.875108] WARNING: CPU: 0 PID: 953 at drivers/gpu/drm/xe/xe_guc_submit.c:797 However to keep things simple, so we don't have to deal with 64K TLB hints, just move the scratch page into system memory on platforms that require 64K VRAM pages. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/ppgtt: clear the scratch pageMatthew Auld1-6/+8
We need to ensure we don't leak the contents to userspace. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2023-12-19drm/xe/bo: reduce xe_bo_create_pin_map() restrictionsMatthew Auld1-15/+20
On DGFX this blows up if can call this with a system memory object: XE_BUG_ON(!mem_type_is_vram(place->mem_type) && place->mem_type != XE_PL_STOLEN); If we consider dpt it looks like we can already in theory hit this, if we run out of vram and stolen vram. It at least seems reasonable to allow calling this on any object which supports CPU access. Note this also changes the behaviour with stolen VRAM and suspend, such that we no longer attempt to migrate stolen objects into system memory. However nothing in stolen should ever need to be restored (same on integrated), so should be fine. Also on small-bar systems the stolen portion is pretty much always non-CPU accessible, and currently pinned objects use plain memcpy when being moved, which doesn't play nicely. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>