Age | Commit message | Author | Files | Lines
2023-12-21 | drm/xe/pm: Add vram_d3cold_threshold for d3cold capable device | Anshuman Gupta | 1 | -2/+5
Do not register vram_d3cold_threshold device sysfs universally for each gfx device, only register sysfs and set the threshold value for d3cold capable devices. Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Anshuman Gupta <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://lore.kernel.org/all/[email protected]/ Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add Wa_14015150844 for DG2 and Xe_LPG | Matt Roper | 2 | -0/+12
The workaround database tells us to set this bit, even though the bspec indicates the bit doesn't exist on these platforms. Since this is a write-only register, we also can't read back its value to verify whether it's actually working or not. For now we'll trust that the workaround database knows what it's talking about; if not, the hardware will just ignore the attempt to write to a non-existent bit and it shouldn't cause any problems. Reviewed-by: Matt Atwood <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: don't warn for bogus pagefaults | Matthew Auld | 2 | -3/+3
This appears to be easily user triggerable so warning is perhaps too much. Rather just make it debug print. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/534 Signed-off-by: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Implement HW workaround 14016763929 | Oak Zeng | 3 | -5/+12
To work around a HW bug on DG2, the driver is required to map the whole ppgtt virtual address space before GPU workload submission. Thus set the XE_VM_FLAG_SCRATCH_PAGE flag during vm create so the whole address space is mapped to point to the scratch page. v1: - Move the workaround implementation from xe_vm_create to xe_vm_create_ioctl - Brian - Reorder error checking in xe_vm_create_ioctl - Jose - Implement WA only for DG2-G10 and DG2-G12 Signed-off-by: Oak Zeng <[email protected]> Reviewed-by: Brian Welty <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Update ARL-S DevIDs to the latest BSpec | Lucas De Marchi | 1 | -2/+1
BSpec changed with regard to the DevIDs for ARL-S. Update the define accordingly. Bspec: 55420 Reviewed-by: Niranjana Vishwanathapura <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Set max pte size when skipping rebinds | Matthew Brost | 1 | -1/+18
When a rebind is skipped, we must set the max pte size of the newly created vma to the value of the old vma, as we do not pte walk for the new vma. Without this, future rebinds may be incorrectly skipped due to the wrong max pte size. Null binds are more likely to expose this bug as larger ptes are more frequently used compared to normal bindings. Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Testcase: dEQP-VK.sparse_resources.buffer.ssbo.sparse_residency.buffer_size_2_24 Reported-by: Paulo Zanoni <[email protected]> Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds") Reference: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23045 Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/guc_submit: prevent repeated unregister | Matthew Auld | 1 | -2/+12
It seems that various things can trigger the lr cleanup worker, including CAT error, engine reset and destroying the actual engine, so it seems plausible to end up triggering the worker more than once in some cases. If that does happen we can race with an ongoing engine deregister before it has completed, thus triggering it again and also changing the state back into pending_disable. Checking if the engine has been marked as destroyed looks like it should prevent this. Signed-off-by: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
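The guard described in the repeated-unregister fix above can be sketched as follows. This is a simplified, single-threaded illustration rather than the driver's code: `struct queue_state`, `lr_cleanup()` and the `deregisters` counter are hypothetical names, and a real implementation would need atomic state updates or locking.

```c
#include <stdbool.h>

/* Hypothetical, simplified queue state; the real driver tracks more. */
struct queue_state {
	bool destroyed;            /* set once deregistration has been started */
	bool pending_disable;      /* a disable/deregister is in flight */
	unsigned int deregisters;  /* how many times a deregister was issued */
};

/*
 * Stand-in for the cleanup worker body.
 * Returns true if this call issued a deregister, false if it was skipped.
 */
static bool lr_cleanup(struct queue_state *q)
{
	if (q->destroyed)
		return false;  /* teardown already started: do not deregister again */

	q->destroyed = true;
	q->pending_disable = true;
	q->deregisters++;
	return true;
}
```

With this check, a second trigger of the worker neither issues another deregister nor moves the state back into pending_disable.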
2023-12-21 | drm/xe: Fix error path in xe_guc_pc_start() | Lucas De Marchi | 1 | -1/+2
If the forcewake failed, put xe_device_mem_access. Reviewed-by: Matthew Brost <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Fix error path in xe_guc_pc_gucrc_disable() | Lucas De Marchi | 1 | -4/+6
Make sure to always call xe_device_mem_access_put(), even on error. Reviewed-by: Matthew Brost <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
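The pattern behind the two error-path fixes above (always dropping the mem_access reference, even when a later step fails) is commonly written with a single exit label. A minimal userspace sketch, assuming hypothetical names (`struct fake_dev`, `mem_access_get()`, `mem_access_put()`, `gucrc_disable_sketch()`); the `forcewake_ret` parameter merely stands in for the result of a call that may fail:

```c
#include <errno.h>

/* Hypothetical device with a reference counter, for illustration only. */
struct fake_dev {
	int mem_access_refs;
};

static void mem_access_get(struct fake_dev *d) { d->mem_access_refs++; }
static void mem_access_put(struct fake_dev *d) { d->mem_access_refs--; }

/* Every return path goes through 'out', so the reference is always dropped. */
static int gucrc_disable_sketch(struct fake_dev *d, int forcewake_ret)
{
	int ret;

	mem_access_get(d);

	ret = forcewake_ret;  /* stands in for a forcewake request that may fail */
	if (ret)
		goto out;

	/* ... hardware programming would happen here ... */

out:
	mem_access_put(d);
	return ret;
}
```

Routing both the success and the failure case through the same label keeps the get/put pair balanced without duplicating the cleanup call.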
2023-12-21 | drm/xe: Add min/max cap for engine scheduler properties | Tejas Upadhyay | 7 | -6/+509
Add sysfs entries for the min, max, and defaults for each of engine scheduler controls for every hardware engine class. Non-elevated user IOCTLs to set these controls must be within the min-max ranges of the sysfs entries, elevated user can set these controls to any value. However, introduced compile time CONFIG min-max values which restricts elevated user to be in compile time min-max range if at all sysfs min/max are violated. Sysfs entries examples are, DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/ccs/.defaults/ job_timeout_max job_timeout_ms preempt_timeout_min timeslice_duration_max timeslice_duration_us job_timeout_min preempt_timeout_max preempt_timeout_us timeslice_duration_min DUT# cat /sys/class/drm/card1/device/tileN/gtN/engines/ccs/ .defaults/ job_timeout_min preempt_timeout_max preempt_timeout_us timeslice_duration_min job_timeout_max job_timeout_ms preempt_timeout_min timeslice_duration_max timeslice_duration_us V12: - Rebase V11: - Make engine_get_prop_minmax and enforce_sched_limit static - Matt - use enum in place of string in engine_get_prop_minmax - Matt - no need to use enforce_sched_limit or no need to filter min/max per user type in sysfs - Matt V10: - Add kernel doc for non-static func - Make helper to get min/max for range validation - Matt - Filter min/max per user type V9 : - Rebase to use s/xe_engine/xe_hw_engine/ - Matt V8 : - fix enforce_sched_limit and avoid code duplication - Niranjana - Make sure min < max - Niranjana V7 : - Rebase to replace hw engine with eclass interface - return EINVAL in place of EPERM - Use some APIs to avoid code duplication V6 : - Rebase changes to reflect per engine class props interface - MattB - Use #if ENABLED - MattB - Remove MAX_SCHED_TIMEOUT check as range validation is enough V5 : - Rebase to resolve conflicts - CI V4 : - Rebase - Update commit to reflect tile addition - Use XE_HW macro directly as they are already filtered for CONFIG checks - Niranjana - Add CONFIG for enable/disable 
min/max limitation on elevated user. Default is enable - Matt/Joonas V3 : - Resolve CI hooks warning for kernel-doc V2 : - Restrict min/max setting to #define default min/max for elevated user - Himal - Remove unrelated changes from patch - Niranjana Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
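The range check described in the min/max commit above can be sketched roughly as follows. This is a minimal illustration, not the driver's implementation: `struct prop_limits` and `validate_sched_prop()` are hypothetical names, and the sketch assumes the compile-time range encloses the sysfs range. The `-EINVAL` return follows the V7 note in the commit message.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits for one scheduler property of one engine class. */
struct prop_limits {
	uint32_t sysfs_min;   /* adjustable at runtime via sysfs */
	uint32_t sysfs_max;
	uint32_t config_min;  /* fixed at compile time */
	uint32_t config_max;
};

/*
 * Return 0 if 'value' is acceptable, -EINVAL otherwise.
 * Non-elevated callers must stay inside the sysfs range; elevated callers
 * may leave the sysfs range but, in this sketch, not the compile-time range.
 */
static int validate_sched_prop(const struct prop_limits *l, uint32_t value,
			       bool elevated)
{
	uint32_t min = elevated ? l->config_min : l->sysfs_min;
	uint32_t max = elevated ? l->config_max : l->sysfs_max;

	if (value < min || value > max)
		return -EINVAL;
	return 0;
}
```

For example, with a sysfs range of 1000..5000 and a compile-time range of 1..10000, a request of 6000 is rejected for a non-elevated caller and accepted for an elevated one.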
2023-12-21 | drm/xe: Add sysfs for preempt reset timeout | Tejas Upadhyay | 1 | -0/+29
The preemption request and its timeout are used to switch to a higher priority context, or to kill a hung context and reset the hardware engine. The preempt timeout can be adjusted per-engine class using, /sys/class/drm/cardX/device/tileN/gtN/engines/ccs/preempt_timeout_us and can be disabled by setting it to 0. V7: - Rebase V6: - Rebase to use s/xe_engine/xe_hw_engine/ - Matt V5: - Remove timeout validation, not relevant - Niranjana V4: - Rebase to replace hw engine with eclass interface V3: - Rebase to per class engine props interface V2: - Rebase - Update commit message to add tile Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add timeslice duration engine property to sysfs | Tejas Upadhyay | 1 | -0/+30
Timeslicing between multiple contexts is supported via guc scheduling. Add sysfs entry to provide user defined timeslice duration to guc scheduling. The timeslice duration can be adjusted per-engine class using, /sys/class/drm/cardX/device/tileN/gtN/engines/ccs/timeslice_duration_us V8: - Rebase V7: - Rebase to use s/xe_engine/xe_hw_engine/ - Matt V6: - Remove duration validation, not relevant - Niranjana V5: - Rebase to replace hw engine with eclass interface V4: - Rebase to per class engine props interface V3: - Rebase - Update commit message to add tile V2: - Rebase Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add job timeout engine property to sysfs | Tejas Upadhyay | 1 | -24/+62
The time after which a job is removed from the scheduler. Add sysfs entry to provide user defined job timeout to scheduler. The job timeout can be adjusted per-engine class using, /sys/class/drm/cardX/device/tileN/gtN/engines/ccs/job_timeout_ms V8: - Rebase V7: - Rebase to use s/xe_engine/xe_hw_engine/ - Matt V6: - Remove timeout validation, not relevant - Niranjana - Rebase to use common error path V5: - Rebase to use engine class interface instead of hw engine V4: - Rebase to per class engine props interface V3: - Rebase - Update commit message to reflect tile update V2: - Use sysfs_create_files as part of this patch Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add sysfs for default engine scheduler properties | Tejas Upadhyay | 7 | -26/+190
For each HW engine under GT we are adding defaults sysfs entry to list all engine scheduler properties and its default values. So that it will be easier for user to fetch default values of these properties anytime to go back to default. For example, DUT# cat /sys/class/drm/card1/device/tileN/gtN/engines/bcs/.defaults/ job_timeout_ms preempt_timeout_us timeslice_duration_us where, @job_timeout_ms: The time after which a job is removed from the scheduler. @preempt_timeout_us: How long to wait (in microseconds) for a preemption event to occur when submitting a new context. @timeslice_duration_us: Each context is scheduled for execution for the timeslice duration, before switching to the next context. V12: - Add missing drmm_add_action_or_reset and remove sysfs files V11: - Rebase V10: - Remove xe_gt.h inclusion from .h - Matt V9 : - Remove jiffies for job_timeout_ms - Matt V8 : - replace xe_engine with xe_hw_engine - Matt V7 : - Push all errors to one error path at every places - Niranjana - Describe struct member to resolve kernel doc err - CI hooks V6 : - Use engine class interface instead of hw engine in sysfs for better interfacing readability - Niranjana V5 : - Scheduling props should apply per class engine not per hardware engine - Matt - Do not record value of job_timeout_ms if changed based on dma_fence - Matt V4 : - Resolve merge conflicts - CI V3 : - Rearrange code in its own file - Rebase - Update commit message to reflect tile addition V2 : - Use sysfs_create_files in this patch - Niranjana - Handle prototype error for xe_add_engine_defaults - CI hooks - Remove unused member sysfs_hwe - Niranjana Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add sysfs entries for engines under its GT | Tejas Upadhyay | 4 | -0/+174
Add engines sysfs directory under its GT and create a sub directory for each engine class (note it's not per instance) present on the GT. For example, DUT# cat /sys/class/drm/cardX/device/tileN/gtN/engines/ bcs/ ccs/ V9 : - Add missing drmm_add_action_or_reset V8 : - Rebase V7 : - Remove xe_gt.h from .h and include in .c - Matt V6 : - Add kernel doc and arrange file in make file by alphabet - Matt V5 : - replace xe_engine with xe_hw_engine - Matt V4 : - Rebase to resolve conflicts - CI V3 : - Move code in its own file - Rename API name V2 : - Correct class mask logic - Himal - Remove extra parenthesis Reviewed-by: Niranjana Vishwanathapura <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Rename engine to exec_queue | Francois Dugast | 46 | -1680/+1679
Engine was inappropriately used to refer to execution queues and it also created some confusion with hardware engines. Where it applies the exec_queue variable name is changed to q and comments are also updated. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/162 Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Rename xe_engine.[ch] to xe_exec_queue.[ch] | Francois Dugast | 15 | -14/+14
This is a preparation commit for a larger renaming of engine to exec queue. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Fix error paths of __xe_bo_create_locked | Maarten Lankhorst | 1 | -2/+6
ttm_bo_init_reserved() calls the destroy() callback if it fails. Because of this, __xe_bo_create_locked is required to be responsible for freeing the bo even when it's passed in as argument. Additionally, if the placement check fails, the bo was kept alive. Fix it too. Reported-by: Oded Gabbay <[email protected]> Signed-off-by: Maarten Lankhorst <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: remove header variable from parse_g2h_msg | Matthew Brost | 1 | -2/+1
The header variable is unused, remove it. Reviewed-by: Rodrigo Vivi <[email protected]> Suggested-by: Oded Gabbay <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Prefer WARN() over BUG() to avoid crashing the kernel | Francois Dugast | 33 | -219/+218
Replace calls to XE_BUG_ON() with calls to XE_WARN_ON(), which in turn calls WARN() instead of BUG(). BUG() crashes the kernel and should only be used when it is absolutely unavoidable in case of catastrophic and unrecoverable failures, which is not the case here. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/macro: Remove unused constant | Francois Dugast | 1 | -1/+0
Remove XE_EXTRA_DEBUG for cleanup as it is not used. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Add define WQ_HEADER_SIZE | Matthew Brost | 1 | -2/+3
Previously used a magic '+ 3', use define instead. Suggested-by: Oded Gabbay <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Remove ct->fence_context | Matthew Brost | 2 | -3/+0
This is unused, remove it. Suggested-by: Oded Gabbay <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Remove XE_GUC_CT_SELFTEST | Matthew Brost | 4 | -98/+0
XE_GUC_CT_SELFTEST enabled a debugfs entry which ran a very simple selftest ensuring the GuC CT code worked. This was added before the kunit framework was available and before submissions were working too. This test isn't worth porting over to the kunit framework, as if the GuC CT didn't work, literally almost nothing would work, so just remove this. Suggested-by: Oded Gabbay <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/mtl: Reduce Wa_14018575942 scope to the CCS engine | Matt Roper | 1 | -9/+1
The MTL version of Wa_14018575942 has been updated to suggest only applying the register change on the CCS engine. Note that DG2 and PVC have a functionally equivalent workaround with Wa_18018781329; for now that one is still applying to all engines, although we'll keep an eye on it in case it changes to be CCS-specific too. Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Ensure memory eviction on s2idle. | Rodrigo Vivi | 2 | -22/+43
On discrete cards we cannot allow the pci subsystem to skip the regular suspend and we need to unblock the d3cold. Cc: Anshuman Gupta <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Only init runtime PM after all d3cold config is in place. | Rodrigo Vivi | 1 | -1/+3
We can only allow runtime pm suspend after we have configured the d3cold capability and threshold. Cc: Anshuman Gupta <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Fix the runtime_idle call and d3cold.allowed decision. | Rodrigo Vivi | 1 | -2/+2
According to Documentation/power/runtime_pm.txt: int pm_runtime_put(struct device *dev); - decrement the device's usage counter; if the result is 0 then run pm_request_idle(dev) and return its result int pm_runtime_put_autosuspend(struct device *dev); - decrement the device's usage counter; if the result is 0 then run pm_request_autosuspend(dev) and return its result We need to ensure that the idle function is called before suspending so we take the right d3cold.allowed decision and respect the values set on vram_d3cold_threshold sysfs. So we need pm_runtime_put() instead of pm_runtime_put_autosuspend(). Cc: Anshuman Gupta <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Tested-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Move d3cold_allowed decision all together. | Rodrigo Vivi | 3 | -15/+12
And let's use the VRAM threshold to keep d3cold temporarily disabled. With this we have the ability to run D3Cold experiments just by touching the vram_d3cold_threshold sysfs entry. Cc: Anshuman Gupta <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Only set PCI d3cold_allowed when we are really allowing. | Rodrigo Vivi | 1 | -2/+1
First of all it was strange to see: if (allowed) { ... } else { D3COLD_ENABLE } But besides this misalignment, let's also make the pci d3cold_allowed useful to us, so that we know when we are not really allowing d3cold. Cc: Anshuman Gupta <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Introduce fault injection for gt reset | Himal Prasad Ghimiray | 3 | -1/+31
To trigger gt reset failure: echo 100 > /sys/kernel/debug/dri/<cardX>/fail_gt_reset/probability echo 2 > /sys/kernel/debug/dri/<cardX>/fail_gt_reset/times Cc: Rodrigo Vivi <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Himal Prasad Ghimiray <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Notify Userspace when gt reset fails | Himal Prasad Ghimiray | 2 | -0/+29
Send uevent in case of gt reset failure. This notification can be used by a userspace monitoring tool to do the device level reset/reboot when GT reset fails. udevadm can be used to monitor the uevents. v2: - Support only gt failure notification (Rodrigo) v3 - Rectify the comments in header file. v4 - Use pci kobj instead of drm kobj for notification.(Rodrigo) - Cleanup (Badal) v5 - Add tile id and gt id as additional info provided by uevent. - Provide code documentation for the uevent. (Rodrigo) Cc: Aravind Iddamsetty <[email protected]> Cc: Tejas Upadhyay <[email protected]> Cc: Rodrigo Vivi <[email protected]> Reviewed-by: Badal Nilawar <[email protected]> Signed-off-by: Himal Prasad Ghimiray <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Invert mask and val in xe_mmio_wait32. | Rodrigo Vivi | 7 | -24/+17
The order 'offset, mask, val' is more common in other drivers, in particular in i915, so any dev copying a sequence from there could end up with unexpected behavior. Done with coccinelle: @rule1@ expression gt, reg, val, mask, timeout, out, atomic; @@ - xe_mmio_wait32(gt, reg, val, mask, timeout, out, atomic) + xe_mmio_wait32(gt, reg, mask, val, timeout, out, atomic) spatch -sp_file mmio.cocci *.c *.h compat-i915-headers/intel_uncore.h \ --in-place v2: Rebased after changes on xe_guc_mcr usage of xe_mmio_wait32. Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
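To illustrate why the argument order matters for the xe_mmio_wait32 change above, here is a small userspace sketch of a masked register poll using the 'mask, val' order. It is not the driver's implementation: `wait32()`, `read_reg_fn`, `struct fake_reg` and `fake_read()` are hypothetical, and a poll count stands in for the real timeout handling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a register read; a real driver would do MMIO. */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) == val. Returns true on a match, false once
 * 'max_polls' reads have been used up. The last value read is stored in
 * '*out' when 'out' is not NULL.
 */
static bool wait32(read_reg_fn read_reg, void *ctx, uint32_t mask, uint32_t val,
		   unsigned int max_polls, uint32_t *out)
{
	uint32_t r = 0;
	bool ok = false;

	for (unsigned int i = 0; i < max_polls; i++) {
		r = read_reg(ctx);
		if ((r & mask) == val) {
			ok = true;
			break;
		}
	}

	if (out)
		*out = r;
	return ok;
}

/* Fake register for demonstration: replays an array, then repeats the last value. */
struct fake_reg {
	const uint32_t *vals;
	unsigned int n;
	unsigned int pos;
};

static uint32_t fake_read(void *ctx)
{
	struct fake_reg *f = ctx;
	uint32_t v = f->vals[f->pos];

	if (f->pos + 1 < f->n)
		f->pos++;
	return v;
}
```

Because `mask` and `val` have the same type, the compiler cannot flag a swapped call, which is why a consistent order across drivers helps.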
2023-12-21 | drm/xe: Fix an invalid locking wait context bug | Rodrigo Vivi | 1 | -6/+26
We cannot have spin locks around xe_irq_reset, since it will call the intel_display_power_is_enabled() function, and that needs a mutex lock. Hence causing the undesired "[ BUG: Invalid wait context ]" We cannot convert i915's power domain lock to spin lock due to the nested dependency of non-atomic context waits. So, let's move the xe_irq_reset functions from the critical area, while still ensuring that we are protecting the irq.enabled and ensuring the right serialization in the irq handlers. v2: On the first version, I had missed the fact that irq.enabled is checked on the xe/display glue layer, and that i915 display code is actually using the irq spin lock properly. So, this got changed to a version suggested by Matthew Auld. v3: do not use lockdep_assert for display glue. do not save restore irq from inside IRQ or we can get bogus irq restore warnings Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/463 Suggested-by: Matthew Auld <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Sort xe_regs.h | Lucas De Marchi | 1 | -30/+33
Sort it by register address to make it easy to update when needed. v2: Do not create exception for registers with same functionality. Always sort it. Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Carve out top of DSM as reserved | Lucas De Marchi | 2 | -1/+10
Top of DSM contains the WOPCM where kernel driver shouldn't access as it contains data from other HW agents. Carve it out from the stolen memory. On a MTL system, the output now matches the expected values: Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Fix MTL+ stolen memory mapping | Lucas De Marchi | 1 | -2/+13
Based on commit 8d8d062be6b9 ("drm/i915/mtl: Fix MTL stolen memory GGTT mapping"). For stolen on MTL and beyond, the address in the PTE is the offset from DSM base. While at it, update the comments explaining each part of the calculation. Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
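The address calculation mentioned in the stolen memory fix above (the PTE carries the offset from the DSM base) can be sketched as below. This is only an illustration of the arithmetic: `stolen_pte_offset()` and its parameters are hypothetical names, and the bounds check is an addition for the example rather than something the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the address to place in the PTE for a stolen-memory page when the
 * PTE expects an offset from the DSM base.
 * Returns false if 'phys' lies outside [dsm_base, dsm_base + dsm_size).
 */
static bool stolen_pte_offset(uint64_t phys, uint64_t dsm_base,
			      uint64_t dsm_size, uint64_t *offset)
{
	if (phys < dsm_base || phys - dsm_base >= dsm_size)
		return false;

	*offset = phys - dsm_base;
	return true;
}
```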
2023-12-21 | drm/xe: Set PTE_DM bit for stolen on MTL | Lucas De Marchi | 5 | -12/+29
Integrated graphics 1270 and beyond should set the PTE_LM bit in the PTE when it's stolen memory. Add a new function, xe_bo_is_stolen_devmem(), and use it when encoding the PTE. In some places in the spec the PTE bit is called "Local Memory", abbreviated as LM, and in others it's called "Device Memory" (DM). Since we moved away from "Local Memory" and preferred the "vram" terminology, also rename the macros as DM to follow the name of the new function. Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Decouple vram check from xe_bo_addr() | Lucas De Marchi | 6 | -43/+23
The output arg is_vram in xe_bo_addr() is unused by several callers. It's also not what the function is mainly doing. Remove the argument and let the interested callers call xe_bo_is_vram(). Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Remove vma arg from xe_pte_encode() | Lucas De Marchi | 4 | -48/+13
All the callers pass a NULL vma, so the buffer is always the BO. Remove the argument and the side effects of dealing with it. Reviewed-by: Matthew Brost <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: fix mcr semaphore locking for MTL | Daniele Ceraolo Spurio | 1 | -3/+4
In commit 81593af6c88d ("drm/xe: Convert xe_mmio_wait32 to us so we can stop using wait_for_us.") the mcr semaphore register read was accidentally switched from waiting for the register to go to 1 to waiting for the register to go to 0, so we need to flip it back. Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Fix checking for unset value | Lucas De Marchi | 1 | -1/+3
Commit 37430402618d ("drm/xe: NULL binding implementation") introduced the NULL binding implementation, but left a case in which the out value is_vram is not set and the caller will use whatever was on stack. Eventually the is_vram out could be removed, but this should at least fix the current bug. Fixes: 37430402618d ("drm/xe: NULL binding implementation") Reviewed-by: Matt Roper <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lucas De Marchi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/engine: add missing rpm for bind engines | Matthew Auld | 2 | -0/+20
Bind engines need to use the migration vm, however we don't have any rpm for such a vm, otherwise the kernel would prevent rpm suspend-resume. There are two issues here, first is the actual engine create which needs to touch the lrc, but since that is in VRAM we trigger loads of missing mem_access asserts. The second issue is when destroying the actual engine, which requires GuC CT to deregister the context. v2 (Rodrigo): - Just use ENGINE_FLAG_VM as the indicator that we need to hold an rpm ref. This also handles the case in xe_vm_create() where we create default bind engines. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/499 Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/504 Cc: Rodrigo Vivi <[email protected]> Cc: Matthew Brost <[email protected]> Signed-off-by: Matthew Auld <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Signal out-syncs on VM binds if no operations | Matthew Brost | 1 | -0/+2
If no operations are generated for VM binds the out-syncs must still be signaled. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Always use xe_vm_queue_rebind_worker helper | Matthew Brost | 2 | -9/+8
Do not queue the rebind worker directly, rather use the helper xe_vm_queue_rebind_worker. This ensures we use the correct work queue. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe: Invert guc vs execlists parameters and info. | Rodrigo Vivi | 6 | -14/+9
The module parameter should reflect the name of the optional, experimental and unsafe option, rather than the default one. Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]>
2023-12-21 | drm/xe/uapi: Remove XE_QUERY_CONFIG_FLAGS_USE_GUC | Rodrigo Vivi | 2 | -4/+0
This config is the only real one. If execlist remains in the code it will forever be experimental and we shouldn't maintain an uapi like that for that experimental piece of code that should never be used by real users. Signed-off-by: Rodrigo Vivi <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]>
2023-12-21 | drm/xe: fully turn on small-bar support | Matthew Auld | 1 | -9/+2
This allows vram_size > io_size, instead of just clamping the vram size to the BAR size, now that the driver supports it. Signed-off-by: Matthew Auld <[email protected]> Cc: Gwan-gyeong Mun <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: Michael J. Ruhl <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]> Reviewed-by: Gwan-gyeong Mun <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/uapi: add the userspace bits for small-bar | Matthew Auld | 5 | -5/+85
Mostly the same as i915. We add a new hint for userspace to force an object into the mappable part of vram. We also need to tell userspace how large the mappable part is. In Vulkan for example, there will be two vram heaps for small-bar systems. And here the size of each heap needs to be known. Likewise the used/avail tracking needs to account for the mappable part. We also limit the available tracking going forward, such that we limit to privileged users only, since these values are system wide and are technically considered an info leak. v2 (Maarten): - s/NEEDS_CPU_ACCESS/NEEDS_VISIBLE_VRAM/ in the uapi. We also no longer require smem as an extra placement. This is more flexible, and lets us use this for clear-color surfaces, since we need CPU access there but we don't want to attach smem, since that effectively disables CCS from kernel pov. - Reject clear-color CCS buffers where NEEDS_VISIBLE_VRAM is not set, instead of migrating it behind the scenes. v3 (José): - Split the changes that limit the accounting for perfmon_capable() into a separate patch. - Use XE_BO_CREATE_VRAM_MASK. v4 (Gwan-gyeong Mun): - Add some kernel-doc for the query bits. v5: - One small kernel-doc correction. The cpu_visible_size and corresponding used tracking are always zero for non XE_MEM_REGION_CLASS_VRAM. v6: - Without perfmon_capable() it likely makes more sense to report as zero, instead of reporting as used == total size. This should give similar behaviour as i915 which rather tracks free instead of used. - Only enforce NEEDS_VISIBLE_VRAM on rc_ccs_cc_plane surfaces when the device is actually small-bar. 
Testcase: igt/tests/xe_query Testcase: igt/tests/xe_mmap@small-bar Signed-off-by: Matthew Auld <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Gwan-gyeong Mun <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: José Roberto de Souza <[email protected]> Cc: Filip Hazubski <[email protected]> Cc: Carl Zhang <[email protected]> Cc: Effie Yu <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Reviewed-by: Gwan-gyeong Mun <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
2023-12-21 | drm/xe/bo: support tiered vram allocation for small-bar | Matthew Auld | 4 | -16/+40
Add the new flag XE_BO_NEEDS_CPU_ACCESS, to force allocating in the mappable part of vram. If no flag is specified we do a topdown allocation, to limit the chances of stealing the precious mappable part, if we don't need it. If this is a full-bar system, then this all gets nooped. For kernel users, it looks like xe_bo_create_pin_map() is the central place which users should call if they want CPU access to the object, so add the flag there. We still need to plumb this through for userspace allocations. Also it looks like page-tables are using pin_map(), which is less than ideal. If we can already use the GPU to do page-table management, then maybe we should just force that for small-bar. Signed-off-by: Matthew Auld <[email protected]> Cc: Gwan-gyeong Mun <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Lucas De Marchi <[email protected]> Reviewed-by: Maarten Lankhorst <[email protected]> Reviewed-by: Gwan-gyeong Mun <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]>
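The placement policy described in the tiered vram allocation commit above can be sketched as follows. This is a simplified model and not the driver's allocator: `struct vram_region`, `struct alloc_hint` and `pick_placement()` are hypothetical names, and the sketch assumes the CPU-mappable part is the lowest `io_size` bytes of VRAM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a VRAM region. */
struct vram_region {
	uint64_t vram_size;  /* total VRAM */
	uint64_t io_size;    /* CPU-mappable part (assumed to start at offset 0) */
};

/* Hypothetical placement hint handed to an allocator. */
struct alloc_hint {
	uint64_t range_start;
	uint64_t range_end;  /* exclusive */
	bool topdown;        /* prefer the highest free addresses */
};

static struct alloc_hint pick_placement(const struct vram_region *r,
					bool needs_cpu_access)
{
	struct alloc_hint h = { 0, r->vram_size, false };

	/* Full-bar: all of VRAM is mappable, so no special handling is needed. */
	if (r->io_size >= r->vram_size)
		return h;

	if (needs_cpu_access)
		h.range_end = r->io_size;  /* restrict to the mappable part */
	else
		h.topdown = true;          /* keep away from the mappable part */

	return h;
}
```

Allocating top-down by default keeps objects that never need CPU access away from the scarce mappable window on small-bar systems.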