aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-08-20drm/xe: Use topology to determine page fault queue sizeStuart Summers2-14/+49
Currently the page fault queue size is hard coded. However the hardware supports faulting for each EU and each CS. For some applications running on hardware with a large number of EUs and CSs, this can result in an overflow of the page fault queue. Add a small calculation to determine the page fault queue size based on the number of EUs and CSs in the platform as detmined by fuses. Signed-off-by: Stuart Summers <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/24d582a3b48c97793b8b6a402f34b4b469471636.1723862633.git.stuart.summers@intel.com
2024-08-20drm/xe: Fix missing workqueue destroy in xe_gt_pagefaultStuart Summers1-2/+16
On driver reload we never free up the memory for the pagefault and access counter workqueues. Add those destroy calls here. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Stuart Summers <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/c9a951505271dc3a7aee76de7656679f69c11518.1723862633.git.stuart.summers@intel.com
2024-08-19drm/xe/lnl: Offload system clear page activity to GPUNirmoy Das3-2/+38
On LNL because of flat CCS, driver creates migrates job to clear CCS meta data. Extend that to also clear system pages using GPU. Inform TTM to allocate pages without __GFP_ZERO to avoid double page clearing by clearing out TTM_TT_FLAG_ZERO_ALLOC flag and set TTM_TT_FLAG_CLEARED_ON_FREE while freeing to skip ttm pool's clear on free as XE now takes care of clearing pages. If a bo is in system placement such as BO created with DRM_XE_GEM_CREATE_FLAG_DEFER_BACKING and there is a cpu map then for such BO gpu clear will be avoided as there is no dma mapping for such BO at that moment to create migration jobs. Tested this patch api_overhead_benchmark_l0 from https://github.com/intel/compute-benchmarks Without the patch: api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation: UsmMemoryAllocation(api=l0 type=Host size=4KB) 84.206 us UsmMemoryAllocation(api=l0 type=Host size=1GB) 105775.56 us erf tool top 5 entries: 71.44% api_overhead_be [kernel.kallsyms] [k] clear_page_erms 6.34% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page 2.24% api_overhead_be [kernel.kallsyms] [k] cpa_flush 2.15% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable 1.94% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res With the patch: api_overhead_benchmark_l0 --testFilter=UsmMemoryAllocation: UsmMemoryAllocation(api=l0 type=Host size=4KB) 79.439 us UsmMemoryAllocation(api=l0 type=Host size=1GB) 98677.75 us Perf tool top 5 entries: 11.16% api_overhead_be [kernel.kallsyms] [k] __pageblock_pfn_to_page 7.85% api_overhead_be [kernel.kallsyms] [k] cpa_flush 7.59% api_overhead_be [kernel.kallsyms] [k] find_next_iomem_res 7.24% api_overhead_be [kernel.kallsyms] [k] pages_are_mergeable 5.53% api_overhead_be [kernel.kallsyms] [k] lookup_address_in_pgd_attr Without this patch clear_page_erms() dominates execution time which is also not pipelined with migration jobs. With this patch page clearing will get pipelined with migration job and will free CPU for more work. v2: Handle regression on dgfx(Himal) Update commit message as no ttm API changes needed. v3: Fix Kunit test. v4: handle data leak on cpu mmap(Thomas) v5: s/gpu_page_clear/gpu_page_clear_sys and move setting it to xe_ttm_sys_mgr_init() and other nits (Matt Auld) v6: Disable it when init_on_alloc and/or init_on_free is active(Matt) Use compute-benchmarks as reporter used it to report this allocation latency issue also a proper test application than mime. In v5, the test showed significant reduction in alloc latency but that is not the case any more, I think this was mostly because previous test was done on IFWI which had low mem BW from CPU. Cc: Himal Prasad Ghimiray <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Thomas Hellström <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-19drm/ttm: Add a flag to allow drivers to skip clear-on-freeNirmoy Das2-8/+16
Add TTM_TT_FLAG_CLEARED_ON_FREE, which DRM drivers can set before releasing backing stores if they want to skip clear-on-free. Cc: Matthew Auld <[email protected]> Cc: Thomas Hellström <[email protected]> Suggested-by: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-19drm/xe/oa: Use vma_pages() helper function in xe_oa_mmap()Thorsten Blum1-2/+1
Use the vma_pages() helper function and remove the following Coccinelle/coccicheck warning reported by vma_pages.cocci: WARNING: Consider using vma_pages helper on vma Reviewed-by: Ashutosh Dixit <[email protected]> Signed-off-by: Thorsten Blum <[email protected]> Signed-off-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-19drm/xe/display: Make display suspend/resume work on discreteMaarten Lankhorst2-5/+29
We should unpin before evicting all memory, and repin after GT resume. This way, we preserve the contents of the framebuffers, and won't hang on resume due to migration engine not being restored yet. Signed-off-by: Maarten Lankhorst <[email protected]> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: [email protected] # v6.8+ Reviewed-by: Uma Shankar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Maarten Lankhorst,,, <[email protected]>
2024-08-19drm/xe/display: Match i915 driver suspend/resume sequences betterMaarten Lankhorst1-5/+14
Suspend fbdev sooner, and disable user access before suspending to prevent some races. I've noticed this when comparing xe suspend to i915's. Matches the following commits from i915: 24b412b1bfeb ("drm/i915: Disable intel HPD poll after DRM poll init/enable") 1ef28d86bea9 ("drm/i915: Suspend the framebuffer console earlier during system suspend") bd738d859e71 ("drm/i915: Prevent modesets during driver init/shutdown") Thanks to Imre for pointing me to those commits. Driver shutdown is currently missing, but I have some idea how to implement it next. Signed-off-by: Maarten Lankhorst <[email protected]> Cc: Imre Deak <[email protected]> Reviewed-by: Uma Shankar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Maarten Lankhorst,,, <[email protected]>
2024-08-19drm/xe: prevent UAF around preempt fenceMatthew Auld4-4/+4
The fence lock is part of the queue, therefore in the current design anything locking the fence should then also hold a ref to the queue to prevent the queue from being freed. However, currently it looks like we signal the fence and then drop the queue ref, but if something is waiting on the fence, the waiter is kicked to wake up at some later point, where upon waking up it first grabs the lock before checking the fence state. But if we have already dropped the queue ref, then the lock might already be freed as part of the queue, leading to uaf. To prevent this, move the fence lock into the fence itself so we don't run into lifetime issues. Alternative might be to have device level lock, or only release the queue in the fence release callback, however that might require pushing to another worker to avoid locking issues. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2454 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2342 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2020 Signed-off-by: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Cc: <[email protected]> # v6.8+ Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-19drm/xe: Remove redundant param from xe_bo_create_userNirmoy Das5-14/+12
BO from xe_bo_create_user() will always be of type, ttm_bo_type_device. So remove that redundant parameter. Cc: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Thomas Hellström <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-17drm/xe/device: Remove unused xe_device::usm::num_vm_in_*Francois Dugast3-26/+0
Those counters were used to keep track of the numbers VMs in fault mode and in non-fault mode, to determine if the whole device was in fault mode or not. This is no longer needed so remove those variables and their usages. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/vm: Remove restriction that all VMs must be faulting if one isFrancois Dugast1-8/+0
With this restriction, all VMs on the device must be faulting VMs if there is already one faulting VM, in which case the device is considered in fault mode. This prevents for example an application from running 3D jobs for the compositor while submitting a SVM compute job on the same device. Now that mutual exclusion of faulting LR jobs and dma fence jobs is ensured on the hw engine group, remove this restriction to allow running faulting and non-faulting VMs on the same device. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/exec: Switch hw engine group execution mode upon job submissionFrancois Dugast3-2/+84
If the job about to be submitted is a dma-fence job, update the current execution mode of the hw engine group. This triggers an immediate suspend of the exec queues running faulting long-running jobs. If the job about to be submitted is a long-running job, kick a new worker used to resume the exec queues running faulting long-running jobs once the dma-fence jobs have completed. v2: Kick the resume worker from exec IOCTL, switch to unordered workqueue, destroy it after use (Matt Brost) v3: Do not resume if no exec queue was suspended (Matt Brost) v4: Squash commits (Matt Brost) v5: Do not kick the worker when xe_vm_in_preempt_fence_mode (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/hw_engine_group: Ensure safe transition between execution modesFrancois Dugast2-0/+80
Provide a way to safely transition execution modes of the hw engine group ahead of the actual execution. When necessary, either wait for running jobs to complete or preempt them, thus ensuring mutual exclusion between execution modes. Unlike a mutex, the rw_semaphore used in this context allows multiple submissions in the same mode. v2: Use lockdep_assert_held_write, add annotations (Matt Brost) v3: Fix kernel doc, remove redundant code (Matt Brost) v4: Now that xe_hw_engine_group_suspend_faulting_lr_jobs can fail, propagate the error to the caller (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/hw_engine_group: Add helper to wait for dma fence jobsFrancois Dugast1-0/+33
This is a required feature for faulting long running jobs not to be submitted while dma fence jobs are running on the hw engine group. v2: Switch to lockdep_assert_held_write in worker, get a proper reference for the last fence (Matt Brost) v3: Directly call dma_fence_put with the fence ref (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/exec_queue: Prepare last fence for hw engine group resume contextFrancois Dugast2-2/+33
Ensure we can safely take a ref of the exec queue's last fence from the context of resuming jobs from the hw engine group. The locking requirements differ from the general case, hence the introduction of this new function. v2: Add kernel doc, rework the code to prevent code duplication v3: Fix kernel doc, remove now unnecessary lockdep variants (Matt Brost) v4: Remove new put function (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/exec_queue: Remove duplicated codeFrancois Dugast1-4/+1
This code section is the same as the body of xe_exec_queue_last_fence_put_unlocked() so call the function instead and remove duplicated code to make maintenance easier. Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/hw_engine_group: Add helper to suspend faulting LR jobsFrancois Dugast1-0/+36
This is a required feature for dma fence jobs to preempt faulting long running jobs in order to ensure mutual exclusion on a given hw engine group. v2: Pipeline calls to suspend(q) and suspend_wait(q) to improve efficiency, switch to lockdep_assert_held_write (Matt Brost) v3: Return error on suspend_wait failure to propagate on the call stack up to IOCTL (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17'drm/xe/hw_engine_group: Register hw engine group's exec queuesFrancois Dugast5-0/+86
Add helpers to safely add and delete the exec queues attached to a hw engine group, and make use them at the time of creation and destruction of the exec queues. Keeping track of them is required to control the execution mode of the hw engine group. v2: Improve error handling and robustness, suspend exec queues created in fault mode if group in dma-fence mode, init queue link (Matt Brost) v3: Delete queue from hw engine group when it is destroyed by the user, also clean up at the time of closing the file in case the user did not destroy the queue v4: Use correct list when checking if empty, do not add the queue if VM is in xe_vm_in_preempt_fence_mode (Matt Brost) v5: Remove unrelated newline, add checks and asserts for group, unwind on suspend failure (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/guc_submit: Make suspend_wait interruptibleFrancois Dugast1-6/+6
Rely on wait_event_interruptible_timeout() to put the process to sleep with TASK_INTERRUPTIBLE. It allows using this function in interruptible context. v2: Propagate error on wait_event_interruptible_timeout (Matt Brost) Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe/hw_engine_group: Introduce xe_hw_engine_groupFrancois Dugast6-0/+176
A xe_hw_engine_group is a group of hw engines. Two hw engines belong to the same xe_hw_engine_group if one hw engine cannot make progress while the other is stuck on a page fault. Typically, hw engines of the same group share some resources such as EUs, but this really depends on the hardware configuration of the platforms. The simple engines partitioning proposed here might be too conservative but is intended to work for existing platforms. It can be optimized later if more sets of independent engines are identified. The hw engine groups are intended to be used in the context of faulting long-running jobs submissions. v2: Move to own files, improve error handling (Matt Brost) v3: Fix build issue reported by CI, improve commit message (Matt Roper) v4: Fix kernel doc v5: Add switch case for XE_ENGINE_CLASS_OTHER Signed-off-by: Francois Dugast <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-17drm/xe: Use reserved copy engine for user binds on faulting devicesMatthew Brost4-68/+70
User binds map to engines with can fault, faults depend on user binds completion, thus we can deadlock. Avoid this by using reserved copy engine for user binds on faulting devices. While we are here, normalize bind queue creation with a helper. v2: - Pass in extensions to bind queue creation (CI) v3: - s/resevered/reserved (Lucas) - Fix NULL hwe check (Jonathan) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: Thomas Hellström <[email protected]> Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-16drm/xe/mcr: Try to derive dss_per_grp from hwconfig attributesMatt Roper4-4/+83
When steering MCR register ranges of type "DSS," the group_id and instance_id values are calculated by dividing the DSS pool according to the size of a gslice or cslice, depending on the platform. These values haven't changed much on past platforms, so we've been able to hardcode the proper divisor so far. However the layout may not be so fixed on future platforms so the proper, future-proof way to determine this is by using some of the attributes from the GuC's hwconfig table. The hwconfig has two attributes reflecting the architectural maximum slice and subslice counts (i.e., before any fusing is considered) that can be used for the purposes of calculating MCR steering targets. If the hwconfig is lacking the necessary values (which should only be possible on older platforms before these attributes were added), we can still fall back to the old hardcoded values. Going forward the hwconfig is expected to always provide the information we need on newer platforms, and any failure to do so will be considered a bug in the firmware that will prevent us from switching to the buggy firmware release. It's worth noting that over time GuC's hwconfig has provided a couple different keys with similar-sounding descriptions. For our purposes here, we only trust the newer key "70" which has supplanted the similarly-named key "2" that existed on older platforms. Bspec: 73210 Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-16drm/xe: Add debugfs to dump GuC's hwconfigMatt Roper3-0/+70
Although the query uapi is the official way to get at the GuC's hwconfig table contents, it's still useful to have a quick debugfs interface to dump the table in a human-readable format while debugging the driver. Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Reviewed-by: Jagmeet Randhawa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-16Merge drm/drm-next into drm-xe-nextLucas De Marchi12299-187054/+550185
Get drm-xe-next on v6.11-rc2 and synchronized with drm-intel-next for the display side. This resolves the current conflict for the enable_display module parameter and allows further pending refactors. Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-16drm/xe: Use for_each_remote_tile rather than manual checkMatthew Brost1-3/+2
Replace for_each_tile plus a check against primary tile with for_each_remote_tile in tiles_fini. The latter macro does this for us. Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Reviewed-by: Himal Prasad Ghimiray <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-16drm/xe/uc: Use devm to register cleanup that includes exec_queuesDaniele Ceraolo Spurio2-4/+4
Exec_queue cleanup requires HW access, so we need to use devm instead of drmm for it. Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: John Harrison <[email protected]> Cc: Alan Previn <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-16drm/xe/uc: Use managed bo for HuC and GSC objectsDaniele Ceraolo Spurio3-53/+14
Drmm actions are not the right ones to clean up BOs and we should use devm instead. However, we can also instead just allocate the objects using the managed_bo function, which will internally register the correct cleanup call and therefore allows us to simplify the code. While at it, switch to drmm_kzalloc for the GSC proxy allocation to further simplify the cleanup. Cc: John Harrison <[email protected]> Cc: Alan Previn <[email protected]> Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-16Merge tag 'drm-intel-next-2024-08-13' of ↵Dave Airlie78-1406/+1817
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Type-C programming fix for MTL+ (Gustavo) - Fix display clock workaround (Mitul) - Fix DP LTTPR detection (Imre) - Calculate vblank delay more accurately (Ville) - Make vrr_{enabling,disabling}() usable outside intel_display.c (Ville) - FBC clean-up (Ville) - DP link-training fixes and clean-up (Imre) - Make I2C terminology more inclusive (Easwar) - Make read-only array bw_gbps static const (Colin) - HDCP fixes and improvements (Suraj) - DP VSC SDP fixes and clean-ups (Suraj, Mitul) - Fix opregion leak in Xe code (Lucas) - Fix possible int overflow in skl_ddi_calculate_wrpll (Nikita)] - General display clean-ups and conversion towards intel_display (Jani) - On DP MST, Enable LT fallback for UHBR<->non-UHBR rates (Imre) - Add VRR condition for DPKGC Enablement (Suraj) - Use backlight power constants (Zimmermann) - Correct dual pps handling for MTL_PCH+ (Dnyaneshwar) - Dump DSC HW state (Imre) - Replace double blank with single blank after comma (Andi) - Read display register timeout on BMG (Mitul) Signed-off-by: Dave Airlie <[email protected]> From: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-15drm/xe: Fix tile fini sequenceMatthew Brost1-1/+4
Only set tile->mmio.regs to NULL if not the root tile in tile_fini. The root tile mmio regs is setup ealier in MMIO init thus it should be set to NULL in mmio_fini. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Matthew Brost <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Reviewed-by: Himal Prasad Ghimiray <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-15drm/xe: use devm instead of drmm for managed boDaniele Ceraolo Spurio1-3/+3
The BO cleanup touches the GGTT and therefore requires the HW to be available, so we need to use devm instead of drmm. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1160 Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Lucas De Marchi <[email protected]> Cc: Matthew Auld <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-15drm/xe: Make exec_queue_kill safe to call twiceDaniele Ceraolo Spurio2-2/+15
An upcoming PXP patch will kill queues at runtime when a PXP invalidation event occurs, so we need exec_queue_kill to be safe to call multiple times. v2: Add documentation (Matt B) Signed-off-by: Daniele Ceraolo Spurio <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-15drm/xe: fix engine_class bounds check againMatthew Auld1-1/+1
This was fixed in commit b7dce525c4fc ("drm/xe/queue: fix engine_class bounds check"), but then re-introduced in commit 6f20fc09936e ("drm/xe: Move and export xe_hw_engine lookup.") which should only be simple code movement of the existing function. Fixes: 6f20fc09936e ("drm/xe: Move and export xe_hw_engine lookup.") Signed-off-by: Matthew Auld <[email protected]> Cc: Dominik Grzegorzek <[email protected]> Cc: Mika Kuoppala <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-14drm/xe: Define STATELESS_COMPRESSION_CTRL as mcr registerTejas Upadhyay1-1/+1
Register STATELESS_COMPRESSION_CTRL should be considered mcr register which should write to all slices as per documentation. Bspec: 71185 Fixes: ecabb5e6ce54 ("drm/xe/xe2: Add performance turning changes") Signed-off-by: Tejas Upadhyay <[email protected]> Reviewed-by: Shekhar Chauhan <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-14drm/xe: Write all slices if its mcr registerTejas Upadhyay2-5/+5
Register GAMREQSTRM_CTRL should be considered mcr register which should write to all slices as per documentation. Bspec: 71185 Fixes: 01570b446939 ("drm/xe/bmg: implement Wa_16023588340") Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-14drm/xe: Move enable host l2 VRAM post MCR initTejas Upadhyay1-1/+1
xe_gt_enable_host_l2_vram() is reading the XE2_GAMREQSTRM_CTRL register that is currently missing the MCR annotation. However, just adding the annotation doesn't work as this function is called before MCR handling is initialized in xe_gt_mcr_init(). xe_gt_enable_host_l2_vram() is used to implement WA 16023588340 that needs to be done as early as possible during initialization in order to be effective since the MMIO writes impact it. In the failure scenario, driver would simply not be able to bind successfully. Moving xe_gt_enable_host_l2_vram() later, after MCR initialization is done, only incurs a few additional HW accesses, particularly when loading GuC for hwconfig. Binding/unbinding the driver 100 times in BMG still works so it should be ok to start handling the WA a little bit later. This is sufficient to allow adding the MCR annotation to XE2_GAMREQSTRM_CTRL. Cc: Lucas De Marchi <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Reviewed-by: Matt Roper <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-13drm/xe: Rename enable_display module paramLucas De Marchi6-34/+41
The different approach used by xe regarding the initialization of display HW has been proved a great addition for early driver bring up: core xe can be tested without having all the bits sorted out on the display side. On the other hand, the approach exposed by i915-display is to *actively* disable the display by programming it if needed, i.e. if it was left enabled by firmware. It also has its use to make sure the HW is actually disabled and not wasting power. However having both the way it is in xe doesn't expose a good interface wrt module params. From modinfo: disable_display:Disable display (default: false) (bool) enable_display:Enable display (bool) Rename enable_display to probe_display to try to convey the message that the HW is being touched and improve the module param description. To avoid confusion, the enable_display is renamed everywhere, not only in the module param. New description for the parameters: disable_display:Disable display (default: false) (bool) probe_display:Probe display HW, otherwise it's left untouched (default: true) (bool) Reviewed-by: Matt Roper <[email protected]> Acked-by: Jani Nikula <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-13drm/xe: Remove unused xe parameterHimal Prasad Ghimiray1-11/+6
Remove the xe parameter from the pde_encode_pat_index and pte_encode_pat_index functions, as it is no longer used. Signed-off-by: Himal Prasad Ghimiray <[email protected]> Reviewed-by: Tejas Upadhyay <[email protected]> Reviewed-by: Lucas De Marchi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-08-13drm/xe: Name and document Wa_14019789679Matt Roper2-10/+27
Early in the development of Xe we identified an issue with SVG state handling on DG2 and MTL (and later on Xe2 as well). In commit 72ac304769dd ("drm/xe: Emit SVG state on RCS during driver load on DG2 and MTL") and commit fb24b858a20d ("drm/xe/xe2: Update SVG state handling") we implemented our own workaround to prevent SVG state from leaking from context A to context B in cases where context B never issues a specific state setting. The hardware teams have now created official workaround Wa_14019789679 to cover this issue. The workaround description only requires emitting 3DSTATE_MESH_CONTROL, since they believe that's the only SVG instruction that would potentially remain unset by a context B, but still cause notable issues if unwanted values were inherited from context A. However since we already have a more extensive implementation that emits the entire SVG state and prevents _any_ SVG state from unintentionally leaking, we'll stick with our existing implementation just to be safe. Signed-off-by: Matt Roper <[email protected]> Reviewed-by: Jonathan Cavitt <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-13drm/xe: add kdev_to_xe_device() helper and use itJani Nikula2-7/+7
There are enough users for kernel device to xe device conversion, add a helper for it. Reviewed-by: Gustavo Sousa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/38c80846e70c7e410850530426384e17cff9d031.1723458544.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-13drm/xe: use pdev_to_xe_device() instead of pci_get_drvdata() directlyJani Nikula1-1/+1
We have a helper for converting pci device to xe device, use it. Reviewed-by: Gustavo Sousa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/1b87c2e56200e001ce3a5d2f4a93eb26b294df32.1723458544.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-13drm/xe/tests: remove unused leftover xe_call_for_each_device()Jani Nikula2-53/+0
xe_call_for_each_device() has been unused since commit 57ecead343e7 ("drm/xe/tests: Convert xe_mocs live tests"). Remove it and the related dev_to_xe_device_fn() and struct kunit_test_data. Cc: Lucas De Marchi <[email protected]> Reviewed-by: Gustavo Sousa <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/fa3bb23d005313c9797f557e1211fde09fcb59cc.1723458544.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-13drm/xe/migrate: Parameterize ccs and bo data clear in xe_migrate_clear()Nirmoy Das5-16/+45
Parameterize clearing ccs and bo data in xe_migrate_clear() which higher layers can utilize. This patch will be used later on when doing bo data clear for igfx as well. v2: Replace multiple params with flags in xe_migrate_clear (Matt B) v3: s/CLEAR_BO_DATA_FLAG_*/XE_MIGRATE_CLEAR_FLAG_* and move to xe_migrate.h. other nits(Matt B) Cc: Himal Prasad Ghimiray <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Thomas Hellström <[email protected]> Signed-off-by: Akshata Jahagirdar <[email protected]> Reviewed-by: Matthew Auld <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-13drm/i915: use pdev_to_i915() instead of pci_get_drvdata() directlyJani Nikula1-3/+3
We have a helper for converting pci device to i915 device, use it. v2: Also convert i915_pci_probe() (Gustavo) Reviewed-by: Gustavo Sousa <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Jani Nikula <[email protected]>
2024-08-12drm/xe/xe2hpg: Add Wa_14021821874Tejas Upadhyay2-0/+5
Wa_14021821874 applies to xe2_hpg V2(Himal): - Use space after define Cc: Matt Roper <[email protected]> Reviewed-by: Himal Prasad Ghimiray <[email protected]> Signed-off-by: Tejas Upadhyay <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-08-12drm/xe: Add stats for tlb invalidation countNirmoy Das3-0/+4
Add stats for tlb invalidation count which can be viewed with per GT stat debugfs file. Example output: cat /sys/kernel/debug/dri/0/gt0/stats tlb_inval_count: 22 v2: fix #include order(Tejas) Cc: Lucas De Marchi <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Michal Wajdeczko <[email protected]> Cc: Sai Gowtham Ch <[email protected]> Reviewed-by: Tejas Upadhyay <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-12drm/xe/gt: Add APIs for printing stats over debugfsNirmoy Das5-0/+88
Add skeleton APIs for recording and printing various stats over debugfs. This currently only added counter types stats which is backed by atomic_t and wrapped with CONFIG_DRM_XE_STATS so this can be disabled on production system. v4: Rebase and other minor fixes (Matt) v3: s/CONFIG_DRM_XE_STATS/CONFIG_DEBUG_FS(Lucas) v2: add missing docs Add boundary checks for stats id and other improvements (Michal) Fix build when CONFIG_DRM_XE_STATS is disabled(Matt) Cc: Lucas De Marchi <[email protected]> Cc: Matthew Brost <[email protected]> Cc: Michal Wajdeczko <[email protected]> Cc: Sai Gowtham Ch <[email protected]> Reviewed-by: Matthew Brost <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Nirmoy Das <[email protected]>
2024-08-12drm/i915/bios: convert to struct intel_displayJani Nikula18-451/+495
Going forward, struct intel_display shall replace struct drm_i915_private as the main display device data pointer type. Convert intel_bios.[ch] to struct intel_display. Do one drive-by conversion of unnecessary hex usage to decimal. Reviewed-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/0d0261a53aff5f141b16b482222a5ffce78e176e.1723213547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-12drm/i915/opregion: convert to struct intel_displayJani Nikula9-208/+240
Going forward, struct intel_display shall replace struct drm_i915_private as the main display device data pointer type. Convert intel_opregion.[ch] to struct intel_display. v2: - Fix declarations for !CONFIG_ACPI (Imre, kernel test robot) - Pass encoder/connector directly to intel_display() (Imre) Reviewed-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/aef94503909bbbf95f0244dc382a4d4cd050b903.1723213547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-12drm/i915/opregion: unify intel_encoder/intel_connector namingJani Nikula2-14/+13
Prefer the short encoder/connector names for struct intel_encoder/intel_connector variables and parameters. Reviewed-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/60c67da6b7282ab521366524109ade0470408cf8.1723213547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>
2024-08-12drm/i915/acpi: convert to struct intel_displayJani Nikula4-25/+28
Going forward, struct intel_display shall replace struct drm_i915_private as the main display device data pointer type. Convert intel_acpi.[ch] to struct intel_display. Reviewed-by: Imre Deak <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/465436a3442807b49609fc55c9f652a29f96fd02.1723213547.git.jani.nikula@intel.com Signed-off-by: Jani Nikula <[email protected]>