Age | Commit message (Collapse) | Author | Files | Lines |
|
Expose hwmon energy attribute to show device level energy usage
v2:
- %s/hwm_/hwmon_/
- Convert enums to upper case
v3:
- %s/hwmon_/xe_hwmon
- Remove gt specific hwmon attributes
v4:
- %s/REG_PKG_ENERGY_STATUS/REG_ENERGY_STATUS_ALL (Riana)
- %s/hwmon_energy_info/xe_hwmon_energy_info (Riana)
Acked-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Use Xe HWMON subsystem to display the input voltage.
v2:
- Rename hwm_get_vltg to hwm_get_voltage (Riana)
- Use scale factor SF_VOLTAGE (Riana)
v3:
- %s/gt_perf_status/REG_GT_PERF_STATUS/
- Remove platform check from hwmon_get_voltage()
v4:
- Fix review comments (Andi)
Acked-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Expose the card reactive critical (I1) power. I1 is exposed as
power1_crit in microwatts (typically for client products) or as
curr1_crit in milliamperes (typically for server).
v2: Move PCODE_MBOX macro to pcode file (Riana)
v3: s/IS_DG2/(gt_to_xe(gt)->info.platform == XE_DG2)
v4: Fix review comments (Andi)
Acked-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Expose Card reactive sustained (pl1) power limit as power_max and
card default power limit (tdp) as power_rated_max.
v2:
- Fix review comments (Riana)
v3:
- Use drmm_mutex_init (Matt Brost)
- Print error value (Matt Brost)
- Convert enums to uppercase (Matt Brost)
- Avoid extra reg read in hwmon_is_visible function (Riana)
- Use xe_device_assert_mem_access when applicable (Matt Brost)
- Add [email protected] in Documentation (Matt Brost)
v4:
- Use prefix xe_hwmon prefix for all functions (Matt Brost/Andi)
- %s/hwmon_reg/xe_hwmon_reg (Andi)
- Fix review comments (Guenter/Andi)
v5:
- Fix review comments (Riana)
v6:
- Use drm_warn in default case (Rodrigo)
- s/ENODEV/EOPNOTSUPP (Andi)
Acked-by: Rodrigo Vivi <[email protected]>
Reviewed-by: Riana Tauro <[email protected]>
Signed-off-by: Badal Nilawar <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Passing in a NULL exec queue to __xe_pt_unbind_vma results in the
migrate exec queue being used. This is not the intent from the VM bind
IOCTL, rather a NULL exec queue should use default VM exec queue.
Reviewed-by: Niranjana Vishwanathapura <[email protected]>
Signed-off-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
On platforms that support read L3 caching, set the default mocs index in
CCS RING_CMD_CTL to leverage the read caching in L3.
Currently PVC and Xe2 platforms have the support.
Bspec: 72161
Signed-off-by: Balasubramani Vivekanandan <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Xe2 changes or adds bits for mocs in a few BLT instructions:
XY_CTRL_SURF_COPY_BLT, XY_FAST_COLOR_BLT, XY_FAST_COPY_BLT, and MEM_SET.
Modify the code to deal with the new location. Unlike Xe1, the MOCS
field in those instructions is only the MOCS index and not the
Structure_MEMORY_OBJECT_CONTROL_STATE anymore. The pxp bit is now
explicitly documented separately.
Bspec: 57567,57566,57565,57562
Cc: Matt Roper <[email protected]>
Signed-off-by: Haridhar Kalvala <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Set bits 30 and 31 of XY_FAST_COPY_BLT's dword1 for XeHP and above.
Destination or source being Y-Major is selected on dword0 and there's
nothing to set on dword1. According to the bspec for Xe2,
"Behavior is undefined when programmed the value 0". Also for XeHP,
the only value allowed in those bits is 0b11, not being possible to
select "Legacy Tile-Y" anymore, only the newer Tile4.
So, unconditionally set those bits for graphics IP 12.50 and above.
v2: Reword commit message and extend it to graphics version >= 12.50
(Matt Roper)
Bspec: 57567
Cc: Matt Roper <[email protected]>
Signed-off-by: Haridhar Kalvala <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
PVC_MS_* doesn't reflect the real name of the instruction. Rename
it to follow the name used in the bspec.
Cc: Matt Roper <[email protected]>
Signed-off-by: Haridhar Kalvala <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Instead of using xe_mocs_index_to_value(), simply define the bitmask
with the shift left applied. This will make it easier to adapt to new
platforms that simply use the index.
This also fixes PVC bug in emit_clear_link_copy() where the MOCS was
getting shifted both by PVC_MS_MOCS_INDEX_MASK definition and by the
xe_moc_index_to_value function.
Bspec: 44509
Cc: Matt Roper <[email protected]>
Signed-off-by: Haridhar Kalvala <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
For xe2, besides the previous sizes, the reserved portion of stolen can
also have 16MB and 32MB.
Bspec: 53148
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The LRC tuning settings we have today are modifying registers that are
part of the RCS engine's context; they're not part of the general CSFE
context that would apply to all engines. Add ENGINE_CLASS(RENDER) to
the RTP rules to properly restrict these to the RCS.
Bspec: 46255, 46261
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In xe_wait_user_fence_ioctl, the timeout is currently defined as
unsigned long. That could potentially pass a negative value to
the schedule_timeout() call because nsecs_to_jiffies() returns an
unsigned long which gets used as signed long.
[ 187.732238] schedule_timeout: wrong timeout value fffffffffffffc18
[ 187.733180] CPU: 0 PID: 792 Comm: test_thread_dim Tainted: G U 6.4.0-xe #1
[ 187.734251] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[ 187.735019] Call Trace:
[ 187.735373] <TASK>
[ 187.735687] dump_stack_lvl+0x92/0xb0
[ 187.736193] schedule_timeout+0x348/0x430
[ 187.736739] ? __might_fault+0x67/0xd0
[ 187.737255] ? check_chain_key+0x224/0x2d0
[ 187.737812] ? __pfx_schedule_timeout+0x10/0x10
[ 187.738429] ? __might_fault+0x6b/0xd0
[ 187.738946] ? __pfx_lock_release+0x10/0x10
[ 187.739512] ? __pfx_lock_release+0x10/0x10
[ 187.740080] wait_woken+0x86/0x100
[ 187.740556] xe_wait_user_fence_ioctl+0x34b/0xe00 [xe]
[ 187.741281] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe]
[ 187.742075] ? lock_acquire+0x169/0x3d0
[ 187.742601] ? check_chain_key+0x224/0x2d0
[ 187.743158] ? drm_dev_enter+0x9/0xe0 [drm]
[ 187.743740] ? __pfx_woken_wake_function+0x10/0x10
[ 187.744388] ? drm_dev_exit+0x11/0x50 [drm]
[ 187.744969] ? __pfx_lock_release+0x10/0x10
[ 187.745536] ? __might_fault+0x67/0xd0
[ 187.746052] ? check_chain_key+0x224/0x2d0
[ 187.746610] drm_ioctl_kernel+0x172/0x250 [drm]
[ 187.747242] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe]
[ 187.748037] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ 187.748729] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe]
[ 187.749524] ? __pfx_xe_wait_user_fence_ioctl+0x10/0x10 [xe]
[ 187.750319] drm_ioctl+0x35e/0x620 [drm]
[ 187.750871] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ 187.751495] ? restore_fpregs_from_fpstate+0x99/0x140
[ 187.752172] ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
[ 187.752901] ? mark_held_locks+0x24/0x90
[ 187.753438] __x64_sys_ioctl+0xb4/0xf0
[ 187.753954] do_syscall_64+0x3f/0x90
[ 187.754450] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[ 187.755127] RIP: 0033:0x7f4e6651aaff
[ 187.755623] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
[ 187.757995] RSP: 002b:00007fff05f37a50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 187.758995] RAX: ffffffffffffffda RBX: 000055eca47c8130 RCX: 00007f4e6651aaff
[ 187.759935] RDX: 00007fff05f37b60 RSI: 00000000c050644b RDI: 0000000000000004
[ 187.760874] RBP: 0000000000000017 R08: 0000000000000017 R09: 7fffffffffffffff
[ 187.761814] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 187.762753] R13: 0000000000000000 R14: 0000000000000000 R15: 00007f4e65d19ce0
[ 187.763694] </TASK>
Fixes: 5572a0046857 ("drm/xe: Use nanoseconds instead of jiffies in uapi for user fence")
Signed-off-by: Fei Yang <[email protected]>
Cc: Andi Shyti <[email protected]>
Cc: Zbigniew Kempczyński <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Atomic access is supported by PVC, and became a common feature for all
platforms starting from Xe2. To enable that XE_VMA_ATOMIC_PTE_BIT needs
to be set, then pte encode will eventually set PTE_AE for devmem.
Signed-off-by: Fei Yang <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Ensure that the mutex is destroyed at fini function.
Cc: Maarten Lankhorst <[email protected]>
Signed-off-by: Bommithi Sakeena <[email protected]>
Reviewed-by: Niranjana Vishwanathapura <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Add missing mutex_destroy calls to fini functions or convert to
drmm_mutex_init where fini function is not available.
Cc: Matthew Brost <[email protected]>
Signed-off-by: Bommithi Sakeena <[email protected]>
Reviewed-by: Niranjana Vishwanathapura <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
When working without GuC (i.e. working with execlists), the flow
attempts to perform suspend operation which is failing due to a
lack of support without GuC.
If PM ops are not supported without GuC we may as well avoid PM
registration rather than returning errors from various PM flows.
Signed-off-by: Ohad Sharabi <[email protected]>
Reviewed-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Use 2 different functions for encoding the ggtt's pte, assigning them
during initialization. Main difference is that before Xe-LPG, the pte
didn't have the cache bits.
v2: Re-use xelp_ggtt_pte_encode_bo() for the common part with
xelpg_ggtt_pte_encode_bo() (Matt Roper)
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Change the xelp_pte_encode() and xelp_pde_encode() functions to use the
platform-dependent pat_index. The same function can be used for all
platforms as they only need to encode the pat_index bits in the same
pte/pde layout. For platforms that don't have the most significant bit,
as long as they don't return a bogus index they should be fine.
v2: Use the same logic to encode pde as it's compatible with previous
logic, it's more future proof and also fixes the cache setting for
PVC (Matt Roper)
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Some of the PAT entries are relevant for internal driver use, which
varies per platform. Let the PAT early initialization set what they
should point to so the rest of the driver can use them where needed.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Both DG2 and PVC are derived from XeHP, but DG2 should not really
re-use something introduced by PVC, so it's odd to have DG2 re-using the
PVC programming for PAT. Let's prefer using the architecture and/or IP
names.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
DG2 should use the MCR variant to program the PAT registers, like PVC,
but shouldn't use the same table as PVC.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Split the PAT initialization between SW-only and HW. The _early() only
sets up the ops and data structure that are used later to program the
tables. This allows the PAT to be easily extended to other platforms.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Instead of encoding the pte, call a new vfunc from xe_vm to handle that.
The encoding may not be the same on every platform, so keeping it in one
place helps to better support them.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Move the function to encode pte/pde to be vfuncs inside struct xe_vm.
This will allow to easily extend to platforms that don't have a
compatible encoding.
v2: Fix kunit build
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
vma at this point can never be NULL as otherwise it would crash earlier
in the only caller, xe_pt_stage_bind_entry(). Remove the extra check and
avoid adding and removing the bits from the pte.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Split functions that do only part of the pde/pte encoding and that can
be called by the different places. This normalizes how pde/pte are
encoded so they can be moved elsewhere in a subsequent change.
xe_pte_encode() was calling __pte_encode() with a NULL vma, which is the
opposite of what xe_pt_stage_bind_entry() does. Stop passing a NULL vma
and just split another function that deals with a vma rather than a bo.
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
On platforms with multiple BCS engines (i.e., PVC and Xe2), not all BCS
engines are created equal. The BCS0 engine is what the specs refer to
as a "resource copy engine," which supports the platform's full set of
copy/fill instructions. In contast, the non-BCS0 "service copy" engines
are more streamlined and only support a subset of the GPU instructions
supported by the resource copy engine. Platforms with both types of
copy engines always support the MEM_COPY and MEM_SET instructions which
can be used for simple copy and fill operations on either type of BCS
engine. Since the simple MEM_SET instruction meets the needs of Xe's
migrate code (and since the more elaborate XY_FAST_COLOR_BLT instruction
isn't available to use on service copy engines), we always prefer to use
MEM_SET for clearing buffers on our newer platforms.
We've been using a 'has_link_copy_engine' feature flag to keep track of
which platforms should use MEM_SET for fills. However a feature flag
like this is unnecessary since we can already derive the presence of
service copy engines (and in turn the MEM_SET instruction) just by
looking at the platform's pre-fusing engine list. Utilizing the engine
list for this also avoids mistakes like we've made on Xe2 where we
forget to set the feature flag in the IP definition.
For clarity, "service copy" is a general term that covers any blitter
engines that support a limited subset of the overall blitter instruction
set (in practice this is any non-BCS0 blitter engine). The "link copy
engines" introduced on PVC and the "paging copy engine" present in Xe2
are both instances of service copy engines.
v2:
- Rewrite / expand the commit message. (Bala)
- Fix checkpatch whitespace error.
Bspec: 65019
Cc: Lucas De Marchi <[email protected]>
Cc: Balasubramani Vivekanandan <[email protected]>
Reviewed-by: Haridhar Kalvala <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Starting with Xe_LP+, GFX_MSTR_IRQ contains status bits that have W1C
behavior. If we do not properly reset them, we would miss delivery of
interrupts if a pending bit is set when enabling IRQs.
As an example, the display part of our probe routine contains paths
where we wait for vblank interrupts. If a display interrupt was already
pending when enabling IRQs, we would time out waiting for the vblank.
That in fact happened recently when modprobing Xe on a Lunar Lake with a
specific configuration; and that's how we found out we were missing this
step in the IRQ enabling logic.
Fix the issue by clearing GFX_MSTR_IRQ as part of the IRQ reset.
v2:
- Make resetting GFX_MSTR_IRQ be the last step to avoid bit
re-latching. (Ville)
v3:
- Swap nesting order: guard loop with the IP version check instead of
doing the check at each iteration. (Lucas)
v4:
- Add braces for the "if" statement guarding the loop to make the
compiler happy. (Gustavo)
BSpec: 50875, 54028, 62357
Cc: Matt Roper <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Gustavo Sousa <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Drop UGM per set fragment threshold to 3
BSpec: 54833
Signed-off-by: Shekhar Chauhan <[email protected]>
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Ensure alignment with PAGE_SIZE for the size parameter
passed to __xe_bo_create_locked()
v2: move size alignment under else condition (Lucas)
Signed-off-by: Pallavi Mishra <[email protected]>
Reviewed-by: Lucas De Marchi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Depending on the context, it's preferred to have a const pointer to make
sure nothing is modified underneath. The assert macros only ever read
data from xe/tile/gt for printing, so they can be made const by default.
Reviewed-by: Michal Wajdeczko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Lucas De Marchi <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In future devices we will need to support msix interrupts.
Reviewed-by: Ohad Sharabi <[email protected]>
Signed-off-by: Dani Liberman <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
As a preparation for msix support, changing for new msi irq api
which supports both msi and msix.
Reviewed-by: Ohad Sharabi <[email protected]>
Signed-off-by: Dani Liberman <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
[Rebase fixes by Rodrigo]
|
|
IRQ enabled flag should be set only after request irq succeeds.
Reviewed-by: Ohad Sharabi <[email protected]>
Signed-off-by: Dani Liberman <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Use the newly added drm_print_memory_stats helper to show memory
utilisation of our objects in drm/driver specific fdinfo output.
To collect the stats we walk the per memory regions object lists
and accumulate object size into the respective drm_memory_stats
categories.
Objects with multiple possible placements are reported in multiple
regions for total and shared sizes, while other categories are
counted only for the currently active region.
V4:
- Remove rcu lock - Auld/Thomas
- take refcnt only if its non-zero - Auld
- DMA_RESV_USAGE_BOOKKEEP covers all fences - Auld
- covert to xe_bo for public objects
V3:
- dont use xe_bo_get/put, not needed
- use designated initializer - Jani
- use list_for_each_entry_rcu
- Fix Checkpatch err - CI
V2:
- Use static initializer for mem_type - Himal/Jani
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Account ring buffers and logical context space against the owning client
memory usage stats.
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Account page table memory usage in the owning client
memory usage stats.
V2:
- Minor tweak to if (vm->pt_root[id]) check - Himal
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Enable accounting of indirect client memory usage.
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
In order to show per client memory consumption, we
need tracking support APIs to add at every bo consumption
and removal. Adding APIs here to add tracking calls at
places wherever it is applicable.
V5:
- Rebase
V4:
- remove client bo before vm_put
- spin_lock_irqsave not required - Auld
V3:
- update .h to return xe_drm_client_remove_bo void
- protect xe_drm_client_remove_bo under CONFIG_PROC_FS check - Himal
- Fixed Checkpatch error - CI
V2:
- make xe_drm_client_remove_bo return void - Himal
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
DRM core driver has introduced recently fdinfo interface to
show memory stats of individual drm client. Lets interface
xe drm client to fdinfo interface.
V2:
- cover call to xe_drm_client_fdinfo under PROC_FS
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Add drm-client infrastructure to record stats of consumption
done by individual drm client.
V2:
- Typo - CI
Reviewed-by: Himal Prasad Ghimiray <[email protected]>
Signed-off-by: Tejas Upadhyay <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The CAT_ERROR message from the GuC provides the guc id of the context
that caused the problem, which can be a child context. We therefore
need to be able to match that id to the exec_queue that owns it, which
we do by adding child context to the context lookup.
While at it, fix the error path of the guc id allocation code to
correctly free the ids allocated for parallel queues.
v2: rebase on s/XE_WARN_ON/xe_assert
Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/590
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: Matthew Brost <[email protected]>
Cc: John Harrison <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
Although the vast majority of workarounds the driver needs to implement
are either GT-based or display-based, there are occasionally workarounds
that reside outside those parts of the hardware (i.e., in they target
registers in the sgunit/soc); we can consider these to be "tile"
workarounds since there will be instance of these registers per tile.
The registers in question should only lose their values during a
function-level reset, so they only need to be applied during probe and
resume; the registers will not be affected by GT/engine resets.
Tile workarounds are rare (there's only one, 22010954014, that's
relevant to Xe at the moment) so it's probably not worth updating the
xe_rtp design to handle tile-level workarounds yet, although we may want
to consider that in the future if/when more of these show up on future
platforms.
Reviewed-by: Lucas De Marchi <[email protected]>
Acked-by: Jani Nikula <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Matt Roper <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
For now only support pinning in TT memory, for two reasons:
1) Avoid pinning in a placement not accessible to some importers.
2) Pinning in VRAM requires PIN accounting which is a to-do.
v2:
- Adjust the dma-buf kunit test accordingly.
Suggested-by: Oded Gabbay <[email protected]>
Signed-off-by: Thomas Hellström <[email protected]>
Reviewed-by: Oded Gabbay <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
At the end of the function, we will always return err no matter it's
value. Simplify this by just returning the result of
drmm_add_action_or_reset().
Reviewed-by: Matt Roper <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Gustavo Sousa <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
With the GPUVA conversion, the xe_bo::vmas member became replaced with
drm_gem_object::gpuva.list, however there was a couple of usage instances
left using the old member. Most notably the pipelined fence
enable_signaling.
Remove the xe_bo::vmas member completely, fix usage instances and
also enable this pipelined fence enable_signaling even for faulting
VM:s since we actually wait for bind fences to complete.
v2:
- Rebase.
v3:
- Fix display code build error.
Cc: Matthew Brost <[email protected]>
Signed-off-by: Thomas Hellström <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
When testing a new binary and/or debugging binary-related issues, it is
useful to have the option to change which binary is loaded without
having to update and re-compile the kernel. To support this option, this
patch adds 2 new modparams to override the FW path for GuC and HuC. The
HuC modparam can also be set to an empty string to disable HuC loading.
Note that those modparams only take effect on platforms where we already
have a default FW, so we're sure there is support for FW loading and the
kernel isn't going to explode in an undefined path.
v2: simplify comment (John),
rebase on s/guc_submission_enabled/uc_enabled
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: John Harrison <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: John Harrison <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The current uC status tracking has a few issues:
1) the HuC is moved to "disabled" instead of "not supported"
2) the status is left uninitialized instead of "disabled" when the
modparam is used to disable support
3) due to #1, a number of checks are done against "disabled" instead of
the appropriate status.
Address all of those by making sure to follow the appropriate state
transition and checking against the required state.
v2: rebase on s/guc_submission_enabled/uc_enabled/
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: John Harrison <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: John Harrison <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|
|
The guc_submission_enabled() function is being used as a boolean toggle
for all firmwares and all related features, not just GuC submission. We
could add additional flags/functions to distinguish and allow different
use-cases (e.g. loading HuC but not using GuC submission), but given
that not using GuC is a debug-only scenario having a global switch for
all FWs is enough. However, we want to make it clear that this switch
turns off everything, so rename it to uc_enabled().
v2: rebase on s/XE_WARN_ON/xe_assert
Signed-off-by: Daniele Ceraolo Spurio <[email protected]>
Cc: John Harrison <[email protected]>
Cc: Matthew Brost <[email protected]>
Reviewed-by: Matthew Brost <[email protected]>
Signed-off-by: Rodrigo Vivi <[email protected]>
|