aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-03-26drm/i915: Remove extra multi-gt pm-referencesJanusz Krzysztofik1-18/+0
There was an attempt to fix an issue of illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle, reported by CI on 2-GT Meteor Lake. As a solution, an extra wakeref for a Primary GT was acquired from i915_gem_do_execbuffer() -- see commit f56fe3e91787 ("drm/i915: Fix a VMA UAF for multi-gt platform"). However, that fix occurred insufficient -- the issue was still reported by CI. That wakeref was released on exit from i915_gem_do_execbuffer(), then potentially before completion of the request and deactivation of its associated VMAs. Moreover, CI reports indicated that single-GT platforms also suffered sporadically from the same race. Since the issue has now been fixed by a preceding patch "drm/i915/vma: Fix UAF on destroy against retire race", drop the no longer useful changes introduced by that insufficient fix. v3: Also drop the no longer used .wakeref_gt0 field from struct i915_execbuffer. v2: Avoid the word "revert" in commit message (Rodrigo), - update commit description reusing relevant chunks dropped from the description of the proper fix (Rodrigo). Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Nirmoy Das <[email protected]> Cc: Rodrigo Vivi <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-26drm/i915/vma: Fix UAF on destroy against retire raceJanusz Krzysztofik1-7/+43
Object debugging tools were sporadically reporting illegal attempts to free a still active i915 VMA object when parking a GT believed to be idle. [161.359441] ODEBUG: free active (active state 0) object: ffff88811643b958 object type: i915_active hint: __i915_vma_active+0x0/0x50 [i915] [161.360082] WARNING: CPU: 5 PID: 276 at lib/debugobjects.c:514 debug_print_object+0x80/0xb0 ... [161.360304] CPU: 5 PID: 276 Comm: kworker/5:2 Not tainted 6.5.0-rc1-CI_DRM_13375-g003f860e5577+ #1 [161.360314] Hardware name: Intel Corporation Rocket Lake Client Platform/RocketLake S UDIMM 6L RVP, BIOS RKLSFWI1.R00.3173.A03.2204210138 04/21/2022 [161.360322] Workqueue: i915-unordered __intel_wakeref_put_work [i915] [161.360592] RIP: 0010:debug_print_object+0x80/0xb0 ... [161.361347] debug_object_free+0xeb/0x110 [161.361362] i915_active_fini+0x14/0x130 [i915] [161.361866] release_references+0xfe/0x1f0 [i915] [161.362543] i915_vma_parked+0x1db/0x380 [i915] [161.363129] __gt_park+0x121/0x230 [i915] [161.363515] ____intel_wakeref_put_last+0x1f/0x70 [i915] That has been tracked down to be happening when another thread is deactivating the VMA inside __active_retire() helper, after the VMA's active counter has been already decremented to 0, but before deactivation of the VMA's object is reported to the object debugging tool. We could prevent from that race by serializing i915_active_fini() with __active_retire() via ref->tree_lock, but that wouldn't stop the VMA from being used, e.g. from __i915_vma_retire() called at the end of __active_retire(), after that VMA has been already freed by a concurrent i915_vma_destroy() on return from the i915_active_fini(). Then, we should rather fix the issue at the VMA level, not in i915_active. Since __i915_vma_parked() is called from __gt_park() on last put of the GT's wakeref, the issue could be addressed by holding the GT wakeref long enough for __active_retire() to complete before that wakeref is released and the GT parked. I believe the issue was introduced by commit d93939730347 ("drm/i915: Remove the vma refcount") which moved a call to i915_active_fini() from a dropped i915_vma_release(), called on last put of the removed VMA kref, to i915_vma_parked() processing path called on last put of a GT wakeref. However, its visibility to the object debugging tool was suppressed by a bug in i915_active that was fixed two weeks later with commit e92eb246feb9 ("drm/i915/active: Fix missing debug object activation"). A VMA associated with a request doesn't acquire a GT wakeref by itself. Instead, it depends on a wakeref held directly by the request's active intel_context for a GT associated with its VM, and indirectly on that intel_context's engine wakeref if the engine belongs to the same GT as the VMA's VM. Those wakerefs are released asynchronously to VMA deactivation. Fix the issue by getting a wakeref for the VMA's GT when activating it, and putting that wakeref only after the VMA is deactivated. However, exclude global GTT from that processing path, otherwise the GPU never goes idle. Since __i915_vma_retire() may be called from atomic contexts, use async variant of wakeref put. Also, to avoid circular locking dependency, take care of acquiring the wakeref before VM mutex when both are needed. v7: Add inline comments with justifications for: - using untracked variants of intel_gt_pm_get/put() (Nirmoy), - using async variant of _put(), - not getting the wakeref in case of a global GTT, - always getting the first wakeref outside vm->mutex. v6: Since __i915_vma_active/retire() callbacks are not serialized, storing a wakeref tracking handle inside struct i915_vma is not safe, and there is no other good place for that. Use untracked variants of intel_gt_pm_get/put_async(). v5: Replace "tile" with "GT" across commit description (Rodrigo), - avoid mentioning multi-GT case in commit description (Rodrigo), - explain why we need to take a temporary wakeref unconditionally inside i915_vma_pin_ww() (Rodrigo). v4: Refresh on top of commit 5e4e06e4087e ("drm/i915: Track gt pm wakerefs") (Andi), - for more easy backporting, split out removal of former insufficient workarounds and move them to separate patches (Nirmoy). - clean up commit message and description a bit. v3: Identify root cause more precisely, and a commit to blame, - identify and drop former workarounds, - update commit message and description. v2: Get the wakeref before VM mutex to avoid circular locking dependency, - drop questionable Fixes: tag. Fixes: d93939730347 ("drm/i915: Remove the vma refcount") Closes: https://gitlab.freedesktop.org/drm/intel/issues/8875 Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Thomas Hellström <[email protected]> Cc: Nirmoy Das <[email protected]> Cc: Andi Shyti <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: [email protected] # v5.19+ Reviewed-by: Nirmoy Das <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-20drm/i915/xelpg: Add Wa_14020495402Radhakrishna Sripada2-1/+6
Disable clockgating for TDL SVHS fub. v2: Implement in general render/compute wa's(MattR) Bspec: 46045 Cc: Matt Roper <[email protected]> Signed-off-by: Radhakrishna Sripada <[email protected]> Reviewed-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-13drm/i915/selftests: Pick correct caching mode.Nirmoy Das1-1/+4
Caching mode is HW dependent so pick a correct one using intel_gt_coherent_map_type(). Cc: Andi Shyti <[email protected]> Cc: Janusz Krzysztofik <[email protected]> Cc: Jonathan Cavitt <[email protected]> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/10249 Signed-off-by: Nirmoy Das <[email protected]> Acked-by: Jonathan Cavitt <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-11drm/i915: Drop WA 16015675438Lucas De Marchi2-6/+2
With dynamic load-balancing disabled on the compute side, there's no reason left to enable WA 16015675438. Drop it from both PVC and DG2. Note that this can be done because now the driver always set a fixed partition of EUs during initialization via the ccs_mode configuration. The flag to GuC is still needed because of 18020744125, so update the comment accordingly. Cc: Rodrigo Vivi <[email protected]> Acked-by: Mateusz Jablonski <[email protected]> Acked-by: Michal Mrozek <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Lucas De Marchi <[email protected]>
2024-03-07drm/i915/guc: Enable Wa_14019159160John Harrison6-9/+38
Use the new w/a KLV support to enable a MTL w/a. Note, this w/a is a super-set of Wa_16019325821, so requires turning that one as well as setting the new flag for Wa_14019159160 itself. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Vinay Belgaumkar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-07drm/i915/guc: Add support for w/a KLVsJohn Harrison5-2/+85
To prevent running out of bits, new w/a enable flags are being added via a KLV system instead of a 32 bit flags word. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Vinay Belgaumkar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-07drm/i915: Enable Wa_16019325821John Harrison5-13/+27
Some platforms require holding RCS context switches until CCS is idle (the reverse w/a of Wa_14014475959). Some platforms require both versions. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Vinay Belgaumkar <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-07drm/i915/guc: Use context hints for GT frequencyVinay Belgaumkar10-2/+86
Allow user to provide a low latency context hint. When set, KMD sends a hint to GuC which results in special handling for this context. SLPC will ramp the GT frequency aggressively every time it switches to this context. The down freq threshold will also be lower so GuC will ramp down the GT freq for this context more slowly. We also disable waitboost for this context as that will interfere with the strategy. We need to enable the use of SLPC Compute strategy during init, but it will apply only to contexts that set this bit during context creation. Userland can check whether this feature is supported using a new param- I915_PARAM_HAS_CONTEXT_FREQ_HINT. This flag is true for all guc submission enabled platforms as they use SLPC for frequency management. The Mesa usage model for this flag is here - https://gitlab.freedesktop.org/sushmave/mesa/-/commits/compute_hint v2: Rename flags as per review suggestions (Rodrigo, Tvrtko). Also, use flag bits in intel_context as it allows finer control for toggling per engine if needed (Tvrtko). v3: Minor review comments (Tvrtko) v4: Update comment (Sushma) Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Sushma Venkatesh Reddy <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Acked-by: Ivan Briano <[email protected]> Signed-off-by: Vinay Belgaumkar <[email protected]> Signed-off-by: John Harrison <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-07drm/i915/mtl: Update workaround 14018575942Tejas Upadhyay1-0/+1
Applying WA 14018575942 only on Compute engine has impact on some apps like chrome. Updating this WA to apply on Render engine as well as it is helping with performance on Chrome. Note: There is no concern from media team thus not applying WA on media engines. We will revisit if any issues reported from media team. V2(Matt): - Use correct WA number Fixes: 668f37e1ee11 ("drm/i915/mtl: Update workaround 14018778641") Signed-off-by: Tejas Upadhyay <[email protected]> Reviewed-by: Matt Roper <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-05drm/i915/guc: Simplify/extend platform check for Wa_14018913170John Harrison1-3/+5
The above w/a is required for every platform that the i915 driver supports. It is fixed on the latest platforms but they are only supported by Xe instead of i915. So just remove the platform check completely and keep the code simple. v2: Add extra comment (review feedback from Rodrigo). Signed-off-by: John Harrison <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-05drm/i915/selftests: Fix dependency of some timeouts on HZJanusz Krzysztofik1-2/+4
Third argument of i915_request_wait() accepts a timeout value in jiffies. Most users pass either a simple HZ based expression, or a result of msecs_to_jiffies(), or MAX_SCHEDULE_TIMEOUT, or a very small number not exceeding 4 if applicable as that value. However, there is one user -- intel_selftest_wait_for_rq() -- that passes a WAIT_FOR_RESET_TIME symbol, defined as a large constant value that most probably represents a desired timeout in ms. While that usage results in the intended value of timeout on usual x86_64 kernel configurations, it is not portable across different architectures and custom kernel configs. Rename the symbol to clearly indicate intended units and convert it to jiffies before use. Fixes: 3a4bfa091c46 ("drm/i915/selftest: Fix workarounds selftest for GuC submission") Signed-off-by: Janusz Krzysztofik <[email protected]> Cc: Rahul Kumar Singh <[email protected]> Cc: John Harrison <[email protected]> Cc: Matthew Brost <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-05drm/i915/selftest_hangcheck: Check sanity with more patienceJanusz Krzysztofik1-1/+1
While trying to reproduce some other issues reported by CI for i915 hangcheck live selftest, I found them hidden behind timeout failures reported by igt_hang_sanitycheck -- the very first hangcheck test case executed. Feb 22 19:49:06 DUT1394ACMR kernel: calling mei_gsc_driver_init+0x0/0xff0 [mei_gsc] @ 121074 Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] Cannot find any crtc or sizes Feb 22 19:49:06 DUT1394ACMR kernel: probe of i915.mei-gsc.768 returned 0 after 1475 usecs Feb 22 19:49:06 DUT1394ACMR kernel: probe of i915.mei-gscfi.768 returned 0 after 1441 usecs Feb 22 19:49:06 DUT1394ACMR kernel: initcall mei_gsc_driver_init+0x0/0xff0 [mei_gsc] returned 0 after 3010 usecs Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG_GEM enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915 0000:03:00.0: [drm] DRM_I915_DEBUG_RUNTIME_PM enabled Feb 22 19:49:06 DUT1394ACMR kernel: i915: Performing live selftests with st_random_seed=0x4c26c048 st_timeout=500 Feb 22 19:49:07 DUT1394ACMR kernel: i915: Running hangcheck Feb 22 19:49:07 DUT1394ACMR kernel: calling mei_hdcp_driver_init+0x0/0xff0 [mei_hdcp] @ 121074 Feb 22 19:49:07 DUT1394ACMR kernel: i915: Running intel_hangcheck_live_selftests/igt_hang_sanitycheck Feb 22 19:49:07 DUT1394ACMR kernel: probe of 0000:00:16.0-b638ab7e-94e2-4ea2-a552-d1c54b627f04 returned 0 after 1398 usecs Feb 22 19:49:07 DUT1394ACMR kernel: probe of i915.mei-gsc.768-b638ab7e-94e2-4ea2-a552-d1c54b627f04 returned 0 after 97 usecs Feb 22 19:49:07 DUT1394ACMR kernel: initcall mei_hdcp_driver_init+0x0/0xff0 [mei_hdcp] returned 0 after 101960 usecs Feb 22 19:49:07 DUT1394ACMR kernel: calling mei_pxp_driver_init+0x0/0xff0 [mei_pxp] @ 121094 Feb 22 19:49:07 DUT1394ACMR kernel: probe of 0000:00:16.0-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1 returned 0 after 435 usecs Feb 22 19:49:07 DUT1394ACMR kernel: mei_pxp i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1: bound 0000:03:00.0 (ops i915_pxp_tee_component_ops [i915]) Feb 22 19:49:07 DUT1394ACMR kernel: 100ms wait for request failed on rcs0, err=-62 Feb 22 19:49:07 DUT1394ACMR kernel: probe of i915.mei-gsc.768-fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1 returned 0 after 158425 usecs Feb 22 19:49:07 DUT1394ACMR kernel: initcall mei_pxp_driver_init+0x0/0xff0 [mei_pxp] returned 0 after 224159 usecs Feb 22 19:49:07 DUT1394ACMR kernel: i915/intel_hangcheck_live_selftests: igt_hang_sanitycheck failed with error -5 Feb 22 19:49:07 DUT1394ACMR kernel: i915: probe of 0000:03:00.0 failed with error -5 Those request waits, once timed out after 100ms, have never been confirmed to still persist over another 100ms, always being able to complete within the originally requested wait time doubled. Taking into account potentially significant additional concurrent workload generated by new auxiliary drivers that didn't exist before and now are loaded in parallel with the i915 module also when loaded in selftest mode, relax our expectations on time consumed by the sanity check request before it completes. Signed-off-by: Janusz Krzysztofik <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-04drm/i915/guc: Correct capture of EIR register on hangJohn Harrison1-5/+1
The EIR register (0x20B0) was being included in the engine class list for render and compute as the absolute register address. However, it is actually a ring register available on all engines at an offset of (base) + 0xB0. As it was included as an RCS engine but with the absolute address, GuC was adding on another 0x2000 and coming out at an invalid location. Thus it would reject the register and complain about only managing a partial capture. So update the list to use the RING_EIR version of the register and include it for all engines. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Alan Previn <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-01drm/i915/guc: Use the new gt_to_guc() wrapperAndi Shyti9-27/+28
Get the guc reference from the gt using the gt_to_guc() helper. Signed-off-by: Andi Shyti <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-03-01drm/i915/gt: Create the gt_to_guc() wrapperAndi Shyti10-23/+25
We already have guc_to_gt() and getting to guc from the GT it requires some mental effort. Add the gt_to_guc(). Given the reference to the "gt", the gt_to_guc() will return the pinter to the "guc". Update all the files under the gt/ directory. Signed-off-by: Andi Shyti <[email protected]> Reviewed-by: Nirmoy Das <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-28drm/i915: Check before removing mm notifierNirmoy Das1-0/+3
Error in mmu_interval_notifier_insert() can leave a NULL notifier.mm pointer. Catch that and return early. Fixes: ed29c2691188 ("drm/i915: Fix userptr so we do not have to worry about obj->mm.lock, v7.") Cc: <[email protected]> # v5.13+ [tursulin: Added Fixes and cc stable.] Cc: Andi Shyti <[email protected]> Cc: Shawn Lee <[email protected]> Signed-off-by: Nirmoy Das <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Tvrtko Ursulin <[email protected]>
2024-02-20drm/i915: Add some boring kerneldocTvrtko Ursulin1-0/+4
Tooling appears very strict so lets pacify it by adding some comments, even if fields are completely self-explanatory. Signed-off-by: Tvrtko Ursulin <[email protected]> Fixes: b11236486749 ("drm/i915: Add GuC submission interface version query") Reported-by: Stephen Rothwell <[email protected]> Cc: Jose Souza <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-14drm/i915/gt: Restart the heartbeat timer when forcing a pulseJohn Harrison1-0/+3
The context persistence code does things like send super high priority heartbeat pulses to ensure any leaked context can still be pre-empted and thus isn't a total denial of service but only a minor denial of service. Unfortunately, it wasn't bothering to restart the heartbeat worker with a fresh timeout. Thus, if a persistent context happened to be closed just before the heartbeat was going to go ping anyway then the forced pulse would get a negligble execution time. And as the forced pulse is super high priority, the worker thread's next step is a reset. Which means a potentially innocent system randomly goes boom when attempting to close a context. So, force a re-schedule of the worker thread with the appropriate timeout. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-14drm/i915: Add GuC submission interface version queryTvrtko Ursulin2-0/+45
Add a new query to the GuC submission interface version. Mesa intends to use this information to check for old firmware versions with a known bug where using the render and compute command streamers simultaneously can cause GPU hangs due issues in firmware scheduling. Based on patches from Vivaik and Joonas. Compile tested only. v2: * Added branch version. Signed-off-by: Tvrtko Ursulin <[email protected]> Cc: Kenneth Graunke <[email protected]> Cc: Jose Souza <[email protected]> Cc: Sagar Ghuge <[email protected]> Cc: Paulo Zanoni <[email protected]> Cc: John Harrison <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: Vivaik Balasubrawmanian <[email protected]> Cc: Joonas Lahtinen <[email protected]> Reviewed-by: José Roberto de Souza <[email protected]> Tested-by: José Roberto de Souza <[email protected]> Signed-off-by: José Roberto de Souza <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-13drm/i915/selftests: Increasing the sleep time for live_rc6_manualAnirban Sk1-2/+2
Sometimes gt_pm live_rc6_manual selftest fails due to no power being measured for the rc6 disabled period. Therefore increasing the rc6 disable period from 250ms to 1000ms to rule out such sporadic failure. v3: - More descriptive and improved commit message (Anshuman) Signed-off-by: Anirban Sk <[email protected]> Reviewed-by: Anshuman Gupta <[email protected]> Signed-off-by: Anshuman Gupta <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-02-12drm/i915: Add flex arrays to struct i915_syncmapErick Archer1-11/+8
The "struct i915_syncmap" uses a dynamically sized set of trailing elements. It can use an "u32" array or a "struct i915_syncmap *" array. So, use the preferred way in the kernel declaring flexible arrays [1]. Because there are two possibilities for the trailing arrays, it is necessary to declare a union and use the DECLARE_FLEX_ARRAY macro. The comment can be removed as the union is now clear enough. Also, avoid the open-coded arithmetic in the memory allocator functions [2] using the "struct_size" macro. Moreover, refactor the "__sync_seqno" and "__sync_child" functions due to now it is possible to use the union members added to the structure. This way, it is also possible to avoid the open-coded arithmetic in pointers. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#zero-length-and-one-element-arrays [1] Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [2] Signed-off-by: Erick Archer <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-24drm/i915/gt: Reflect the true and current status of rc6_enableJuan Escamilla1-2/+2
The sysfs file is named 'enabled', thus users might want to know the true state of the RC6 instead of only the indication if the RC6 should be enabled. Let's use rc6.enable directly instead of rc6.supported. Signed-off-by: Juan Escamilla <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-23drm/i915/mtl: Wake GT before sending H2G messageVinay Belgaumkar1-1/+4
Instead of waiting until the interrupt reaches GuC, we can grab a forcewake while triggering the H2G interrupt. GEN11_GUC_HOST_INTERRUPT is inside sgunit and is not affected by forcewakes. However, there could be some delays when platform is entering/exiting some higher level platform sleep states and a H2G is triggered. A forcewake ensures those sleep states have been fully exited and further processing occurs as expected. The hysteresis timers for C6 and higher sleep states will ensure there is no unwanted race between the wake and processing of the interrupts by GuC. This will have an official WA soon so adding a FIXME in the comments. v2: Make the new ranges watertight to address BAT failures and update commit message (Matt R). Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Vinay Belgaumkar <[email protected]> Signed-off-by: John Harrison <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-18drm/i915/xelpg: Extend some workarounds/tuning to gfx version 12.74Matt Roper3-12/+18
Some of our existing Xe_LPG workarounds and tuning are also applicable to the version 12.74 variant. Extend the condition bounds accordingly. Also fix the comment on Wa_14018575942 while we're at it. v2: Extend some more workarounds (Harish) Signed-off-by: Matt Roper <[email protected]> Signed-off-by: Harish Chegondi <[email protected]> Signed-off-by: Haridhar Kalvala <[email protected]> Reviewed-by: Matt Atwood <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-18drm/i915/xelpg: Extend driver code of Xe_LPG to Xe_LPG+Harish Chegondi4-4/+5
Xe_LPG+ (IP version 12.74) should take the same general code paths as Xe_LPG (versions 12.70 and 12.71). Xe_LPG+'s workaround list will be handled by the next patch. Signed-off-by: Harish Chegondi <[email protected]> Signed-off-by: Haridhar Kalvala <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-10drm/i915/gt: Use rc6.supported flag from intel_gt for rc6_enable sysfsJuan Escamilla1-16/+2
Currently if rc6 is supported, it gets enabled and the sysfs files for rc6_enable_show and rc6_enable_dev_show uses masks to check information from drm_i915_private. However rc6_support functions take more variables and conditions into consideration and thus these masks are not enough for most of the modern hardware and it is simpley lyting to the user. Let's fix it by at least use the rc6.supported flag from intel_gt information. Signed-off-by: Juan Escamilla <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-09drm/i915/guc: Avoid circular locking issue on busyness flushJohn Harrison2-2/+48
Avoid the following lockdep complaint: <4> [298.856498] ====================================================== <4> [298.856500] WARNING: possible circular locking dependency detected <4> [298.856503] 6.7.0-rc5-CI_DRM_14017-g58ac4ffc75b6+ #1 Tainted: G N <4> [298.856505] ------------------------------------------------------ <4> [298.856507] kworker/4:1H/190 is trying to acquire lock: <4> [298.856509] ffff8881103e9978 (&gt->reset.backoff_srcu){++++}-{0:0}, at: _intel_gt_reset_lock+0x35/0x380 [i915] <4> [298.856661] but task is already holding lock: <4> [298.856663] ffffc900013f7e58 ((work_completion)(&(&guc->timestamp.work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x264/0x530 <4> [298.856671] which lock already depends on the new lock. The complaint is not actually valid. The busyness worker thread does indeed hold the worker lock and then attempt to acquire the reset lock (which may have happened in reverse order elsewhere). However, it does so with a trylock that exits if the reset lock is not available (specifically to prevent this and other similar deadlocks). Unfortunately, lockdep does not understand the trylock semantics (the lock is an i915 specific custom implementation for resets). Not doing a synchronous flush of the worker thread when a reset is in progress resolves the lockdep splat by never even attempting to grab the lock in this particular scenario. There are situatons where a synchronous cancel is required, however. So, always do the synchronous cancel if not in reset. And add an extra synchronous cancel to the end of the reset flow to account for when a reset is occurring at driver shutdown and the cancel is no longer synchronous but could lead to unallocated memory accesses if the worker is not stopped. Signed-off-by: Zhanjun Dong <[email protected]> Signed-off-by: John Harrison <[email protected]> Cc: Andi Shyti <[email protected]> Cc: Daniel Vetter <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Acked-by: Daniel Vetter <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-09drm/i915/guc: Close deregister-context race against CT-lossAlan Previn2-5/+78
If we are at the end of suspend or very early in resume its possible an async fence signal (via rcu_call) is triggered to free_engines which could lead us to the execution of the context destruction worker (after a prior worker flush). Thus, when suspending, insert rcu_barriers at the start of i915_gem_suspend (part of driver's suspend prepare) and again in i915_gem_suspend_late so that all such cases have completed and context destruction list isn't missing anything. In destroyed_worker_func, close the race against CT-loss by checking that CT is enabled before calling into deregister_destroyed_contexts. Based on testing, guc_lrc_desc_unpin may still race and fail as we traverse the GuC's context-destroy list because the CT could be disabled right before calling GuC's CT send function. We've witnessed this race condition once every ~6000-8000 suspend-resume cycles while ensuring workloads that render something onscreen is continuously started just before we suspend (and the workload is small enough to complete and trigger the queued engine/context free-up either very late in suspend or very early in resume). In such a case, we need to unroll the entire process because guc-lrc-unpin takes a gt wakeref which only gets released in the G2H IRQ reply that never comes through in this corner case. Without the unroll, the taken wakeref is leaked and will cascade into a kernel hang later at the tail end of suspend in this function: intel_wakeref_wait_for_idle(&gt->wakeref) (called by) - intel_gt_pm_wait_for_idle (called by) - wait_for_suspend Thus, do an unroll in guc_lrc_desc_unpin and deregister_destroyed_- contexts if guc_lrc_desc_unpin fails due to CT send falure. When unrolling, keep the context in the GuC's destroy-list so it can get picked up on the next destroy worker invocation (if suspend aborted) or get fully purged as part of a GuC sanitization (end of suspend) or a reset flow. Signed-off-by: Alan Previn <[email protected]> Signed-off-by: Anshuman Gupta <[email protected]> Tested-by: Mousumi Jana <[email protected]> Acked-by: Daniele Ceraolo Spurio <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-09drm/i915/guc: Flush context destruction worker at suspendAlan Previn3-0/+9
When suspending, flush the context-guc-id deregistration worker at the final stages of intel_gt_suspend_late when we finally call gt_sanitize that eventually leads down to __uc_sanitize so that the deregistration worker doesn't fire off later as we reset the GuC microcontroller. Signed-off-by: Alan Previn <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Tested-by: Mousumi Jana <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-05drm/i915/huc: Allow for very slow HuC loadingJohn Harrison2-11/+63
A failure to load the HuC is occasionally observed where the cause is believed to be a low GT frequency leading to very long load times. So a) increase the timeout so that the user still gets a working system even in the case of slow load. And b) report the frequency during the load to see if that is the cause of the slow down. Also update the similar code on the GuC load to not use uncore->gt when there is a local gt available. The two should match, but no need for unnecessary de-referencing. Signed-off-by: John Harrison <[email protected]> Reviewed-by: Daniele Ceraolo Spurio <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-05drm/i915/xelpg: Add workaround 14019877138Tejas Upadhyay1-0/+3
WA 14019877138 needed for Graphics 12.70/71 both V2(Jani): - Use drm/i915 Signed-off-by: Tejas Upadhyay <[email protected]> Reviewed-by: Matt Roper <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-05drm/i915: don't make assumptions about intel_wakeref_t typeJani Nikula1-2/+2
intel_wakeref_t is supposed to be a mostly opaque cookie to its users. It should only be checked for being non-zero and set to zero. Debug logging its actual value is meaningless. Switch to just debug logging whether the async_put_wakeref is non-zero. The issue dates back to much earlier than commit b49e894c3fd8 ("drm/i915: Replace custom intel runtime_pm tracker with ref_tracker library"), but this is the one that brought about a build failure due to the printf format. Reported-by: Stephen Rothwell <[email protected]> Closes: https://lore.kernel.org/r/[email protected] Fixes: b49e894c3fd8 ("drm/i915: Replace custom intel runtime_pm tracker with ref_tracker library") Cc: Andrzej Hajda <[email protected]> Cc: Imre Deak <[email protected]> Signed-off-by: Jani Nikula <[email protected]> Reviewed-by: Imre Deak <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-01-02drm/i915/guc: Change wa and EU_PERF_CNTL registers to MCR typeShuicheng Lin1-8/+13
Some of the wa registers are MCR register, and EU_PERF_CNTL registers are MCR register. MCR register needs extra process for read/write. As normal MMIO register also could work with the MCR register process, change all wa registers to MCR type for code simplicity. Signed-off-by: Shuicheng Lin <[email protected]> Cc: Matt Roper <[email protected]> Cc: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-29drm/i915/perf: reconcile Excess struct member kernel-doc warningsRandy Dunlap1-3/+6
Document nested struct members with full names as described in Documentation/doc-guide/kernel-doc.rst. i915_perf_types.h:341: warning: Excess struct member 'ptr_lock' description in 'i915_perf_stream' i915_perf_types.h:341: warning: Excess struct member 'head' description in 'i915_perf_stream' i915_perf_types.h:341: warning: Excess struct member 'tail' description in 'i915_perf_stream' 3 warnings as Errors Signed-off-by: Randy Dunlap <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: [email protected] Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Reviewed-by: Rodrigo Vivi <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-29drm/i915/guc: reconcile Excess struct member kernel-doc warningsRandy Dunlap1-33/+42
Document nested struct members with full names as described in Documentation/doc-guide/kernel-doc.rst. intel_guc.h:305: warning: Excess struct member 'lock' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'num_guc_ids' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids_bitmap' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_id_list' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'guc_ids_in_use' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'destroyed_contexts' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'destroyed_worker' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'reset_fail_worker' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'reset_fail_mask' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'sched_disable_delay_ms' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'sched_disable_gucid_threshold' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'lock' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'gt_stamp' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'ping_delay' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'work' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'shift' description in 'intel_guc' intel_guc.h:305: warning: Excess struct member 'last_stat_jiffies' description in 'intel_guc' 18 warnings as Errors Signed-off-by: Randy Dunlap <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: [email protected] Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Reviewed-by: Rodrigo Vivi <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-29drm/i915/gt: reconcile Excess struct member kernel-doc warningsRandy Dunlap1-2/+5
Document nested struct members with full names as described in Documentation/doc-guide/kernel-doc.rst. intel_gsc.h:34: warning: Excess struct member 'gem_obj' description in 'intel_gsc' Also add missing field member descriptions. Signed-off-by: Randy Dunlap <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: [email protected] Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Reviewed-by: Rodrigo Vivi <[email protected]> Cc: Andi Shyti <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-29drm/i915/gem: reconcile Excess struct member kernel-doc warningsRandy Dunlap1-2/+2
Document nested struct members with full names as described in Documentation/doc-guide/kernel-doc.rst. i915_gem_context_types.h:420: warning: Excess struct member 'lock' description in 'i915_gem_context' Signed-off-by: Randy Dunlap <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Rodrigo Vivi <[email protected]> Cc: Tvrtko Ursulin <[email protected]> Cc: [email protected] Cc: Jonathan Corbet <[email protected]> Cc: [email protected] Reviewed-by: Rodrigo Vivi <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-22drm/i915/perf: Update handling of MMIO triggered reportsUmesh Nerlige Ramappa1-5/+34
On XEHP platforms user is not able to find MMIO triggered reports in the OA buffer since i915 squashes the context ID fields. These context ID fields hold the MMIO trigger markers. Update logic to not squash the context ID fields of MMIO triggered reports. Fixes: 7eeaedf79989 ("drm/i915/perf: Determine context valid in OA reports") Signed-off-by: Umesh Nerlige Ramappa <[email protected]> Reviewed-by: Ashutosh Dixit <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-19drm/i915/gem: Atomically invalidate userptr on mmu-notifierJonathan Cavitt5-77/+0
Never block for outstanding work on userptr object upon receipt of a mmu-notifier. The reason we originally did so was to immediately unbind the userptr and unpin its pages, but since that has been dropped in commit b4b9731b02c3c ("drm/i915: Simplify userptr locking"), we never return the pages to the system i.e. never drop our page->mapcount and so do not allow the page and CPU PTE to be revoked. Based on this history, we know we are safe to drop the wait entirely. Upon return from mmu-notifier, we will still have the userptr pages pinned preventing the following PTE operation (such as try_to_unmap) adjusting the vm_area_struct, so it is safe to keep the pages around for as long as we still have i/o pending. We do not have any means currently to asynchronously revalidate the userptr pages, that is always prior to next use. Signed-off-by: Chris Wilson <[email protected]> Signed-off-by: Jonathan Cavitt <[email protected]> Reviewed-by: Andi Shyti <[email protected]> Signed-off-by: Andi Shyti <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/i915_gem_execbuffer.cZhao Liu1-5/+5
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the calls from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults and preemption disables. In i915_gem_execbuffer.c, eb->reloc_cache.vaddr is mapped by kmap_atomic() in eb_relocate_entry(), and is unmapped by kunmap_atomic() in reloc_cache_reset(). And this mapping/unmapping occurs in two places: one is in eb_relocate_vma(), and another is in eb_relocate_vma_slow(). The function eb_relocate_vma() or eb_relocate_vma_slow() doesn't need to disable pagefaults and preemption during the above mapping/ unmapping. So it can simply use kmap_local_page() / kunmap_local() that can instead do the mapping / unmapping regardless of the context. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in i915_cmd_parser.cZhao Liu1-2/+2
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults and preemption disables. There're 2 reasons why function copy_batch() doesn't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. In i915_cmd_parser.c, copy_batch() calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, copy_batch() is a function where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use memcpy_from_page() in gt/uc/intel_uc_fw.cZhao Liu1-4/+1
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults or preemption disables. In drm/i915/gt/uc/intel_us_fw.c, the function intel_uc_fw_copy_rsa() just use the mapping to do memory copy so it doesn't need to disable pagefaults and preemption for mapping. Thus the local mapping without atomic context (not disable pagefaults / preemption) is enough. Therefore, intel_uc_fw_copy_rsa() is a function where the use of memcpy_from_page() with kmap_local_page() in place of memcpy() with kmap_atomic() is correctly suited. Convert the calls of memcpy() with kmap_atomic() / kunmap_atomic() to memcpy_from_page() which uses local mapping to copy. [1]: https://lore.kernel.org/all/[email protected]/T/#u Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_context.cZhao Liu1-4/+4
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption. With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults or preemption disables. In drm/i915/gem/selftests/i915_gem_context.c, functions cpu_fill() and cpu_check() mainly uses mapping to flush cache and check/assign the value. There're 2 reasons why cpu_fill() and cpu_check() don't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. cpu_fill() and cpu_check() call drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, cpu_fill() and cpu_check() are functions where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/selftests/i915_gem_coherency.cZhao Liu1-8/+4
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration).. With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults or preemption disables. In drm/i915/gem/selftests/i915_gem_coherency.c, functions cpu_set() and cpu_get() mainly uses mapping to flush cache and assign the value. There're 2 reasons why cpu_set() and cpu_get() don't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. cpu_set() and cpu_get() call drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, cpu_set() and cpu_get() are functions where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/selftests/huge_pages.cZhao Liu1-3/+3
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults or preemption disables. In drm/i915/gem/selftests/huge_pages.c, function __cpu_check_shmem() mainly uses mapping to flush cache and check the value. There're 2 reasons why __cpu_check_shmem() doesn't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. Function __cpu_check_shmem() calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, __cpu_check_shmem() is a function where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/i915_gem_shmem.cZhao Liu1-2/+4
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1]. The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults or preemption disables. In drm/i915/gem/i915_gem_shmem.c, the function shmem_pwrite() need to disable pagefault to eliminate the potential recursion fault[2]. But here __copy_from_user_inatomic() doesn't need to disable preemption and local mapping is valid for sched in/out. So it can use kmap_local_page() / kunmap_local() with pagefault_disable() / pagefault_enable() to replace atomic mapping. [1]: https://lore.kernel.org/all/[email protected] [2]: https://patchwork.freedesktop.org/patch/295840/ Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use memcpy_[from/to]_page() in gem/i915_gem_pyhs.cZhao Liu1-8/+2
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() + memcpy() to memcpy_[from/to]_page(), which use kmap_local_page() to build local mapping and then do memcpy(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults and preemption disables. In drm/i915/gem/i915_gem_phys.c, the functions i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys() don't need to disable pagefaults and preemption for mapping because of 2 reasons: 1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c, i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys() calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, i915_gem_object_get_pages_phys() and i915_gem_object_put_pages_phys() are two functions where the uses of local mappings in place of atomic mappings are correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() + memcpy() to memcpy_from_page() and memcpy_to_page(). [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-15drm/i915: Use kmap_local_page() in gem/i915_gem_object.cZhao Liu1-5/+3
The use of kmap_atomic() is being deprecated in favor of kmap_local_page()[1], and this patch converts the call from kmap_atomic() to kmap_local_page(). The main difference between atomic and local mappings is that local mappings doesn't disable page faults or preemption (the preemption is disabled for !PREEMPT_RT case, otherwise it only disables migration). With kmap_local_page(), we can avoid the often unwanted side effect of unnecessary page faults and preemption disables. There're 2 reasons why i915_gem_object_read_from_page_kmap() doesn't need to disable pagefaults and preemption for mapping: 1. The flush operation is safe. In drm/i915/gem/i915_gem_object.c, i915_gem_object_read_from_page_kmap() calls drm_clflush_virt_range() to use CLFLUSHOPT or WBINVD to flush. Since CLFLUSHOPT is global on x86 and WBINVD is called on each cpu in drm_clflush_virt_range(), the flush operation is global. 2. Any context switch caused by preemption or page faults (page fault may cause sleep) doesn't affect the validity of local mapping. Therefore, i915_gem_object_read_from_page_kmap() is a function where the use of kmap_local_page() in place of kmap_atomic() is correctly suited. Convert the calls of kmap_atomic() / kunmap_atomic() to kmap_local_page() / kunmap_local(). And remove the redundant variable that stores the address of the mapped page since kunmap_local() can accept any pointer within the page. [1]: https://lore.kernel.org/all/[email protected] Suggested-by: Dave Hansen <[email protected]> Suggested-by: Ira Weiny <[email protected]> Suggested-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Zhao Liu <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Fabio M. De Francesco <[email protected]> Signed-off-by: Tvrtko Ursulin <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2023-12-14drm/i915: Add Wa_14019877138Haridhar Kalvala2-0/+6
Enable Force Dispatch Ends Collection for DG2. BSpec: 46001 Signed-off-by: Haridhar Kalvala <[email protected]> Reviewed-by: Matt Roper <[email protected]> Signed-off-by: Matt Roper <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]