aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-09-02perf lock contention: Fix spinlock and rwlock accountingNamhyung Kim1-0/+3
The spinlock and rwlock use a single-element per-cpu array to track current locks due to performance reason. But this means the key is always available and it cannot simply account lock stats in the array because some of them are invalid. In fact, the contention_end() program in the BPF invalidates the entry by setting the 'lock' value to 0 instead of deleting the entry for the hashmap. So it should skip entries with the lock value of 0 in the account_end_timestamp(). Otherwise, it'd have spurious high contention on an idle machine: $ sudo perf lock con -ab -Y spinlock sleep 3 contended total wait max wait avg wait type caller 8 4.72 s 1.84 s 590.46 ms spinlock rcu_core+0xc7 8 1.87 s 1.87 s 233.48 ms spinlock process_one_work+0x1b5 2 1.87 s 1.87 s 933.92 ms spinlock worker_thread+0x1a2 3 1.81 s 1.81 s 603.93 ms spinlock tmigr_update_events+0x13c 2 1.72 s 1.72 s 861.98 ms spinlock tick_do_update_jiffies64+0x25 6 42.48 us 13.02 us 7.08 us spinlock futex_q_lock+0x2a 1 13.03 us 13.03 us 13.03 us spinlock futex_wake+0xce 1 11.61 us 11.61 us 11.61 us spinlock rcu_core+0xc7 I don't believe it has contention on a spinlock longer than 1 second. After this change, it only reports some small contentions. $ sudo perf lock con -ab -Y spinlock sleep 3 contended total wait max wait avg wait type caller 4 133.51 us 43.29 us 33.38 us spinlock tick_do_update_jiffies64+0x25 4 69.06 us 31.82 us 17.27 us spinlock process_one_work+0x1b5 2 50.66 us 25.77 us 25.33 us spinlock rcu_core+0xc7 1 28.45 us 28.45 us 28.45 us spinlock rcu_core+0xc7 1 24.77 us 24.77 us 24.77 us spinlock tmigr_update_events+0x13c 1 23.34 us 23.34 us 23.34 us spinlock raw_spin_rq_lock_nested+0x15 Fixes: b5711042a1c8 ("perf lock contention: Use per-cpu array map for spinlocks") Reported-by: Xi Wang <[email protected]> Cc: Song Liu <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-09-02perf test pmu: Set uninitialized PMU alias to nullVeronika Molnarova1-1/+3
Commit 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard support") adds a test case "PMU cmdline match" that covers PMU name wildcard support provided by function perf_pmu__match(). The test works with a wide range of supported combinations of PMU name matching but omits the case that if the perf_pmu__match() cannot match the PMU name to the wildcard, it tries to match its alias. However, this variable is not set up, causing the test case to fail when run with subprocesses or to segfault if run as a single process. ./perf test -vv 9 9: Sysfs PMU tests : 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok 9.6: PMU cmdline match : FAILED! ./perf test -F 9 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok Segmentation fault (core dumped) Initialize the PMU alias to null for all tests of perf_pmu__match() as this functionality is not being tested and the alias matching works exactly the same as the matching of the PMU name. ./perf test -F 9 9.1: Parsing with PMU format directory : Ok 9.2: Parsing with PMU event : Ok 9.3: PMU event names : Ok 9.4: PMU name combining : Ok 9.5: PMU name comparison : Ok 9.6: PMU cmdline match : Ok Fixes: 3e0bf9fde2984469 ("perf pmu: Restore full PMU name wildcard support") Signed-off-by: Veronika Molnarova <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]>
2024-09-02drm/xe/pf: Reset thresholds when releasing a VF configMichal Wajdeczko1-0/+13
As part of the VF config release, we should reset all parameters, including thresholds, to always start with the clean VF config. Signed-off-by: Michal Wajdeczko <[email protected]> Reviewed-by: Piotr Piórkowski <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-09-02drm/xe/pf: Add thresholds to the VF KLV configMichal Wajdeczko1-0/+8
We are pushing threshold KLV to the GuC immediately during the threshold provisioning, but those configs will be lost during a GT reset. Include threshold KLVs while encoding full VF config buffer to make sure the GuC receives all of the config KLVs. Signed-off-by: Michal Wajdeczko <[email protected]> Reviewed-by: Piotr Piórkowski <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
2024-09-02btrfs: qgroup: don't use extent changeset when not neededFedor Pchelkin1-2/+1
The local extent changeset is passed to clear_record_extent_bits() where it may have some additional memory dynamically allocated for ulist. When qgroup is disabled, the memory is leaked because in this case the changeset is not released upon __btrfs_qgroup_release_data() return. Since the recorded contents of the changeset are not used thereafter, just don't pass it. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Reported-by: [email protected] Closes: https://lore.kernel.org/lkml/[email protected] Fixes: af0e2aab3b70 ("btrfs: qgroup: flush reservations during quota disable") CC: [email protected] # 6.10+ Reviewed-by: Boris Burkov <[email protected]> Reviewed-by: Qu Wenruo <[email protected]> Signed-off-by: Fedor Pchelkin <[email protected]> Signed-off-by: David Sterba <[email protected]>
2024-09-02spi: zynq-qspi: Replace kzalloc with kmalloc for buffer allocationKuan-Wei Chiu1-1/+1
In zynq_qspi_exec_mem_op(), the temporary buffer is allocated with kzalloc and then immediately initialized using memset to 0xff. To optimize this, replace kzalloc with kmalloc, as the zeroing operation is redundant and unnecessary. Signed-off-by: Kuan-Wei Chiu <[email protected]> Reviewed-by: Michal Simek <[email protected]> Link: https://patch.msgid.link/[email protected] Signed-off-by: Mark Brown <[email protected]>
2024-09-02drm/amd/display: Block timing sync for different signals in PMODillon Varone1-1/+2
PMO assumes that like timings can be synchronized, but DC only allows this if the signal types match. Reviewed-by: Austin Zheng <[email protected]> Signed-off-by: Dillon Varone <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 29d3d6af43135de7bec677f334292ca8dab53d67) Cc: [email protected]
2024-09-02drm/amd/display: Lock DC and exit IPS when changing backlightLeo Li1-1/+12
Backlight updates require aux and/or register access. Therefore, driver needs to disallow IPS beforehand. So, acquire the dc lock before calling into dc to update backlight - we should be doing this regardless of IPS. Then, while the lock is held, disallow IPS before calling into dc, then allow IPS afterwards (if it was previously allowed). Reviewed-by: Aurabindo Pillai <[email protected]> Reviewed-by: Roman Li <[email protected]> Signed-off-by: Leo Li <[email protected]> Signed-off-by: Hamza Mahfooz <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 988fe2862635c1b1b40e41c85c24db44ab337c13) Cc: [email protected] # 6.10+
2024-09-02drm/amdgpu: always allocate cleared VRAM for GEM allocationsAlex Deucher1-0/+3
This adds allocation latency, but aligns better with user expectations. The latency should improve with the drm buddy clearing patches that Arun has been working on. In addition this fixes the high CPU spikes seen when doing wipe on release. v2: always set AMDGPU_GEM_CREATE_VRAM_CLEARED (Christian) Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3528 Fixes: a68c7eaa7a8f ("drm/amdgpu: Enable clear page functionality") Acked-by: Arunpravin Paneer Selvam <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> (v1) Signed-off-by: Alex Deucher <[email protected]> Cc: Arunpravin Paneer Selvam <[email protected]> Cc: Christian König <[email protected]> (cherry picked from commit 6c0a7c3c693ac84f8b50269a9088af8f37446863) Cc: [email protected] # 6.10.x
2024-09-02drm/amdgpu/mes: add mes mapping legacy queue switchJack Xiao4-20/+43
For mes11 old firmware has issue to map legacy queue, add a flag to switch mes to map legacy queue. Fixes: f9d8c5c7855d ("drm/amdgpu/gfx: enable mes to map legacy queue support") Reported-by: Andrew Worsley <[email protected]> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-August/112773.html Signed-off-by: Jack Xiao <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 52491d97aadcde543986d596ed55f70bf2142851)
2024-09-02drm/amd/display: Determine IPS mode by ASIC and PMFW versionsLeo Li1-1/+25
[Why] DCN IPS interoperates with other system idle power features, such as Zstates. On DCN35, there is a known issue where system Z8 + DCN IPS2 causes a hard hang. We observe this on systems where the SBIOS allows Z8. Though there is a SBIOS fix, there's no guarantee that users will get it any time soon, or even install it. A workaround is needed to prevent this from rearing its head in the wild. [How] For DCN35, check the pmfw version to determine whether the SBIOS has the fix. If not, set IPS1+RCG as the deepest possible state in all cases except for s0ix and display off (DPMS). Otherwise, enable all IPS Signed-off-by: Leo Li <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]> (cherry picked from commit 28d43d0895896f84c038d906d244e0a95eb243ec) Cc: [email protected]
2024-09-02tcp_bpf: Remove an unused parameter for bpf_tcp_ingress()Yaxin Chen1-2/+2
Parameter flags is not used in bpf_tcp_ingress(). Signed-off-by: Yaxin Chen <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Cong Wang <[email protected]> Acked-by: Jakub Sitnicki <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-09-02bpf, sockmap: Correct spelling skmsg.cSimon Horman1-1/+1
Correct spelling in skmsg.c. As reported by codespell. Signed-off-by: Simon Horman <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Stanislav Fomichev <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-09-02Revert "wifi: ath11k: support hibernation"Baochen Qiang8-139/+49
This reverts commit 166a490f59ac10340ee5330e51c15188ce2a7f8f. There are several reports that this commit breaks system suspend on some specific Lenovo platforms. Since there is no fix available, for now revert this commit to make suspend work again on those platforms. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219196 Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2301921 Cc: <[email protected]> # 6.10.x: d3e154d7776b: Revert "wifi: ath11k: restore country code during resume" Cc: <[email protected]> # 6.10.x Signed-off-by: Baochen Qiang <[email protected]> Acked-by: Jeff Johnson <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://patch.msgid.link/[email protected]
2024-09-02Revert "wifi: ath11k: restore country code during resume"Baochen Qiang1-10/+0
This reverts commit 7f0343b7b8710436c1e6355c71782d32ada47e0c. We are going to revert commit 166a490f59ac ("wifi: ath11k: support hibernation"), on which this commit depends. With that commit reverted, this one is not needed any more, so revert this commit first. Signed-off-by: Baochen Qiang <[email protected]> Acked-by: Jeff Johnson <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://patch.msgid.link/[email protected]
2024-09-02bpftool: Add missing blank lines in bpftool-net doc exampleQuentin Monnet1-0/+2
In bpftool-net documentation, two blank lines are missing in a recently added example, causing docutils to complain: $ cd tools/bpf/bpftool $ make doc DESCEND Documentation GEN bpftool-btf.8 GEN bpftool-cgroup.8 GEN bpftool-feature.8 GEN bpftool-gen.8 GEN bpftool-iter.8 GEN bpftool-link.8 GEN bpftool-map.8 GEN bpftool-net.8 <stdin>:189: (INFO/1) Possible incomplete section title. Treating the overline as ordinary text because it's so short. <stdin>:192: (INFO/1) Blank line missing before literal block (after the "::")? Interpreted as a definition list item. <stdin>:199: (INFO/1) Possible incomplete section title. Treating the overline as ordinary text because it's so short. <stdin>:201: (INFO/1) Blank line missing before literal block (after the "::")? Interpreted as a definition list item. GEN bpftool-perf.8 GEN bpftool-prog.8 GEN bpftool.8 GEN bpftool-struct_ops.8 Add the missing blank lines. Fixes: 0d7c06125cea ("bpftool: Add document for net attach/detach on tcx subcommand") Signed-off-by: Quentin Monnet <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-09-02iommu/vt-d: Introduce batched cache invalidationTina Zhang1-15/+107
Converts IOTLB and Dev-IOTLB invalidation to a batched model. Cache tag invalidation requests for a domain are now accumulated in a qi_batch structure before being flushed in bulk. It replaces the previous per- request qi_flush approach with a more efficient batching mechanism. Co-developed-by: Lu Baolu <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Tina Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Add qi_batch for dmar_domainLu Baolu5-1/+27
Introduces a qi_batch structure to hold batched cache invalidation descriptors on a per-dmar_domain basis. A fixed-size descriptor array is used for simplicity. The qi_batch is allocated when the first cache tag is added to the domain and freed during iommu_free_domain(). Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Tina Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Refactor IOTLB and Dev-IOTLB flush for batchingTina Zhang3-67/+83
Extracts IOTLB and Dev-IOTLB invalidation logic from cache tag flush interfaces into dedicated helper functions. It prepares the codebase for upcoming changes to support batched cache invalidations. To enable direct use of qi_flush helpers in the new functions, iommu->flush.flush_iotlb and quirk_extra_dev_tlb_flush() are opened up. No functional changes are intended. Co-developed-by: Lu Baolu <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Tina Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Factor out invalidation descriptor compositionTina Zhang2-87/+115
Separate the logic for constructing IOTLB and device TLB invalidation descriptors from the qi_flush interfaces. New helpers, qi_desc(), are introduced to encapsulate this common functionality. Moving descriptor composition code to new helpers enables its reuse in the upcoming qi_batch interfaces. No functional changes are intended. Signed-off-by: Tina Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Unconditionally flush device TLB for pasid table updatesLu Baolu1-9/+3
The caching mode of an IOMMU is irrelevant to the behavior of the device TLB. Previously, commit <304b3bde24b5> ("iommu/vt-d: Remove caching mode check before device TLB flush") removed this redundant check in the domain unmap path. Checking the caching mode before flushing the device TLB after a pasid table entry is updated is unnecessary and can lead to inconsistent behavior. Extends this consistency by removing the caching mode check in the pasid table update path. Suggested-by: Yi Liu <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Move PCI PASID enablement to probe pathLu Baolu1-14/+15
Currently, PCI PASID is enabled alongside PCI ATS when an iommu domain is attached to the device and disabled when the device transitions to block translation mode. This approach is inappropriate as PCI PASID is a device feature independent of the type of the attached domain. Enable PCI PASID during the IOMMU device probe and disables it during the release path. Suggested-by: Yi Liu <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Yi Liu <[email protected]> Tested-by: Yi Liu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Fix potential lockup if qi_submit_sync called with 0 countSanjay K Kumar1-5/+11
If qi_submit_sync() is invoked with 0 invalidation descriptors (for instance, for DMA draining purposes), we can run into a bug where a submitting thread fails to detect the completion of invalidation_wait. Subsequently, this led to a soft lockup. Currently, there is no impact by this bug on the existing users because no callers are submitting invalidations with 0 descriptors. This fix will enable future users (such as DMA drain) calling qi_submit_sync() with 0 count. Suppose thread T1 invokes qi_submit_sync() with non-zero descriptors, while concurrently, thread T2 calls qi_submit_sync() with zero descriptors. Both threads then enter a while loop, waiting for their respective descriptors to complete. T1 detects its completion (i.e., T1's invalidation_wait status changes to QI_DONE by HW) and proceeds to call reclaim_free_desc() to reclaim all descriptors, potentially including adjacent ones of other threads that are also marked as QI_DONE. During this time, while T2 is waiting to acquire the qi->q_lock, the IOMMU hardware may complete the invalidation for T2, setting its status to QI_DONE. However, if T1's execution of reclaim_free_desc() frees T2's invalidation_wait descriptor and changes its status to QI_FREE, T2 will not observe the QI_DONE status for its invalidation_wait and will indefinitely remain stuck. This soft lockup does not occur when only non-zero descriptors are submitted.In such cases, invalidation descriptors are interspersed among wait descriptors with the status QI_IN_USE, acting as barriers. These barriers prevent the reclaim code from mistakenly freeing descriptors belonging to other submitters. Considered the following example timeline: T1 T2 ======================================== ID1 WD1 while(WD1!=QI_DONE) unlock lock WD1=QI_DONE* WD2 while(WD2!=QI_DONE) unlock lock WD1==QI_DONE? ID1=QI_DONE WD2=DONE* reclaim() ID1=FREE WD1=FREE WD2=FREE unlock soft lockup! T2 never sees QI_DONE in WD2 Where: ID = invalidation descriptor WD = wait descriptor * Written by hardware The root of the problem is that the descriptor status QI_DONE flag is used for two conflicting purposes: 1. signal a descriptor is ready for reclaim (to be freed) 2. signal by the hardware that a wait descriptor is complete The solution (in this patch) is state separation by using QI_FREE flag for #1. Once a thread's invalidation descriptors are complete, their status would be set to QI_FREE. The reclaim_free_desc() function would then only free descriptors marked as QI_FREE instead of those marked as QI_DONE. This change ensures that T2 (from the previous example) will correctly observe the completion of its invalidation_wait (marked as QI_DONE). Signed-off-by: Sanjay K Kumar <[email protected]> Signed-off-by: Jacob Pan <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Cleanup si_domainLu Baolu1-72/+19
The static identity domain has been introduced, rendering the si_domain obsolete. Remove si_domain and cleanup the code accordingly. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Add support for static identity domainLu Baolu2-5/+111
Software determines VT-d hardware support for passthrough translation by inspecting the capability register. If passthrough translation is not supported, the device is instructed to use DMA domain for its default domain. Add a global static identity domain with guaranteed attach semantics for IOMMUs that support passthrough translation mode. The global static identity domain is a dummy domain without corresponding dmar_domain structure. Consequently, the device's info->domain will be NULL with the identity domain is attached. Refactor the code accordingly. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Factor out helpers from domain_context_mapping_one()Lu Baolu1-41/+58
Extract common code from domain_context_mapping_one() into new helpers, making it reusable by other functions such as the upcoming identity domain implementation. No intentional functional changes. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Remove has_iotlb_device flagLu Baolu3-37/+1
The has_iotlb_device flag was used to indicate if a domain had attached devices with ATS enabled. Domains without this flag didn't require device TLB invalidation during unmap operations, optimizing performance by avoiding unnecessary device iteration. With the introduction of cache tags, this flag is no longer needed. The code to iterate over attached devices was removed by commit 06792d067989 ("iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi()"). Remove has_iotlb_device to avoid unnecessary code. Suggested-by: Jason Gunthorpe <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Always reserve a domain ID for identity setupLu Baolu1-3/+3
We will use a global static identity domain. Reserve a static domain ID for it. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Remove identity mappings from si_domainLu Baolu1-118/+4
As the driver has enforced DMA domains for devices managed by an IOMMU hardware that doesn't support passthrough translation mode, there is no need for static identity mappings in the si_domain. Remove the identity mapping code to avoid dead code. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02iommu/vt-d: Require DMA domain if hardware not support passthroughLu Baolu1-0/+10
The iommu core defines the def_domain_type callback to query the iommu driver about hardware capability and quirks. The iommu driver should declare IOMMU_DOMAIN_DMA requirement for hardware lacking pass-through capability. Earlier VT-d hardware implementations did not support pass-through translation mode. The iommu driver relied on a paging domain with all physical system memory addresses identically mapped to the same IOVA to simulate pass-through translation before the def_domain_type was introduced and it has been kept until now. It's time to adjust it now to make the Intel iommu driver follow the def_domain_type semantics. Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2024-09-02drm/amdgpu/gfx10: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: use rlc safe mode for soft recoveryAlex Deucher1-0/+2
Protect the MMIO access with safe mode. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: use proper rlc safe mode helpersAlex Deucher1-2/+2
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: use proper rlc safe mode helpersAlex Deucher1-4/+4
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: use proper rlc safe mode helpersAlex Deucher1-2/+2
Rather than open coding it for the queue reset. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: per queue reset only on bare metalAlex Deucher1-0/+6
It's not supported under SR-IOV at the moment. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes11: implement mmio queue reset for gfx11Jiadong Zhu1-0/+80
Implement queue reset for graphic and compute queue. v2: use amdgpu_gfx_rlc funcs to enter/exit safe mode. v3: use gfx_v11_0_request_gfx_index_mutex() v4: fix mutex handling Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes: implement amdgpu_mes_reset_hw_queue_mmioJiadong Zhu1-0/+20
The reset_queue api could be used from kfd or kgd. v2: add use_mmio parameter for mes_reset_legacy_queue. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/mes: modify mes api for mmio queue resetJiadong Zhu4-4/+17
Add me/pipe/queue parameters for queue reset input. v2: fix build (Alex) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: fallback to driver reset compute queue directlyAlex Deucher1-14/+79
Since the MES FW resets kernel compute queue always failed, this may caused by the KIQ failed to process unmap KCQ. So, before MES FW work properly that will fallback to driver executes dequeue and resets SPI directly. Besides, rework the ring reset function and make the busy ring type reset in each function respectively. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx12: add ring reset callbacksAlex Deucher1-0/+18
Add ring reset callbacks for gfx and compute. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: rework reset sequenceAlex Deucher1-7/+19
To match other GFX IPs. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: wait for reset done before remapJiadong Zhu1-11/+30
There is a racing condition that cp firmware modifies MQD in reset sequence after driver updates it for remapping. We have to wait till CP_HQD_ACTIVE becoming false then remap the queue. v2: fix KIQ locking (Alex) v3: fix KIQ locking harder (Jessie) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: remap queue after reset successfullyJiadong Zhu1-11/+35
Kiq command unmap_queues only does the dequeueing action. We have to map the queue back with clean mqd. v2: fix up error handling (Alex) Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx10: add ring reset callbacksAlex Deucher1-0/+91
Add ring reset callbacks for gfx and compute. v2: fix gfx handling v3: wait for KIQ to complete Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: wait for reset done before remapJiadong Zhu1-1/+14
There is a racing condition that cp firmware modifies MQD in reset sequence after driver updates it for remapping. We have to wait till CP_HQD_ACTIVE becoming false then remap the queue. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Jiadong Zhu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2024-09-02drm/amdgpu/gfx11: rename gfx_v11_0_gfx_init_queue()Alex Deucher1-3/+3
Rename to gfx_v11_0_kgq_init_queue() to better align with the other naming in the file. Acked-by: Vitaly Prosyak <[email protected]> Signed-off-by: Alex Deucher <[email protected]>