aboutsummaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)AuthorFilesLines
2023-11-27iommu: Allow passing custom allocators to pgtable driversBoris Brezillon1-0/+23
This will be useful for GPU drivers who want to keep page tables in a pool so they can: - keep freed page tables in a free pool and speed-up upcoming page table allocations - batch page table allocation instead of allocating one page at a time - pre-reserve pages for page tables needed for map/unmap operations, to ensure map/unmap operations don't try to allocate memory in paths they're allowed to block or fail It might also be valuable for other aspects of GPU and similar use-cases, like fine-grained memory accounting and resource limiting. We will extend the Arm LPAE format to support custom allocators in a separate commit. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Steven Price <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Set variable intel_dirty_ops to staticKunwu Chan1-2/+2
Fix the following warning: drivers/iommu/intel/iommu.c:302:30: warning: symbol 'intel_dirty_ops' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Fixes: f35f22cc760e ("iommu/vt-d: Access/Dirty bit support for SS domains") Signed-off-by: Kunwu Chan <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Joao Martins <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Fix incorrect cache invalidation for mm notificationLu Baolu1-0/+26
Commit 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") moved the secondary TLB invalidations into the TLB invalidation functions to ensure that all secondary TLB invalidations happen at the same time as the CPU invalidation and added a flush-all type of secondary TLB invalidation for the batched mode, where a range of [0, -1UL) is used to indicates that the range extends to the end of the address space. However, using an end address of -1UL caused an overflow in the Intel IOMMU driver, where the end address was rounded up to the next page. As a result, both the IOTLB and device ATC were not invalidated correctly. Add a flush all helper function and call it when the invalidation range is from 0 to -1UL, ensuring that the entire caches are invalidated correctly. Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs") Cc: [email protected] Cc: Huang Ying <[email protected]> Cc: Alistair Popple <[email protected]> Tested-by: Luo Yuzhang <[email protected]> # QAT Tested-by: Tony Zhu <[email protected]> # DSA Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Add MTL to quirk list to skip TE disablingAbdul Halim, Mohd Syazwan1-1/+1
The VT-d spec requires (10.4.4 Global Command Register, TE field) that: Hardware implementations supporting DMA draining must drain any in-flight DMA read/write requests queued within the Root-Complex before switching address translation on or off and reflecting the status of the command through the TES field in the Global Status register. Unfortunately, some integrated graphic devices fail to do so after some kind of power state transition. As the result, the system might stuck in iommu_disable_translation(), waiting for the completion of TE transition. Add MTL to the quirk list for those devices and skips TE disabling if the qurik hits. Fixes: b1012ca8dc4f ("iommu/vt-d: Skip TE disabling on quirky gfx dedicated iommu") Cc: [email protected] Signed-off-by: Abdul Halim, Mohd Syazwan <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Make context clearing consistent with context mappingLu Baolu1-2/+2
In the iommu probe_device path, domain_context_mapping() allows setting up the context entry for a non-PCI device. However, in the iommu release_device path, domain_context_clear() only clears context entries for PCI devices. Make domain_context_clear() behave consistently with domain_context_mapping() by clearing context entries for both PCI and non-PCI devices. Fixes: 579305f75d34 ("iommu/vt-d: Update to use PCI DMA aliases") Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Disable PCI ATS in legacy passthrough modeLu Baolu1-1/+2
When IOMMU hardware operates in legacy mode, the TT field of the context entry determines the translation type, with three supported types (Section 9.3 Context Entry): - DMA translation without device TLB support - DMA translation with device TLB support - Passthrough mode with translated and translation requests blocked Device TLB support is absent when hardware is configured in passthrough mode. Disable the PCI ATS feature when IOMMU is configured for passthrough translation type in legacy (non-scalable) mode. Fixes: 0faa19a1515f ("iommu/vt-d: Decouple PASID & PRI enabling from SVA") Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Omit devTLB invalidation requests when TES=0Lu Baolu1-0/+18
The latest VT-d spec indicates that when remapping hardware is disabled (TES=0 in Global Status Register), upstream ATS Invalidation Completion requests are treated as UR (Unsupported Request). Consequently, the spec recommends in section 4.3 Handling of Device-TLB Invalidations that software refrain from submitting any Device-TLB invalidation requests when address remapping hardware is disabled. Verify address remapping hardware is enabled prior to submitting Device- TLB invalidation requests. Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default") Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/vt-d: Support enforce_cache_coherency only for empty domainsLu Baolu2-1/+7
The enforce_cache_coherency callback ensures DMA cache coherency for devices attached to the domain. Intel IOMMU supports enforced DMA cache coherency when the Snoop Control bit in the IOMMU's extended capability register is set. Supporting it differs between legacy and scalable modes. In legacy mode, it's supported page-level by setting the SNP field in second-stage page-table entries. In scalable mode, it's supported in PASID-table granularity by setting the PGSNP field in PASID-table entries. In legacy mode, mappings before attaching to a device have SNP fields cleared, while mappings after the callback have them set. This means partial DMAs are cache coherent while others are not. One possible fix is replaying mappings and flipping SNP bits when attaching a domain to a device. But this seems to be over-engineered, given that all real use cases just attach an empty domain to a device. To meet practical needs while reducing mode differences, only support enforce_cache_coherency on a domain without mappings if SNP field is used. Fixes: fc0051cb9590 ("iommu/vt-d: Check domain force_snooping against attached devices") Signed-off-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Clean up open-coded ownership checksRobin Murphy7-43/+6
Some drivers already implement their own defence against the possibility of being given someone else's device. Since this is now taken care of by the core code (and via a slightly different path from the original fwspec-based idea), let's clean them up. Acked-by: Will Deacon <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/58a9879ce3f03562bb061e6714fe6efb554c3907.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Retire bus opsRobin Murphy1-13/+18
With the rest of the API internals converted, it's time to finally tackle probe_device and how we bootstrap the per-device ops association to begin with. This ends up being disappointingly straightforward, since fwspec users are already doing it in order to find their of_xlate callback, and it works out that we can easily do the equivalent for other drivers too. Then shuffle the remaining awareness of iommu_ops into the couple of core headers that still need it, and breathe a sigh of relief. Ding dong the bus ops are gone! CC: Rafael J. Wysocki <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/a59011ef65b4b6657cb0b7a388d786b779b61305.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/arm-smmu: Don't register fwnode for legacy bindingRobin Murphy1-1/+2
When using the legacy binding we bypass the of_xlate mechanism, so avoid registering the instance fwnodes which act as keys for that. This will help __iommu_probe_device() to retrieve the registered ops the same way as for x86 etc. when no fwspec has previously been set up by of_xlate. Acked-by: Will Deacon <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/18b0f812a42a74dd6924aea24e68ab409d6e1b52.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Decouple iommu_domain_alloc() from bus opsRobin Murphy1-3/+22
As the final remaining piece of bus-dependent API, iommu_domain_alloc() can now take responsibility for the "one iommu_ops per bus" rule for itself. It turns out we can't safely make the internal allocation call any more group-based or device-based yet - that will have to wait until the external callers can pass the right thing - but we can at least get as far as deriving "bus ops" based on which driver is actually managing devices on the given bus, rather than whichever driver won the race to register first. This will then leave us able to convert the last of the core internals over to the IOMMU-instance model, allow multiple drivers to register and actually coexist (modulo the above limitation for unmanaged domain users in the short term), and start trying to solve the long-standing iommu_probe_device() mess. Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Link: https://lore.kernel.org/r/6c7313009aae0e39ae2855920990ebf85af4662f.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Validate that devices match domainsRobin Murphy2-0/+12
Before we can allow drivers to coexist, we need to make sure that one driver's domain ops can't misinterpret another driver's dev_iommu_priv data. To that end, add a token to the domain so we can remember how it was allocated - for now this may as well be the device ops, since they still correlate 1:1 with drivers. We can trust ourselves for internal default domain attachment, so add checks to cover all the public attach interfaces. Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/097c6f30480e4efe12195d00ba0e84ea4837fb4c.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Decouple iommu_present() from bus opsRobin Murphy1-1/+20
Much as I'd like to remove iommu_present(), the final remaining users are proving stubbornly difficult to clean up, so kick that can down the road and just rework it to preserve the current behaviour without depending on bus ops. Since commit 57365a04c921 ("iommu: Move bus setup to IOMMU device registration"), any registered IOMMU instance is already considered "present" for every entry in iommu_buses, so it's simply a case of validating the bus and checking we have at least once IOMMU. Reviewed-by: Jason Gunthorpe<[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/caa93680bb9d35a8facbcd8ff46267ca67335229.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Factor out some helpersRobin Murphy1-30/+26
The pattern for picking the first device out of the group list is repeated a few times now, so it's clearly worth factoring out, which also helps hide the iommu_group_dev detail from places that don't need to know. Similarly, the safety check for dev_iommu_ops() at certain public interfaces starts looking a bit repetitive, and might not be completely obvious at first glance, so let's factor that out for clarity as well, in preparation for more uses of both. Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Signed-off-by: Robin Murphy <[email protected]> Link: https://lore.kernel.org/r/566cbd161546caa6aed49662c9b3e8f09dc9c3cf.1700589539.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/virtio: Add ops->flush_iotlb_all and enable deferred flushNiklas Schnelle1-0/+16
Add ops->flush_iotlb_all operation to enable virtio-iommu for the dma-iommu deferred flush scheme. This results in a significant increase in performance in exchange for a window in which devices can still access previously IOMMU mapped memory when running with CONFIG_IOMMU_DEFAULT_DMA_LAZY. The previous strict behavior can be achieved with iommu.strict=1 on the kernel command line or CONFIG_IOMMU_DEFAULT_DMA_STRICT. Link: https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/ Signed-off-by: Niklas Schnelle <[email protected]> Reviewed-by: Jean-Philippe Brucker <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/virtio: Make use of ops->iotlb_sync_mapNiklas Schnelle1-1/+16
Pull out the sync operation from viommu_map_pages() by implementing ops->iotlb_sync_map. This allows the common IOMMU code to map multiple elements of an sg with a single sync (see iommu_map_sg()). Link: https://lore.kernel.org/lkml/[email protected]/ Signed-off-by: Niklas Schnelle <[email protected]> Reviewed-by: Jean-Philippe Brucker <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu/amd: Set variable amd_dirty_ops to staticKunwu Chan1-2/+2
Fix the followng warning: drivers/iommu/amd/iommu.c:67:30: warning: symbol 'amd_dirty_ops' was not declared. Should it be static? This variable is only used in its defining file, so it should be static. Signed-off-by: Kunwu Chan <[email protected]> Reviewed-by: Joao Martins <[email protected]> Reviewed-by: Vasant Hegde <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Avoid more races around device probeRobin Murphy2-13/+19
It turns out there are more subtle races beyond just the main part of __iommu_probe_device() itself running in parallel - the dev_iommu_free() on the way out of an unsuccessful probe can still manage to trip up concurrent accesses to a device's fwspec. Thus, extend the scope of iommu_probe_device_lock() to also serialise fwspec creation and initial retrieval. Reported-by: Zhenhua Huang <[email protected]> Link: https://lore.kernel.org/linux-iommu/[email protected]/ Fixes: 01657bc14a39 ("iommu: Avoid races around device probe") Signed-off-by: Robin Murphy <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Reviewed-by: André Draszik <[email protected]> Tested-by: André Draszik <[email protected]> Link: https://lore.kernel.org/r/16f433658661d7cadfea51e7c65da95826112a2b.1700071477.git.robin.murphy@arm.com Cc: [email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Flow ERR_PTR out from __iommu_domain_alloc()Jason Gunthorpe1-20/+39
Most of the calling code now has error handling that can carry an error code further up the call chain. Keep the exported interface iommu_domain_alloc() returning NULL and reflow the internal code to use ERR_PTR not NULL for domain allocation failure. Optionally allow drivers to return ERR_PTR from any of the alloc ops. Many of the new ops (user, sva, etc) already return ERR_PTR, so having two rules is confusing and hard on drivers. This fixes a bug in DART that was returning ERR_PTR. Fixes: 482feb5c6492 ("iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()") Reported-by: Dan Carpenter <[email protected]> Link: https://lore.kernel.org/linux-iommu/[email protected]/ Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Jerry Snitselaar <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-27iommu: Map reserved memory as cacheable if device is coherentLaurentiu Tudor1-0/+3
Check if the device is marked as DMA coherent in the DT and if so, map its reserved memory as cacheable in the IOMMU. This fixes the recently added IOMMU reserved memory support which uses IOMMU_RESV_DIRECT without properly building the PROT for the mapping. Fixes: a5bf3cfce8cb ("iommu: Implement of_iommu_get_resv_regions()") Signed-off-by: Laurentiu Tudor <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Acked-by: Thierry Reding <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-11-21x86/apic: Drop apic::delivery_modeAndrew Cooper2-3/+3
This field is set to APIC_DELIVERY_MODE_FIXED in all cases, and is read exactly once. Fold the constant in uv_program_mmr() and drop the field. Searching for the origin of the stale HyperV comment reveals commit a31e58e129f7 ("x86/apic: Switch all APICs to Fixed delivery mode") which notes: As a consequence of this change, the apic::irq_delivery_mode field is now pointless, but this needs to be cleaned up in a separate patch. 6 years is long enough for this technical debt to have survived. [ bp: Fold in https://lore.kernel.org/r/[email protected] ] Signed-off-by: Andrew Cooper <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Signed-off-by: Borislav Petkov (AMD) <[email protected]> Reviewed-by: Steve Wahl <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2023-11-09Merge tag 'iommu-updates-v6.7' of ↵Linus Torvalds38-2737/+2024
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu updates from Joerg Roedel: "Core changes: - Make default-domains mandatory for all IOMMU drivers - Remove group refcounting - Add generic_single_device_group() helper and consolidate drivers - Cleanup map/unmap ops - Scaling improvements for the IOVA rcache depot - Convert dart & iommufd to the new domain_alloc_paging() ARM-SMMU: - Device-tree binding update: - Add qcom,sm7150-smmu-v2 for Adreno on SM7150 SoC - SMMUv2: - Support for Qualcomm SDM670 (MDSS) and SM7150 SoCs - SMMUv3: - Large refactoring of the context descriptor code to move the CD table into the master, paving the way for '->set_dev_pasid()' support on non-SVA domains - Minor cleanups to the SVA code Intel VT-d: - Enable debugfs to dump domain attached to a pasid - Remove an unnecessary inline function AMD IOMMU: - Initial patches for SVA support (not complete yet) S390 IOMMU: - DMA-API conversion and optimized IOTLB flushing And some smaller fixes and improvements" * tag 'iommu-updates-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (102 commits) iommu/dart: Remove the force_bypass variable iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging() iommu/dart: Convert to domain_alloc_paging() iommu/dart: Move the blocked domain support to a global static iommu/dart: Use static global identity domains iommufd: Convert to alloc_domain_paging() iommu/vt-d: Use ops->blocked_domain iommu/vt-d: Update the definition of the blocking domain iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domain Revert "iommu/vt-d: Remove unused function" iommu/amd: Remove DMA_FQ type from domain allocation path iommu: change iommu_map_sgtable to return signed values iommu/virtio: Add __counted_by for struct viommu_request and use struct_size() iommu/vt-d: debugfs: Support dumping a specified page table iommu/vt-d: debugfs: Create/remove debugfs file per {device, pasid} iommu/vt-d: debugfs: Dump entry pointing to huge page iommu/vt-d: Remove unused function iommu/arm-smmu-v3-sva: Remove bond refcount iommu/arm-smmu-v3-sva: Remove unused iommu_sva handle iommu/arm-smmu-v3: Rename cdcfg to cd_table ...
2023-11-01Merge tag 'for-linus-iommufd' of ↵Linus Torvalds23-178/+2200
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "This brings three new iommufd capabilities: - Dirty tracking for DMA. AMD/ARM/Intel CPUs can now record if a DMA writes to a page in the IOPTEs within the IO page table. This can be used to generate a record of what memory is being dirtied by DMA activities during a VM migration process. A VMM like qemu will combine the IOMMU dirty bits with the CPU's dirty log to determine what memory to transfer. VFIO already has a DMA dirty tracking framework that requires PCI devices to implement tracking HW internally. The iommufd version provides an alternative that the VMM can select, if available. The two are designed to have very similar APIs. - Userspace controlled attributes for hardware page tables (HWPT/iommu_domain). There are currently a few generic attributes for HWPTs (support dirty tracking, and parent of a nest). This is an entry point for the userspace iommu driver to control the HW in detail. - Nested translation support for HWPTs. This is a 2D translation scheme similar to the CPU where a DMA goes through a first stage to determine an intermediate address which is then translated trough a second stage to a physical address. Like for CPU translation the first stage table would exist in VM controlled memory and the second stage is in the kernel and matches the VM's guest to physical map. As every IOMMU has a unique set of parameter to describe the S1 IO page table and its associated parameters the userspace IOMMU driver has to marshal the information into the correct format. This is 1/3 of the feature, it allows creating the nested translation and binding it to VFIO devices, however the API to support IOTLB and ATC invalidation of the stage 1 io page table, and forwarding of IO faults are still in progress. The series includes AMD and Intel support for dirty tracking. Intel support for nested translation. Along the way are a number of internal items: - New iommu core items: ops->domain_alloc_user(), ops->set_dirty_tracking, ops->read_and_clear_dirty(), IOMMU_DOMAIN_NESTED, and iommu_copy_struct_from_user - UAF fix in iopt_area_split() - Spelling fixes and some test suite improvement" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (52 commits) iommufd: Organize the mock domain alloc functions closer to Joerg's tree iommufd/selftest: Fix page-size check in iommufd_test_dirty() iommufd: Add iopt_area_alloc() iommufd: Fix missing update of domains_itree after splitting iopt_area iommu/vt-d: Disallow read-only mappings to nest parent domain iommu/vt-d: Add nested domain allocation iommu/vt-d: Set the nested domain to a device iommu/vt-d: Make domain attach helpers to be extern iommu/vt-d: Add helper to setup pasid nested translation iommu/vt-d: Add helper for nested domain allocation iommu/vt-d: Extend dmar_domain to support nested domain iommufd: Add data structure for Intel VT-d stage-1 domain allocation iommu/vt-d: Enhance capability check for nested parent domain allocation iommufd/selftest: Add coverage for IOMMU_HWPT_ALLOC with nested HWPTs iommufd/selftest: Add nested domain allocation for mock domain iommu: Add iommu_copy_struct_from_user helper iommufd: Add a nested HW pagetable object iommu: Pass in parent domain with user_data to domain_alloc_user op iommufd: Share iommufd_hwpt_alloc with IOMMUFD_OBJ_HWPT_NESTED iommufd: Derive iommufd_hwpt_paging from iommufd_hw_pagetable ...
2023-11-01Merge tag 'asm-generic-6.7' of ↵Linus Torvalds2-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull ia64 removal and asm-generic updates from Arnd Bergmann: - The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. - The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. * tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: hexagon: Remove unusable symbols from the ptrace.h uapi asm-generic: Fix spelling of architecture arch: Reserve map_shadow_stack() syscall number for all architectures syscalls: Cleanup references to sys_lookup_dcookie() Documentation: Drop or replace remaining mentions of IA64 lib/raid6: Drop IA64 support Documentation: Drop IA64 from feature descriptions kernel: Drop IA64 support from sig_fault handlers arch: Remove Itanium (IA-64) architecture
2023-10-30iommufd: Organize the mock domain alloc functions closer to Joerg's treeJason Gunthorpe1-19/+16
Patches in Joerg's iommu tree to convert the mock driver to use domain_alloc_paging() that clash badly with the way the selftest changes for nesting were structured. Massage the selftest so that it looks closer the code after the domain_alloc_paging() conversion to ease the merge. Change __mock_domain_alloc_paging() into mock_domain_alloc_paging() in the same way as the iommu tree. The merge resolution then trivially takes both and deletes mock_domain_alloc(). Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Nicolin Chen <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-30iommufd/selftest: Fix page-size check in iommufd_test_dirty()Joao Martins1-2/+4
iommufd_test_dirty()/IOMMU_TEST_OP_DIRTY sets the dirty bits in the mock domain implementation that the userspace side validates against what it obtains via the UAPI. However in introducing iommufd_test_dirty() it forgot to validate page_size being 0 leading to two possible divide-by-zero problems: one at the beginning when calculating @max and while calculating the IOVA in the XArray PFN tracking list. While at it, validate the length to require non-zero value as well, as we can't be allocating a 0-sized bitmap. Link: https://lore.kernel.org/r/[email protected] Reported-by: [email protected] Closes: https://lore.kernel.org/linux-iommu/[email protected]/ Fixes: a9af47e382a4 ("iommufd/selftest: Test IOMMU_HWPT_GET_DIRTY_BITMAP") Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-30iommufd: Add iopt_area_alloc()Jason Gunthorpe2-3/+17
We never initialize the two interval tree nodes, and zero fill is not the same as RB_CLEAR_NODE. This can hide issues where we missed adding the area to the trees. Factor out the allocation and clear the two nodes. Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-30iommufd: Fix missing update of domains_itree after splitting iopt_areaKoichiro Den1-0/+10
In iopt_area_split(), if the original iopt_area has filled a domain and is linked to domains_itree, pages_nodes have to be properly reinserted. Otherwise the domains_itree becomes corrupted and we will UAF. Fixes: 51fe6141f0f6 ("iommufd: Data structure to provide IOVA to PFN mapping") Link: https://lore.kernel.org/r/[email protected] Cc: [email protected] Signed-off-by: Koichiro Den <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-27Merge branches 'iommu/fixes', 'arm/tegra', 'arm/smmu', 'virtio', 'x86/vt-d', ↵Joerg Roedel38-2735/+2025
'x86/amd', 'core' and 's390' into next
2023-10-27iommu: Avoid unnecessary cache invalidationsLu Baolu1-1/+2
The iommu_create_device_direct_mappings() only needs to flush the caches when the mappings are changed in the affected domain. This is not true for non-DMA domains, or for devices attached to the domain that have no reserved regions. To avoid unnecessary cache invalidations, add a check before iommu_flush_iotlb_all(). Fixes: a48ce36e2786 ("iommu: Prevent RESV_DIRECT devices from blocking domains") Signed-off-by: Lu Baolu <[email protected]> Tested-by: Henry Willard <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26Merge tag 'v6.6-rc7' into coreJoerg Roedel5-29/+31
Linux 6.6-rc7
2023-10-26iommu/dart: Remove the force_bypass variableJason Gunthorpe1-14/+6
This flag just caches if the IO page size is larger than the CPU PAGE_SIZE. This only needs to be checked in two places so remove the confusingly named cache. dart would like to not support paging domains at all if the IO page size is larger than the CPU page size. In this case we should ideally fail domain_alloc_paging(), as there is no point in creating a domain that can never be attached. Move the test into apple_dart_finalize_domain(). The check in apple_dart_mod_streams() will prevent the domain from being attached to the wrong dart There is no HW limitation that prevents BLOCKED domains from working, remove that test. The check in apple_dart_of_xlate() is redundant since immediately after the pgsize is checked. Remove it. Remove the variable. Suggested-by: Janne Grunau <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Janne Grunau <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/dart: Call apple_dart_finalize_domain() as part of alloc_paging()Jason Gunthorpe1-9/+19
In many cases the dev argument will now be !NULL so we should use it to finalize the domain at allocation. Make apple_dart_finalize_domain() accept the correct type. Reviewed-by: Janne Grunau <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/dart: Convert to domain_alloc_paging()Jason Gunthorpe1-8/+5
Since the IDENTITY and BLOCKED behaviors were moved to global statics all that remains is the paging domain. Rename to apple_dart_attach_dev_paging() and remove the left over type check. Reviewed-by: Janne Grunau <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/dart: Move the blocked domain support to a global staticJason Gunthorpe1-22/+32
Move to the new static global for blocked domains. Move the blocked specific code to apple_dart_attach_dev_blocked(). Reviewed-by: Janne Grunau <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/dart: Use static global identity domainsJason Gunthorpe1-11/+28
Move to the new static global for identity domains. Move the identity specific code to apple_dart_attach_dev_identity(). Reviewed-by: Janne Grunau <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommufd: Convert to alloc_domain_paging()Jason Gunthorpe1-8/+3
Move the global static blocked domain to the ops and convert the unmanaged domain to domain_alloc_paging. Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/vt-d: Use ops->blocked_domainJason Gunthorpe1-2/+1
Trivially migrate to the ops->blocked_domain for the existing global static. Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/vt-d: Update the definition of the blocking domainJason Gunthorpe1-2/+2
The global static should pre-define the type and the NOP free function can be now left as NULL. Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu: Move IOMMU_DOMAIN_BLOCKED global statics to ops->blocked_domainJason Gunthorpe1-0/+2
Following the pattern of identity domains, just assign the BLOCKED domain global statics to a value in ops. Update the core code to use the global static directly. Update powerpc to use the new scheme and remove its empty domain_alloc callback. Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Sven Peter <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Joerg Roedel <[email protected]>
2023-10-26iommu/vt-d: Disallow read-only mappings to nest parent domainLu Baolu1-0/+6
When remapping hardware is configured by system software in scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended Access bit if enabled) in first-stage page-table entries even when second-stage mappings indicate that corresponding first-stage page-table is Read-Only. As the result, contents of pages designated by VMM as Read-Only can be modified by IOMMU via PML5E (PML4E for 4-level tables) access as part of address translation process due to DMAs issued by Guest. This disallows read-only mappings in the domain that is supposed to be used as nested parent. Reference from Sapphire Rapids Specification Update [1], errata details, SPR17. Userspace should know this limitation by checking the IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17 flag reported in the IOMMU_GET_HW_INFO ioctl. [1] https://www.intel.com/content/www/us/en/content-details/772415/content-details.html Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Add nested domain allocationLu Baolu3-20/+23
This adds the support for IOMMU_HWPT_DATA_VTD_S1 type. And 'nested_parent' is added to mark the nested parent domain to sanitize the input parent domain. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Set the nested domain to a deviceYi Liu1-0/+54
This adds the helper for setting the nested domain to a device hence enable nested domain usage on Intel VT-d. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jacob Pan <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Make domain attach helpers to be externYi Liu2-9/+13
This makes the helpers visible to nested.c. Link: https://lore.kernel.org/r/[email protected] Suggested-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Add helper to setup pasid nested translationLu Baolu2-0/+114
The configurations are passed in from the user when the user domain is allocated. This helper interprets these configurations according to the data structure defined in uapi/linux/iommufd.h. The EINVAL error will be returned if any of configurations are not compatible with the hardware capabilities. The caller can retry with another compatible user domain. The encoding of fields of each pasid entry is defined in section 9.6 of the VT-d spec. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jacob Pan <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Add helper for nested domain allocationLu Baolu3-1/+65
This adds helper for accepting user parameters and allocate a nested domain. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jacob Pan <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Extend dmar_domain to support nested domainLu Baolu1-6/+30
The nested domain fields are exclusive to those that used for a DMA remapping domain. Use union to avoid memory waste. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Lu Baolu <[email protected]> Signed-off-by: Yi Liu <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommu/vt-d: Enhance capability check for nested parent domain allocationYi Liu2-1/+3
This adds the scalable mode check before allocating the nested parent domain as checking nested capability is not enough. User may turn off scalable mode which also means no nested support even if the hardware supports it. Fixes: c97d1b20d383 ("iommu/vt-d: Add domain_alloc_user op") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Lu Baolu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2023-10-26iommufd/selftest: Add nested domain allocation for mock domainNicolin Chen2-30/+140
Add nested domain support in the ->domain_alloc_user op with some proper sanity checks. Then, add a domain_nested_ops for all nested domains and split the get_md_pagetable helper into paging and nested helpers. Also, add an iotlb as a testing property of a nested domain. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Nicolin Chen <[email protected]> Signed-off-by: Yi Liu <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>