summaryrefslogtreecommitdiff
path: root/drivers/iommu
AgeCommit message (Collapse)Author
2024-04-26iommu/amd: Define per-IOMMU iopf_queueSuravee Suthikulpanit
AMD IOMMU hardware supports PCI Peripheral Paging Request (PPR) using a PPR log, which is a circular buffer containing requests from downstream end-point devices. There is one PPR log per IOMMU instance. Therefore, allocate an iopf_queue per IOMMU instance during driver initialization, and free the queue during driver deinitialization. Also rename enable_iommus_v2() -> enable_iommus_ppr() to reflect its usage. And add amd_iommu_gt_ppr_supported() check before enabling PPR log. Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-10-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Enable PCI features based on attached domain capabilityVasant Hegde
Commit eda8c2860ab6 ("iommu/amd: Enable device ATS/PASID/PRI capabilities independently") changed the way it enables device capability while attaching devices. I missed to account the attached domain capability. Meaning if domain is not capable of handling PASID/PRI (ex: paging domain with v1 page table) then enabling device feature is not required. This patch enables PASID/PRI only if domain is capable of handling SVA. Also move pci feature enablement to do_attach() function so that we make SVA capability in one place. Finally make PRI enable/disable functions as static functions. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-9-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Setup GCR3 table in advance if domain is SVA capableVasant Hegde
SVA can be supported if domain is in passthrough mode or paging domain with v2 page table. Current code sets up GCR3 table for domain with v2 page table only. Setup GCR3 table for all SVA capable domains. - Move GCR3 init/destroy to separate function. - Change default GCR3 table to use MAX supported PASIDs. Ideally it should use 1 level PASID table as its using PASID zero only. But we don't have support to extend PASID table yet. We will fix this later. - When domain is configured with passthrough mode, allocate default GCR3 table only if device is SVA capable. Note that in attach_device() path it will not know whether device will use SVA or not. If device is attached to passthrough domain and if it doesn't use SVA then GCR3 table will never be used. We will endup wasting memory allocated for GCR3 table. This is done to avoid DTE update when attaching PASID to device. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-8-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Introduce iommu_dev_data.max_pasidsVasant Hegde
This variable will track the number of PASIDs supported by the device. If IOMMU or device doesn't support PASID then it will be zero. This will be used while allocating GCR3 table to decide required number of PASID table levels. Also in PASID bind path it will use this variable to check whether device supports PASID or not. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-7-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Fix PPR interrupt processing logicVasant Hegde
* Do not re-read ppr head pointer as its just updated by the driver. * Do not read PPR buffer tail pointer inside while loop. If IOMMU generates PPR events continuously then completing interrupt processing takes long time. In worst case it may cause infinite loop. Suggested-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-6-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Move PPR-related functions into ppr.cSuravee Suthikulpanit
In preparation to subsequent PPR-related patches, and also remove static declaration for certain helper functions so that it can be reused in other files. Also rename below functions: alloc_ppr_log -> amd_iommu_alloc_ppr_log iommu_enable_ppr_log -> amd_iommu_enable_ppr_log free_ppr_log -> amd_iommu_free_ppr_log iommu_poll_ppr_log -> amd_iommu_poll_ppr_log Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Co-developed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-5-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Add support for enabling/disabling IOMMU featuresWei Huang
Add support for struct iommu_ops.dev_{enable/disable}_feat. Please note that the empty feature switches will be populated by subsequent patches. Signed-off-by: Wei Huang <wei.huang2@amd.com> Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-4-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Introduce per device DTE update functionVasant Hegde
Consolidate per device update and flush logic into separate function. Also make it as global function as it will be used in subsequent series to update the DTE. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-3-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Rename amd_iommu_v2_supported() as amd_iommu_pasid_supported()Vasant Hegde
To reflect its usage. No functional changes intended. Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240418103400.6229-2-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/amd: Enhance def_domain_type to handle untrusted deviceVasant Hegde
Previously, IOMMU core layer was forcing IOMMU_DOMAIN_DMA domain for untrusted device. This always took precedence over driver's def_domain_type(). Commit 59ddce4418da ("iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()") changed the behaviour. Current code calls def_domain_type() but if it doesn't return IOMMU_DOMAIN_DMA for untrusted device it throws error. This results in IOMMU group (and potentially IOMMU itself) in undetermined state. This patch adds untrusted check in AMD IOMMU driver code. So that it allows eGPUs behind Thunderbolt work again. Fine tuning amd_iommu_def_domain_type() will be done later. Reported-by: Eric Wagner <ewagner12@gmail.com> Link: https://lore.kernel.org/linux-iommu/CAHudX3zLH6CsRmLE-yb+gRjhh-v4bU5_1jW_xCcxOo_oUUZKYg@mail.gmail.com Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3182 Fixes: 59ddce4418da ("iommu: Reorganize iommu_get_default_domain_type() to respect def_domain_type()") Cc: Robin Murphy <robin.murphy@arm.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: stable@kernel.org # v6.7+ Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20240423111725.5813-1-vasant.hegde@amd.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/dma: Centralise iommu_setup_dma_ops()Robin Murphy
It's somewhat hard to see, but arm64's arch_setup_dma_ops() should only ever call iommu_setup_dma_ops() after a successful iommu_probe_device(), which means there should be no harm in achieving the same order of operations by running it off the back of iommu_probe_device() itself. This then puts it in line with the x86 and s390 .probe_finalize bodges, letting us pull it all into the main flow properly. As a bonus this lets us fold in and de-scope the PCI workaround setup as well. At this point we can also then pull the call up inside the group mutex, and avoid having to think about whether iommu_group_store_type() could theoretically race and free the domain if iommu_setup_dma_ops() ran just *before* iommu_device_use_default_domain() claims it... Furthermore we replace one .probe_finalize call completely, since the only remaining implementations are now one which only needs to run once for the initial boot-time probe, and two which themselves render that path unreachable. This leaves us a big step closer to realistically being able to unpick the variety of different things that iommu_setup_dma_ops() has been muddling together, and further streamline iommu-dma into core API flows in future. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> # For Intel IOMMU Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/bebea331c1d688b34d9862eefd5ede47503961b8.1713523152.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/dma: Make limit checks self-containedRobin Murphy
It's now easy to retrieve the device's DMA limits if we want to check them against the domain aperture, so do that ourselves instead of relying on them being passed through the callchain. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Tested-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/e28a114243d1e79eb3609aded034f8529521333f.1713523152.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove struct intel_svmLu Baolu
The struct intel_svm was used for keeping attached devices info for sva domain. Since sva domain is a kind of iommu_domain, the struct dmar_domain should centralize all info of a sva domain, including the info of attached devices. Therefore, retire struct intel_svm and clean up the code. Besides, register mmu notifier callback in domain_alloc_sva() callback which allows the memory management notifier lifetime to follow the lifetime of the iommu_domain. Call mmu_notifier_put() in the domain free and defer the real free to the mmu free_notifier callback. Co-developed-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Tina Zhang <tina.zhang@intel.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-13-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu: Add ops->domain_alloc_sva()Jason Gunthorpe
Make a new op that receives the device and the mm_struct that the SVA domain should be created for. Unlike domain_alloc_paging() the dev argument is never NULL here. This allows drivers to fully initialize the SVA domain and allocate the mmu_notifier during allocation. It allows the notifier lifetime to follow the lifetime of the iommu_domain. Since we have only one call site, upgrade the new op to return ERR_PTR instead of NULL. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Tina Zhang <tina.zhang@intel.com> Link: https://lore.kernel.org/r/20240311090843.133455-15-vasant.hegde@amd.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-12-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove intel_svm_devLu Baolu
The intel_svm_dev data structure used in the sva implementation for the Intel IOMMU driver stores information about a device attached to an SVA domain. It is a duplicate of dev_pasid_info that serves the same purpose. Replace intel_svm_dev with dev_pasid_info and clean up the use of intel_svm_dev. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-11-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use cache helpers in arch_invalidate_secondary_tlbsLu Baolu
The arch_invalidate_secondary_tlbs callback is called in the SVA mm notification path. It invalidates all or a range of caches after the CPU page table is modified. Use the cache tag helps in this path. The mm_types defines vm_end as the first byte after the end address which is different from the iommu gather API, hence convert the end parameter from mm_types to iommu gather scheme before calling the cache_tag helper. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-10-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use cache_tag_flush_range() in cache_invalidate_userLu Baolu
The cache_invalidate_user callback is called to invalidate a range of caches for the affected user domain. Use cache_tag_flush_range() in this callback. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Cleanup use of iommu_flush_iotlb_psi()Lu Baolu
Use cache_tag_flush_range() in switch_to_super_page() to invalidate the necessary caches when switching mappings from normal to super pages. The iommu_flush_iotlb_psi() call in intel_iommu_memory_notifier() is unnecessary since there should be no cache invalidation for the identity domain. Clean up iommu_flush_iotlb_psi() after the last call site is removed. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use cache_tag_flush_range_np() in iotlb_sync_mapLu Baolu
The iotlb_sync_map callback is called by the iommu core after non-present to present mappings are created. The iommu driver uses this callback to invalidate caches if IOMMU is working in caching mode and second-only translation is used for the domain. Use cache_tag_flush_range_np() in this callback. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use cache_tag_flush_range() in tlb_syncLu Baolu
The tlb_sync callback is called by the iommu core to flush a range of caches for the affected domain. Use cache_tag_flush_range() in this callback. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use cache_tag_flush_all() in flush_iotlb_allLu Baolu
The flush_iotlb_all callback is called by the iommu core to flush all caches for the affected domain. Use cache_tag_flush_all() in this callback. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Add trace events for cache tag interfaceLu Baolu
Add trace events for cache tag assign/unassign/flush operations and trace the events in the interfaces. These trace events will improve debugging capabilities by providing detailed information about cache tag activity. A sample of the traced messages looks like below [messages have been stripped and wrapped to make the line short]. cache_tag_assign: dmar9/0000:00:01.0 type iotlb did 1 pasid 9 ref 1 cache_tag_assign: dmar9/0000:00:01.0 type devtlb did 1 pasid 9 ref 1 cache_tag_flush_all: dmar6/0000:8a:00.0 type iotlb did 7 pasid 0 ref 1 cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9 [0xeab00000-0xeab1afff] addr 0xeab00000 pages 0x20 mask 0x5 cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9 [0xeab20000-0xeab31fff] addr 0xeab20000 pages 0x20 mask 0x5 cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9 [0xeaa40000-0xeaa51fff] addr 0xeaa40000 pages 0x20 mask 0x5 cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9 [0x98de0000-0x98de4fff] addr 0x98de0000 pages 0x8 mask 0x3 cache_tag_flush_range: dmar1 0000:00:1b.0[0] type iotlb did 9 [0xe9828000-0xe9828fff] addr 0xe9828000 pages 0x1 mask 0x0 cache_tag_unassign: dmar9/0000:00:01.0 type iotlb did 1 pasid 9 ref 1 cache_tag_unassign: dmar9/0000:00:01.0 type devtlb did 1 pasid 9 ref 1 Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Add cache tag invalidation helpersLu Baolu
Add several helpers to invalidate the caches after mappings in the affected domain are changed. - cache_tag_flush_range() invalidates a range of caches after mappings within this range are changed. It uses the page-selective cache invalidation methods. - cache_tag_flush_all() invalidates all caches tagged by a domain ID. It uses the domain-selective cache invalidation methods. - cache_tag_flush_range_np() invalidates a range of caches when new mappings are created in the domain and the corresponding page table entries change from non-present to present. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Add cache tag assignment interfaceLu Baolu
Caching tag is a combination of tags used by the hardware to cache various translations. Whenever a mapping in a domain is changed, the IOMMU driver should invalidate the caches with the caching tags. The VT-d specification describes caching tags in section 6.2.1, Tagging of Cached Translations. Add interface to assign caching tags to an IOMMU domain when attached to a RID or PASID, and unassign caching tags when a domain is detached from a RID or PASID. All caching tags are listed in the per-domain tag list and are protected by a dedicated lock. In addition to the basic IOTLB and devTLB caching tag types, NESTING_IOTLB and NESTING_DEVTLB tag types are also introduced. These tags are used for caches that store translations for DMA accesses through a nested user domain. They are affected by changes to mappings in the parent domain. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20240416080656.60968-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove caching mode check before device TLB flushLu Baolu
The Caching Mode (CM) of the Intel IOMMU indicates if the hardware implementation caches not-present or erroneous translation-structure entries except for the first-stage translation. The caching mode is irrelevant to the device TLB, therefore there is no need to check it before a device TLB invalidation operation. Remove two caching mode checks before device TLB invalidation in the driver. The removal of these checks doesn't change the driver's behavior in critical map/unmap paths. Hence, there is no functionality or performance impact, especially since commit <29b32839725f> ("iommu/vt-d: Do not use flush-queue when caching-mode is on") has already disabled flush-queue for caching mode. Therefore, caching mode will never call intel_flush_iotlb_all(). Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Link: https://lore.kernel.org/r/20240415013835.9527-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove private data use in fault messageJingqi Liu
According to Intel VT-d specification revision 4.0, "Private Data" field has been removed from Page Request/Response. Since the private data field is not used in fault message, remove the related definitions in page request descriptor and remove the related code in page request/response handler, as Intel hasn't shipped any products which support private data in the page request message. Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com> Link: https://lore.kernel.org/r/20240308103811.76744-3-Jingqi.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove debugfs use of private data fieldJingqi Liu
Since the page fault report and response have been tracked by ftrace, the users can easily calculate the time used for a page fault handling. There's no need to expose the similar functionality in debugfs. Hence, remove the corresponding operations in debugfs. Signed-off-by: Jingqi Liu <Jingqi.liu@intel.com> Link: https://lore.kernel.org/r/20240308103811.76744-2-Jingqi.liu@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Allocate DMAR fault interrupts locallyDimitri Sivanich
The Intel IOMMU code currently tries to allocate all DMAR fault interrupt vectors on the boot cpu. On large systems with high DMAR counts this results in vector exhaustion, and most of the vectors are not initially allocated socket local. Instead, have a cpu on each node do the vector allocation for the DMARs on that node. The boot cpu still does the allocation for its node during its boot sequence. Signed-off-by: Dimitri Sivanich <sivanich@hpe.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/Zfydpp2Hm+as16TY@hpe.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Use try_cmpxchg64{,_local}() in iommu.cUros Bizjak
Replace this pattern in iommu.c: cmpxchg64{,_local}(*ptr, 0, new) != 0 ... with the simpler and faster: !try_cmpxchg64{,_local}(*ptr, &tmp, new) The x86 CMPXCHG instruction returns success in the ZF flag, so this change saves a compare after the CMPXCHG. No functional change intended. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Lu Baolu <baolu.lu@linux.intel.com> Cc: Joerg Roedel <joro@8bytes.org> Cc: Will Deacon <will@kernel.org> Cc: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20240414162454.49584-1-ubizjak@gmail.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-26iommu/vt-d: Remove redundant assignment to variable errColin Ian King
Variable err is being assigned a value that is never read. It is either being re-assigned later on error exit paths, or never referenced on the non-error path. Cleans up clang scan build warning: drivers/iommu/intel/dmar.c:1070:2: warning: Value stored to 'err' is never read [deadcode.DeadStores]` Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://lore.kernel.org/r/20240411090535.306326-1-colin.i.king@gmail.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-25mm: change inlined allocation helpers to account at the call siteSuren Baghdasaryan
Main goal of memory allocation profiling patchset is to provide accounting that is cheap enough to run in production. To achieve that we inject counters using codetags at the allocation call sites to account every time allocation is made. This injection allows us to perform accounting efficiently because injected counters are immediately available as opposed to the alternative methods, such as using _RET_IP_, which would require counter lookup and appropriate locking that makes accounting much more expensive. This method requires all allocation functions to inject separate counters at their call sites so that their callers can be individually accounted. Counter injection is implemented by allocation hooks which should wrap all allocation functions. Inlined functions which perform allocations but do not use allocation hooks are directly charged for the allocations they perform. In most cases these functions are just specialized allocation wrappers used from multiple places to allocate objects of a specific type. It would be more useful to do the accounting at their call sites instead. Instrument these helpers to do accounting at the call site. Simple inlined allocation wrappers are converted directly into macros. More complex allocators or allocators with documentation are converted into _noprof versions and allocation hooks are added. This allows memory allocation profiling mechanism to charge allocations to the callers of these functions. Link: https://lkml.kernel.org/r/20240415020731.1152108-1-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Acked-by: Jan Kara <jack@suse.cz> [jbd2] Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dennis Zhou <dennis@kernel.org> Cc: Eric Dumazet <edumazet@google.com> Cc: Herbert Xu <herbert@gondor.apana.org.au> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jakub Sitnicki <jakub@cloudflare.com> Cc: Jiri Kosina <jikos@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-25change alloc_pages name in dma_map_ops to avoid name conflictsSuren Baghdasaryan
After redefining alloc_pages, all uses of that name are being replaced. Change the conflicting names to prevent preprocessor from replacing them when it's not intended. Link: https://lkml.kernel.org/r/20240321163705.3067592-18-surenb@google.com Signed-off-by: Suren Baghdasaryan <surenb@google.com> Tested-by: Kees Cook <keescook@chromium.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: "Björn Roy Baron" <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Pasha Tatashin <pasha.tatashin@soleen.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-19Merge tag 'for-linus-iommufd' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd fixes from Jason Gunthorpe: "Two fixes for the selftests: - CONFIG_IOMMUFD_TEST needs CONFIG_IOMMUFD_DRIVER to work - The kconfig fragment sshould include fault injection so the fault injection test can work" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommufd: Add config needed for iommufd_fail_nth iommufd: Add missing IOMMUFD_DRIVER kconfig for the selftest
2024-04-18iommu/arm-smmu-qcom: Use the custom fault handler on more platformsGeorgi Djakov
The TBU support is now available, so let's allow it to be used on other platforms that have the Qualcomm SMMU-500 implementation with TBUs. This will allow the context fault handler to query the TBUs when a context fault occurs. Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/20240417133731.2055383-7-quic_c_gdjako@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18iommu/arm-smmu-qcom: Use a custom context fault handler for sdm845Georgi Djakov
The sdm845 platform now supports TBUs, so let's get additional debug info from the TBUs when a context fault occurs. Implement a custom context fault handler that does both software + hardware page table walks and TLB Invalidate All. Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/20240417133731.2055383-5-quic_c_gdjako@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18iommu/arm-smmu: Allow using a threaded handler for context interruptsGeorgi Djakov
Threaded IRQ handlers run in a less critical context compared to normal IRQs, so they can perform more complex and time-consuming operations without causing significant delays in other parts of the kernel. During a context fault, it might be needed to do more processing and gather debug information from TBUs in the handler. These operations may sleep, so add an option to use a threaded IRQ handler in these cases. Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/20240417133731.2055383-4-quic_c_gdjako@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18iommu/arm-smmu-qcom-debug: Add support for TBUsGeorgi Djakov
Operating the TBUs (Translation Buffer Units) from Linux on Qualcomm platforms can help with debugging context faults. To help with that, the TBUs can run ATOS (Address Translation Operations) to manually trigger address translation of IOVA to physical address in hardware and provide more details when a context fault happens. The driver will control the resources needed by the TBU to allow running the debug operations such as ATOS, check for outstanding transactions, do snapshot capture etc. Signed-off-by: Georgi Djakov <quic_c_gdjako@quicinc.com> Link: https://lore.kernel.org/r/20240417133731.2055383-3-quic_c_gdjako@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18iommu/arm-smmu-v3: Free MSIs in case of ENOMEMAleksandr Aprelkov
If devm_add_action() returns -ENOMEM, then MSIs are allocated but not not freed on teardown. Use devm_add_action_or_reset() instead to keep the static analyser happy. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Aleksandr Aprelkov <aaprelkov@usergate.com> Link: https://lore.kernel.org/r/20240403053759.643164-1-aaprelkov@usergate.com [will: Tweak commit message, remove warning message] Signed-off-by: Will Deacon <will@kernel.org>
2024-04-18iommu/arm-smmu: Convert to domain_alloc_paging()Jason Gunthorpe
Now that the BLOCKED and IDENTITY behaviors are managed with their own domains change to the domain_alloc_paging() op. The check for using_legacy_binding is now redundant, arm_smmu_def_domain_type() always returns IOMMU_DOMAIN_IDENTITY for this mode, so the core code will never attempt to create a DMA domain in the first place. Since commit a4fdd9762272 ("iommu: Use flush queue capability") the core code only passes in IDENTITY/BLOCKED/UNMANAGED/DMA domain types. It will not pass in IDENTITY or BLOCKED if the global statics exist, so the test for DMA is also redundant now too. Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/0-v1-3632c65678e0+2f1-smmu_alloc_paging_jgg@nvidia.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-15iommu: account IOMMU allocated memoryPasha Tatashin
In order to be able to limit the amount of memory that is allocated by IOMMU subsystem, the memory must be accounted. Account IOMMU as part of the secondary pagetables as it was discussed at LPC. The value of SecPageTables now contains mmeory allocation by IOMMU and KVM. There is a difference between GFP_ACCOUNT and what NR_IOMMU_PAGES shows. GFP_ACCOUNT is set only where it makes sense to charge to user processes, i.e. IOMMU Page Tables, but there more IOMMU shared data that should not really be charged to a specific process. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-12-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu: observability of the IOMMU allocationsPasha Tatashin
Add NR_IOMMU_PAGES into node_stat_item that counts number of pages that are allocated by the IOMMU subsystem. The allocations can be view per-node via: /sys/devices/system/node/nodeN/vmstat. For example: $ grep iommu /sys/devices/system/node/node*/vmstat /sys/devices/system/node/node0/vmstat:nr_iommu_pages 106025 /sys/devices/system/node/node1/vmstat:nr_iommu_pages 3464 The value is in page-count, therefore, in the above example the iommu allocations amount to ~428M. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-11-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/tegra-smmu: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/tegra-smmu.c to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Thierry Reding <treding@nvidia.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-10-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/sun50i: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/sun50i-iommu.c to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-9-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/rockchip: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/rockchip-iommu.c to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-8-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/exynos: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/exynos-iommu.c to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-7-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/io-pgtable-dart: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/io-pgtable-dart.c to use the new page allocation functions provided in iommu-pages.h., and remove unnecessary struct io_pgtable_cfg argument from __dart_alloc_pages(). Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Janne Grunau <j@jannau.net> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-6-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/io-pgtable-arm: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/io-pgtable-arm.c to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-5-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/amd: use page allocation function provided by iommu-pages.hPasha Tatashin
Convert iommu/amd/* files to use the new page allocation functions provided in iommu-pages.h. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-4-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/dma: use iommu_put_pages_list() to releae freelistPasha Tatashin
Free the IOMMU page tables via iommu_put_pages_list(). The page tables were allocated via iommu_alloc_* functions in architecture specific places, but are released in dma-iommu if the freelist is gathered during map/unmap operations into iommu_iotlb_gather data structure. Currently, only iommu/intel that does that. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Link: https://lore.kernel.org/r/20240413002522.1101315-3-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-04-15iommu/vt-d: add wrapper functions for page allocationsPasha Tatashin
In order to improve observability and accountability of IOMMU layer, we must account the number of pages that are allocated by functions that are calling directly into buddy allocator. This is achieved by first wrapping the allocation related functions into a separate inline functions in new file: drivers/iommu/iommu-pages.h Convert all page allocation calls under iommu/intel to use these new functions. Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Acked-by: David Rientjes <rientjes@google.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20240413002522.1101315-2-pasha.tatashin@soleen.com Signed-off-by: Joerg Roedel <jroedel@suse.de>