blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2023-04-20	Merge branch 'pci/controller/kirin'	Bjorn Helgaas	1	-0/+1
	- Select CONFIG_REGMAP_MMIO so kirin driver links correctly (Josh Triplett) * pci/controller/kirin: PCI: kirin: Select REGMAP_MMIO
2023-04-20	Merge branch 'pci/controller/ixp4xx'	Bjorn Helgaas	1	-4/+6
	- Use the PCI_CONF1_ADDRESS() macro to simplify config space address computation (Pali Rohár) * pci/controller/ixp4xx: PCI: ixp4xx: Use PCI_CONF1_ADDRESS() macro
2023-04-20	Merge branch 'pci/controller/dwc'	Bjorn Helgaas	1	-0/+7
	- Install i.MX6 PCI abort handler only when DT contains a PCI controller claimed by the imx6 driver (H. Nikolaus Schaller) * pci/controller/dwc: PCI: imx6: Install the fault handler only on compatible match
2023-04-20	Merge branch 'pci/resource'	Bjorn Helgaas	9	-55/+32
	- Add pci_dev_for_each_resource() and pci_bus_for_each_resource() iterators to simplify loops (Andy Shevchenko) * pci/resource: EISA: Drop unused pci_bus_for_each_resource() index argument PCI: Make pci_bus_for_each_resource() index optional PCI: Document pci_bus_for_each_resource() PCI: Introduce pci_dev_for_each_resource() PCI: Introduce pci_resource_n()
2023-04-20	Merge branch 'pci/reset'	Bjorn Helgaas	5	-17/+29
	- Wait longer for devices to become ready after resume (as we do for reset) to accommodate Intel Titan Ridge xHCI devices (Mika Westerberg) - Drop pci_bridge_wait_for_secondary_bus() timeout parameter since all callers pass the same value (Mika Westerberg) - Extend D3hot delay for NVIDIA HDA controllers to avoid unrecoverable devices after a bus reset (Alex Williamson) * pci/reset: PCI/PM: Extend D3hot delay for NVIDIA HDA controllers PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter PCI/PM: Increase wait time after resume
2023-04-20	Merge branch 'pci/p2pdma'	Bjorn Helgaas	1	-2/+1
	- Fix pci_p2pmem_find_many() kernel-doc (Cai Huoqing) * pci/p2pdma: PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-doc
2023-04-20	Merge branch 'pci/hotplug'	Bjorn Helgaas	1	-0/+15
	- Fix pciehp AB-BA deadlock between reset_lock and device_lock (Lukas Wunner) * pci/hotplug: PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock
2023-04-20	Merge branch 'pci/enumeration'	Bjorn Helgaas	4	-6/+6
	- Use of_property_present(), instead of lower-level functions like of_get_property(), for testing DT property presence (Rob Herring) * pci/enumeration: PCI: Use of_property_present() for testing DT property presence
2023-04-20	PCI: Restrict device disabled status check to DT	Rob Herring	3	-12/+30
	Commit 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") checked the firmware device status for both DT and ACPI devices. That caused a regression in some ACPI systems. The exact reason isn't clear. It's possibly a firmware bug. For now, at least, refactor the check to be for DT based systems only. Note that the original implementation leaked a refcount which is now correctly handled. [bhelgaas: Per ACPI r6.5, sec 6.3.7, for devices on an enumerable bus, _STA must return with bit[0] ("device is present") set] Link: https://lore.kernel.org/all/[email protected]/ Fixes: 6fffbc7ae137 ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/r/[email protected] Link: https://bugzilla.kernel.org/show_bug.cgi?id=217317 Reported-by: Donald Hunter <[email protected]> Reported-by: Vitaly Kuznetsov <[email protected]> Tested-by: Donald Hunter <[email protected]> Tested-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Binbin Zhou <[email protected]> Cc: Liu Peibao <[email protected]> Cc: Huacai Chen <[email protected]>
2023-04-18	PCI: Use of_property_present() for testing DT property presence	Rob Herring	4	-6/+6
	It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property()/of_find_property() functions for reading properties. As part of this, convert of_get_property()/of_find_property() calls to the recently added of_property_present() helper when we just want to test for presence of a property and nothing more. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: AngeloGioacchino Del Regno <[email protected]> # pcie-mediatek
2023-04-18	PCI/DOE: Relax restrictions on request and response size	Lukas Wunner	1	-25/+49
	An upcoming user of DOE is CMA (Component Measurement and Authentication, PCIe r6.0 sec 6.31). It builds on SPDM (Security Protocol and Data Model): https://www.dmtf.org/dsp/DSP0274 SPDM message sizes are not always a multiple of dwords. To transport them over DOE without using bounce buffers, allow sending requests and receiving responses whose final dword is only partially populated. To be clear, PCIe r6.0 sec 6.30.1 specifies the Data Object Header 2 "Length" in dwords and pci_doe_send_req() and pci_doe_recv_resp() read/write dwords. So from a spec point of view, DOE is still specified in dwords and allowing non-dword request/response buffers is merely for the convenience of callers. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/151b1a6a1794afb65d941287ecbc032c5b8004b9.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Make mailbox creation API private	Lukas Wunner	1	-37/+4
	The PCI core has just been amended to create a pci_doe_mb struct for every DOE instance on device enumeration. CXL (the only in-tree DOE user so far) has been migrated to use those mailboxes instead of creating its own. That leaves pcim_doe_create_mb() and pci_doe_for_each_off() without any callers, so drop them. pci_doe_supports_prot() is now only used internally, so declare it static. pci_doe_destroy_mb() is no longer used as callback for devm_add_action(), so refactor it to accept a struct pci_doe_mb pointer instead of a generic void pointer. Because pci_doe_create_mb() is only called on device enumeration, i.e. before driver binding, the workqueue name never contains a driver name. So replace dev_driver_string() with dev_bus_name() when generating the workqueue name. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/64f614b6584982986c55d2c6229b4ee2b276dd59.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Create mailboxes on device enumeration	Lukas Wunner	4	-0/+86
	Currently a DOE instance cannot be shared by multiple drivers because each driver creates its own pci_doe_mb struct for a given DOE instance. For the same reason a DOE instance cannot be shared between the PCI core and a driver. Moreover, finding out which protocols a DOE instance supports requires creating a pci_doe_mb for it. If a device has multiple DOE instances, a driver looking for a specific protocol may need to create a pci_doe_mb for each of the device's DOE instances and then destroy those which do not support the desired protocol. That's obviously an inefficient way to do things. Overcome these issues by creating mailboxes in the PCI core on device enumeration. Provide a pci_find_doe_mailbox() API call to allow drivers to get a pci_doe_mb for a given (pci_dev, vendor, protocol) triple. This API is modeled after pci_find_capability() and can later be amended with a pci_find_next_doe_mailbox() call to iterate over all mailboxes of a given pci_dev which support a specific protocol. On removal, destroy the mailboxes in pci_destroy_dev(), after the driver is unbound. This allows drivers to use DOE in their ->remove() hook. On surprise removal, cancel ongoing DOE exchanges and prevent new ones from being scheduled. Thereby ensure that a hot-removed device doesn't needlessly wait for a running exchange to time out. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/40a6f973f72ef283d79dd55e7e6fddc7481199af.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Allow mailbox creation without devres management	Lukas Wunner	1	-37/+66
	DOE mailbox creation is currently only possible through a devres-managed API. The lifetime of mailboxes thus ends with driver unbinding. An upcoming commit will create DOE mailboxes upon device enumeration by the PCI core. Their lifetime shall not be limited by a driver. Therefore rework pcim_doe_create_mb() into the non-devres-managed pci_doe_create_mb(). Add pci_doe_destroy_mb() for mailbox destruction on device removal. Provide a devres-managed wrapper under the existing pcim_doe_create_mb() name. The error path of pcim_doe_create_mb() previously called xa_destroy() if alloc_ordered_workqueue() failed. That's unnecessary because the xarray is still empty at that point. It doesn't need to be destroyed until it's been populated by pci_doe_cache_protocols(). Arrange the error path of the new pci_doe_create_mb() accordingly. pci_doe_cancel_tasks() is no longer used as callback for devm_add_action(), so refactor it to accept a struct pci_doe_mb pointer instead of a generic void pointer. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/7c9a63867d70233c5e9d26cd8bf956742cd6d650.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Deduplicate mailbox flushing	Lukas Wunner	1	-6/+3
	When a DOE mailbox is torn down, its workqueue is flushed once in pci_doe_flush_mb() through a call to flush_workqueue() and subsequently flushed once more in pci_doe_destroy_workqueue() through a call to destroy_workqueue(). Deduplicate by dropping flush_workqueue() from pci_doe_flush_mb(). Rename pci_doe_flush_mb() to pci_doe_cancel_tasks() to more aptly describe what it now does. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/1f009f60b326d1c6d776641d4b20aff27de0c234.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Make asynchronous API private	Lukas Wunner	1	-2/+43
	A synchronous API for DOE has just been introduced. CXL (the only in-tree DOE user so far) was converted to use it instead of the asynchronous API. Consequently, pci_doe_submit_task() as well as the pci_doe_task struct are only used internally, so make them private. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/cc19544068483681e91dfe27545c2180cd09f931.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-18	PCI/DOE: Provide synchronous API and use it internally	Lukas Wunner	1	-15/+54
	The DOE API only allows asynchronous exchanges and forces callers to provide a completion callback. Yet all existing callers only perform synchronous exchanges. Upcoming commits for CMA (Component Measurement and Authentication, PCIe r6.0 sec 6.31) likewise require only synchronous DOE exchanges. Provide a synchronous pci_doe() API call which builds on the internal asynchronous machinery. Convert the internal pci_doe_discovery() to the new call. The new API allows submission of const-declared requests, necessitating the addition of a const qualifier in struct pci_doe_task. Tested-by: Ira Weiny <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Reviewed-by: Ming Li <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Reviewed-by: Davidlohr Bueso <[email protected]> Reviewed-by: Dan Williams <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/0f444206da9615c56301fbaff459c0f45d27f122.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <[email protected]>
2023-04-17	PCI/PM: Extend D3hot delay for NVIDIA HDA controllers	Alex Williamson	1	-0/+13
	Assignment of NVIDIA Ampere-based GPUs have seen a regression since the below referenced commit, where the reduced D3hot transition delay appears to introduce a small window where a D3hot->D0 transition followed by a bus reset can wedge the device. The entire device is subsequently unavailable, returning -1 on config space read and is unrecoverable without a host reset. This has been observed with RTX A2000 and A5000 GPU and audio functions assigned to a Windows VM, where shutdown of the VM places the devices in D3hot prior to vfio-pci performing a bus reset when userspace releases the devices. The issue has roughly a 2-3% chance of occurring per shutdown. Restoring the HDA controller d3hot_delay to the effective value before the below commit has been shown to resolve the issue. NVIDIA confirms this change should be safe for all of their HDA controllers. Fixes: 3e347969a577 ("PCI/PM: Reduce D3hot delay with usleep_range()") Link: https://lore.kernel.org/r/[email protected] Reported-by: Zhiyi Guo <[email protected]> Signed-off-by: Alex Williamson <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Tarun Gupta <[email protected]> Cc: Abhishek Sahu <[email protected]> Cc: Tarun Gupta <[email protected]>
2023-04-17	PCI: hv: Enable PCI pass-thru devices in Confidential VMs	Michael Kelley	1	-64/+168
	For PCI pass-thru devices in a Confidential VM, Hyper-V requires that PCI config space be accessed via hypercalls. In normal VMs, config space accesses are trapped to the Hyper-V host and emulated. But in a confidential VM, the host can't access guest memory to decode the instruction for emulation, so an explicit hypercall must be used. Add functions to make the new MMIO read and MMIO write hypercalls. Update the PCI config space access functions to use the hypercalls when such use is indicated by Hyper-V flags. Also, set the flag to allow the Hyper-V PCI driver to be loaded and used in a Confidential VM (a.k.a., "Isolation VM"). The driver has previously been hardened against a malicious Hyper-V host[1]. [1] https://lore.kernel.org/all/[email protected]/ Co-developed-by: Dexuan Cui <[email protected]> Signed-off-by: Dexuan Cui <[email protected]> Signed-off-by: Michael Kelley <[email protected]> Reviewed-by: Boqun Feng <[email protected]> Reviewed-by: Haiyang Zhang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2023-04-16	PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()	Thomas Gleixner	1	-7/+2
	pci_msix_validate_entries() validates the entries array which is handed in by the caller for a MSI-X interrupt allocation. Aside of consistency failures it also detects a failure when the size of the MSI-X hardware table in the device is smaller than the size of the entries array. That's wrong for the case of range allocations where the caller provides the minimum and the maximum number of vectors to allocate, when the hardware size is greater or equal than the mininum, but smaller than the maximum. Remove the hardware size check completely from that function and just ensure that the entires array up to the maximum size is consistent. The limitation and range checking versus the hardware size happens independently of that afterwards anyway because the entries array is optional. Fixes: 4644d22eb673 ("PCI/MSI: Validate MSI-X contiguous restriction early") Reported-by: David Laight <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Cc: [email protected] Link: https://lore.kernel.org/r/87v8i3sg62.ffs@tglx
2023-04-12	PCI: qcom: Add SM8550 PCIe support	Abel Vesa	1	-11/+14
	SM8550 requires two additional clocks for proper working. Add these two clocks as optional clocks (as only required by this platform) and compatible for this platform. While at it, let's also rename the reset variable to "rst" from "pci_reset" to match the existing naming preference. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Abel Vesa <[email protected]> [[email protected]: commit log rewording] Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Konrad Dybcio <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Reviewed-by: Johan Hovold <[email protected]>
2023-04-12	PCI: qcom: Add support for SDX55 SoC	Manivannan Sadhasivam	1	-1/+3
	Add support for SDX55 SoC reusing the 1.9.0 config. The PCIe controller is of version 1.10.0 but it is compatible with the 1.9.0 config. This SoC also requires "sleep" clock which is added as an optional clock in the driver, since it is not required on other SoCs. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-12	PCI: qcom: Enable async probe by default	Manivannan Sadhasivam	1	-0/+1
	Qcom PCIe RC driver waits for the PHY link to be up during the probe; this consumes several milliseconds during boot. Enable async probe by default so that other drivers can load in parallel while this driver waits for the link to be up. Suggested-by: Rob Herring <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Johan Hovold <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Johan Hovold <[email protected]>
2023-04-12	PCI: qcom: Add support for system suspend and resume	Manivannan Sadhasivam	1	-0/+62
	During the system suspend, vote for minimal interconnect bandwidth (1KiB) to keep the interconnect path active for config access and also turn OFF the resources like clock and PHY if there are no active devices connected to the controller. For the controllers with active devices, the resources are kept ON as removing the resources will trigger access violation during the late end of suspend cycle as kernel tries to access the config space of PCIe devices to mask the MSIs. Also, it is not desirable to put the link into L2/L3 state as that implies VDD supply will be removed and the devices may go into powerdown state. This will affect the lifetime of storage devices like NVMe. And finally, during resume, turn ON the resources if the controller was truly suspended (resources OFF) and update the interconnect bandwidth based on PCIe Gen speed. Suggested-by: Krishna chaitanya chundru <[email protected]> Link: https://lore.kernel.org/r/[email protected] Tested-by: Johan Hovold <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Johan Hovold <[email protected]> Acked-by: Dhruva Gole <[email protected]>
2023-04-11	PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter	Mika Westerberg	4	-18/+16
	All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the definition of PCIE_RESET_READY_POLL_MS into pci.c, the only user. [bhelgaas: extracted from https://lore.kernel.org/r/[email protected]] Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-04-11	PCI/PM: Increase wait time after resume	Mika Westerberg	1	-1/+2
	PCIe r6.0 sec 6.6.1 prescribes that a device must be able to respond to config requests within 1.0 s (PCI_RESET_WAIT) after exiting conventional reset and this same delay is prescribed when coming out of D3cold (as that involves reset too). A device that requires more than 1 second to initialize after reset may respond to config requests with Request Retry Status completions (sec 2.3.1), and we accommodate that in Linux with a 60 second cap (PCIE_RESET_READY_POLL_MS). Previously we waited up to PCIE_RESET_READY_POLL_MS only in the reset code path, not in the resume path. However, a device has surfaced, namely Intel Titan Ridge xHCI, which requires a longer delay also in the resume code path. Make the resume code path to use this same extended delay as the reset path. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216728 Link: https://lore.kernel.org/r/[email protected] Reported-by: Chris Chiu <[email protected]> Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Lukas Wunner <[email protected]>
2023-04-11	Merge tag 'pci-v6.3-fixes-2' of ↵	Linus Torvalds	1	-2/+3
	git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Provide pci_msix_can_alloc_dyn() stub when CONFIG_PCI_MSI unset to avoid build errors (Reinette Chatre) - Quirk AMD XHCI controller that loses MSI-X state in D3hot to avoid broken USB after hotplug or suspend/resume (Basavaraj Natikar) - Fix use-after-free in pci_bus_release_domain_nr() (Rob Herring) * tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Fix use-after-free in pci_bus_release_domain_nr() x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()
2023-04-11	PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock	Lukas Wunner	1	-0/+15
	In 2013, commits 2e35afaefe64 ("PCI: pciehp: Add reset_slot() method") 608c388122c7 ("PCI: Add slot reset option to pci_dev_reset()") amended PCIe hotplug to mask Presence Detect Changed events during a Secondary Bus Reset. The reset thus no longer causes gratuitous slot bringdown and bringup. However the commits neglected to serialize reset with code paths reading slot registers. For instance, a slot bringup due to an earlier hotplug event may see the Presence Detect State bit cleared during a concurrent Secondary Bus Reset. In 2018, commit 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") retrofitted the missing locking. It introduced a reset_lock which serializes a Secondary Bus Reset with other parts of pciehp. Unfortunately the locking turns out to be overzealous: reset_lock is held for the entire enumeration and de-enumeration of hotplugged devices, including driver binding and unbinding. Driver binding and unbinding acquires device_lock while the reset_lock of the ancestral hotplug port is held. A concurrent Secondary Bus Reset acquires the ancestral reset_lock while already holding the device_lock. The asymmetric locking order in the two code paths can lead to AB-BA deadlocks. Michael Haeuptle reports such deadlocks on simultaneous hot-removal and vfio release (the latter implies a Secondary Bus Reset): pciehp_ist() # down_read(reset_lock) pciehp_handle_presence_or_link_change() pciehp_disable_slot() __pciehp_disable_slot() remove_board() pciehp_unconfigure_device() pci_stop_and_remove_bus_device() pci_stop_bus_device() pci_stop_dev() device_release_driver() device_release_driver_internal() __device_driver_lock() # device_lock() SYS_munmap() vfio_device_fops_release() vfio_device_group_close() vfio_device_close() vfio_device_last_close() vfio_pci_core_close_device() vfio_pci_core_disable() # device_lock() __pci_reset_function_locked() pci_reset_bus_function() pci_dev_reset_slot_function() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Ian May reports the same deadlock on simultaneous hot-removal and an AER-induced Secondary Bus Reset: aer_recover_work_func() pcie_do_recovery() aer_root_reset() pci_bus_error_reset() pci_slot_reset() pci_slot_lock() # device_lock() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Fix by releasing the reset_lock during driver binding and unbinding, thereby splitting and shrinking the critical section. Driver binding and unbinding is protected by the device_lock() and thus serialized with a Secondary Bus Reset. There's no need to additionally protect it with the reset_lock. However, pciehp does not bind and unbind devices directly, but rather invokes PCI core functions which also perform certain enumeration and de-enumeration steps. The reset_lock's purpose is to protect slot registers, not enumeration and de-enumeration of hotplugged devices. That would arguably be the job of the PCI core, not the PCIe hotplug driver. After all, an AER-induced Secondary Bus Reset may as well happen during boot-time enumeration of the PCI hierarchy and there's no locking to prevent that either. Exempting de-enumeration from the reset_lock is relatively harmless: A concurrent Secondary Bus Reset may foil config space accesses such as PME interrupt disablement. But if the device is physically gone, those accesses are pointless anyway. If the device is physically present and only logically removed through an Attention Button press or the sysfs "power" attribute, PME interrupts as well as DMA cannot come through because pciehp_unconfigure_device() disables INTx and Bus Master bits. That's still protected by the reset_lock in the present commit. Exempting enumeration from the reset_lock also has limited impact: The exempted call to pci_bus_add_device() may perform device accesses through pcibios_bus_add_device() and pci_fixup_device() which are now no longer protected from a concurrent Secondary Bus Reset. Otherwise there should be no impact. In essence, the present commit seeks to fix the AB-BA deadlocks while still retaining a best-effort reset protection for enumeration and de-enumeration of hotplugged devices -- until a general solution is implemented in the PCI core. Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/[email protected] Link: https://lore.kernel.org/linux-pci/[email protected] Link: https://lore.kernel.org/linux-pci/[email protected] Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: 5b3f7b7d062b ("PCI: pciehp: Avoid slot access during reset") Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de Reported-by: Michael Haeuptle <[email protected]> Reported-by: Ian May <[email protected]> Reported-by: Andrey Grodzovsky <[email protected]> Reported-by: Rahul Kumar <[email protected]> Reported-by: Jialin Zhang <[email protected]> Tested-by: Anatoli Antonovitch <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: [email protected] # v4.19+ Cc: Dan Stein <[email protected]> Cc: Ashok Raj <[email protected]> Cc: Alex Michon <[email protected]> Cc: Xiongfeng Wang <[email protected]> Cc: Alex Williamson <[email protected]> Cc: Mika Westerberg <[email protected]> Cc: Sathyanarayanan Kuppuswamy <[email protected]>
2023-04-11	PCI: qcom: Expose link transition counts via debugfs	Manivannan Sadhasivam	1	-2/+63
	Qualcomm PCIe controllers have debug registers in the MHI region that count PCIe link transitions. Expose them over debugfs to userspace to help debug the low power issues. Note that even though the registers are prefixed as PARF_, they don't live under the "parf" register region. The register naming is following the Qualcomm's internal documentation as like other registers. While at it, let's arrange the local variables in probe function to follow reverse XMAS tree order. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Rename qcom_pcie_config_sid_sm8250() to reflect IP version	Manivannan Sadhasivam	1	-72/+71
	qcom_pcie_config_sid_sm8250() function no longer applies only to SM8250. So let's rename it to reflect the actual IP version and also move its definition to keep it sorted as per IP revisions. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use macros for defining total no. of clocks & supplies	Manivannan Sadhasivam	1	-4/+6
	To keep uniformity, let's use macros to define the total number of clocks and supplies in qcom_pcie_resources_{2_7_0/2_9_0} structs. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.4.0	Manivannan Sadhasivam	1	-208/+30
	All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. It should be noted that there were delays in-between the reset asserts and deasserts. But going by the config used by other revisions, those delays are not really necessary. So a single delay after all asserts and one after deasserts is used. The total number of resets supported is 12 but only ipq4019 is using all of them. Link: https://lore.kernel.org/r/[email protected] Tested-by: Sricharan Ramabadhran <[email protected]> Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.3.3	Manivannan Sadhasivam	1	-26/+23
	All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.3	Manivannan Sadhasivam	1	-68/+20
	All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.2	Manivannan Sadhasivam	1	-57/+15
	All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 1.0.0	Manivannan Sadhasivam	1	-53/+19
	All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.1.0	Manivannan Sadhasivam	1	-95/+34
	All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. While at it, let's also move the qcom_pcie_resources_2_1_0 struct below qcom_pcie_resources_1_0_0 to keep it sorted. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use lower case for hex	Manivannan Sadhasivam	1	-7/+7
	To maintain uniformity, let's use lower case for representing hexadecimal numbers. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Add missing macros for register fields	Manivannan Sadhasivam	1	-17/+25
	Some of the registers are changed using hardcoded bitfields without macros. This provides no information on what the register setting is about. So add the macros to those fields for making the code more understandable. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Use bitfield definitions for register fields	Manivannan Sadhasivam	1	-7/+7
	To maintain uniformity throughout the driver and also to make the code easier to read, let's make use of bitfield definitions for register fields. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Sort and group registers and bitfield definitions	Manivannan Sadhasivam	1	-45/+63
	Sorting the registers and their bit definitions will make it easier to add more definitions in the future and it also helps in maintenance. While at it, let's also group the registers and bit definitions separately as done in the pcie-qcom-ep driver. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Remove PCIE20_ prefix from register definitions	Manivannan Sadhasivam	1	-93/+91
	The PCIE part is redundant and 20 doesn't represent anything across the SoCs supported now. So let's get rid of the prefix. This involves adding the IP version suffix to one definition of PARF_SLV_ADDR_SPACE_SIZE that defines offset specific to that version. The other definition is generic for the rest of the versions. Also, the register PCIE20_LNK_CONTROL2_LINK_STATUS2 is not used anywhere, hence removed. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-04-11	PCI: qcom: Fix the incorrect register usage in v2.7.0 config	Manivannan Sadhasivam	1	-5/+3
	Qcom PCIe IP version v2.7.0 and its derivatives don't contain the PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT register. Instead, they have the new PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2 register. So fix the incorrect register usage which is modifying a different register. Also in this IP version, this register change doesn't depend on MSI being enabled. So remove that check also. Link: https://lore.kernel.org/r/[email protected] Fixes: ed8cc3b1fc84 ("PCI: qcom: Add support for SDM845 PCIe controller") Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Cc: <[email protected]> # 5.6+
2023-04-09	Merge tag 'cxl-fixes-6.3-rc6' of ↵	Linus Torvalds	1	-12/+18
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link (cxl) fixes from Dan Williams: "Several fixes for driver startup regressions that landed during the merge window as well as some older bugs. The regressions were due to a lack of testing with what the CXL specification calls Restricted CXL Host (RCH) topologies compared to the testing with Virtual Host (VH) CXL topologies. A VH topology is typical PCIe while RCH topologies map CXL endpoints as Root Complex Integrated endpoints. The impact is some driver crashes on startup. This merge window also added compatibility for range registers (the mechanism that CXL 1.1 defined for mapping memory) to treat them like HDM decoders (the mechanism that CXL 2.0 defined for mapping Host-managed Device Memory). That work collided with the new region enumeration code that was tested with CXL 2.0 setups, and fails with crashes at startup. Lastly, the DOE (Data Object Exchange) implementation for retrieving an ACPI-like data table from CXL devices is being reworked for v6.4. Several fixes fell out of that work that are suitable for v6.3. All of this has been in linux-next for a while, and all reported issues [1] have been addressed. Summary: - Fix several issues with region enumeration in RCH topologies that can trigger crashes on driver startup or shutdown. - Fix CXL DVSEC range register compatibility versus region enumeration that leads to startup crashes - Fix CDAT endiannes handling - Fix multiple buffer handling boundary conditions - Fix Data Object Exchange (DOE) workqueue usage vs CONFIG_DEBUG_OBJECTS warn splats" Link: http://lore.kernel.org/r/[email protected] [1] * tag 'cxl-fixes-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/hdm: Extend DVSEC range register emulation for region enumeration cxl/hdm: Limit emulation to the number of range registers cxl/region: Move coherence tracking into cxl_region_attach() cxl/region: Fix region setup/teardown for RCDs cxl/port: Fix find_cxl_root() for RCDs and simplify it cxl/hdm: Skip emulation when driver manages mem_enable cxl/hdm: Fix double allocation of @cxlhdm PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y cxl/pci: Handle excessive CDAT length cxl/pci: Handle truncated CDAT entries cxl/pci: Handle truncated CDAT header cxl/pci: Fix CDAT retrieval on big endian
2023-04-07	PCI/EDR: Add edr_handle_event() comments	Bjorn Helgaas	1	-1/+10
	EDR documentation is a bit sketchy. Add a couple comments to edr_handle_event() about the devices involved. Link: https://lore.kernel.org/r/20230407215259.GA3825733@bhelgaas Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-04-07	PCI/EDR: Clear Device Status after EDR error recovery	Kuppuswamy Sathyanarayanan	1	-0/+1
	During EDR recovery, the OS must clear error status of the port that triggered DPC even if firmware retains control of DPC and AER (see the implementation note in the PCI Firmware spec r3.3, sec 4.6.12). Prior to 068c29a248b6 ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER"), the port Device Status was cleared in this path: edr_handle_event dpc_process_error(dev) # "dev" triggered DPC pcie_do_recovery(dev, dpc_reset_link) dpc_reset_link # exit DPC pcie_clear_device_status(dev) # clear Device Status After 068c29a248b6, pcie_do_recovery() no longer clears Device Status when firmware controls AER, so the error bit remains set even after recovery. Per the "Downstream Port Containment configuration control" bit in the returned _OSC Control Field (sec 4.5.1), the OS is allowed to clear error status until it evaluates _OST, so clear Device Status in edr_handle_event() if the error recovery was successful. [bhelgaas: commit log] Fixes: 068c29a248b6 ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER") Link: https://lore.kernel.org/r/20230315235449.1279209-1-sathyanarayanan.kuppuswamy@linux.intel.com Reported-by: Tsaur Erwin <[email protected]> Signed-off-by: Kuppuswamy Sathyanarayanan <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-04-06	PCI: Fix use-after-free in pci_bus_release_domain_nr()	Rob Herring	1	-2/+3
	Commit c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") introduced a use-after-free bug in the bus removal cleanup. The issue was found with kfence: [ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70 [ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115): [ 19.309677] pci_bus_release_domain_nr+0x10/0x70 [ 19.309691] dw_pcie_host_deinit+0x28/0x78 [ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.309752] platform_probe+0x90/0xd8 ... [ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k [ 19.311469] allocated by task 96 on cpu 10 at 19.279323s: [ 19.311562] __kmem_cache_alloc_node+0x260/0x278 [ 19.311571] kmalloc_trace+0x24/0x30 [ 19.311580] pci_alloc_bus+0x24/0xa0 [ 19.311590] pci_register_host_bridge+0x48/0x4b8 [ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8 [ 19.311613] pci_host_probe+0x18/0xc0 [ 19.311623] dw_pcie_host_init+0x2c0/0x568 [ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194] [ 19.311647] platform_probe+0x90/0xd8 ... [ 19.311782] freed by task 96 on cpu 10 at 19.285833s: [ 19.311799] release_pcibus_dev+0x30/0x40 [ 19.311808] device_release+0x30/0x90 [ 19.311814] kobject_put+0xa8/0x120 [ 19.311832] device_unregister+0x20/0x30 [ 19.311839] pci_remove_bus+0x78/0x88 [ 19.311850] pci_remove_root_bus+0x5c/0x98 [ 19.311860] dw_pcie_host_deinit+0x28/0x78 [ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.311900] platform_probe+0x90/0xd8 ... [ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4 [ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022 [ 19.325852] Workqueue: events_unbound deferred_probe_work_func The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't directly call pci_bus_release_domain_nr(). The issue turns out to be in pci_remove_root_bus() which first calls pci_remove_bus() which frees the struct pci_bus when its struct device is released. Then pci_bus_release_domain_nr() is called and accesses the freed struct pci_bus. Reordering these fixes the issue. Fixes: c14f7ccc9f5d ("PCI: Assign PCI domain IDs by ida_alloc()") Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Reported-by: Jon Hunter <[email protected]> Tested-by: Jon Hunter <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Cc: [email protected] # v6.2+ Cc: Pali Rohár <[email protected]>
2023-04-06	PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-doc	Cai Huoqing	1	-2/+1
	Remove reference to pci_p2pmem_dma(), which has never existed. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Cai Huoqing <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Logan Gunthorpe <[email protected]>
2023-04-05	PCI: Make pci_bus_for_each_resource() index optional	Andy Shevchenko	5	-17/+13
	Refactor pci_bus_for_each_resource() in the same way as pci_dev_for_each_resource(). This allows the index to be hidden inside the implementation so the caller can omit it when it's not used otherwise. No functional changes intended. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Krzysztof Wilczyński <[email protected]> Reviewed-by: Philippe Mathieu-Daudé <[email protected]>
2023-04-04	PCI: Introduce pci_dev_for_each_resource()	Mika Westerberg	5	-38/+19
	Instead of open-coding it everywhere introduce a tiny helper that can be used to iterate over each resource of a PCI device, and convert the most obvious users into it. While at it drop doubled empty line before pdev_sort_resources(). No functional changes intended. Suggested-by: Andy Shevchenko <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Krzysztof Wilczyński <[email protected]>