aboutsummaryrefslogtreecommitdiff
path: root/drivers/pci
AgeCommit message (Collapse)AuthorFilesLines
2023-02-22Merge branch 'pci/resource'Bjorn Helgaas2-67/+171
- Realign space as required by bridge windows after dividing it up (Mika Westerberg) - Account for space required by other devices on the bus before distributing it all to bridges (Mika Westerberg) - Distribute spare resources to root bus devices as well as to other hotplug bridges (Mika Westerberg) - Fix bug that dropped root bus resources that end at zero, e.g., a host bridge that leads only to bus 00 (Geert Uytterhoeven) * pci/resource: PCI: Fix dropping valid root bus resources with .end = zero PCI: Distribute available resources for root buses, too PCI: Take other bus devices into account when distributing resources PCI: Align extra resources for hotplug bridges properly
2023-02-22Merge branch 'pci/reset'Bjorn Helgaas4-38/+43
- Always observe reset delay when waking devices from D3cold, e.g., after system sleep, regardless of whether we're allowed to runtime-suspend to D3cold (Lukas Wunner) - Unify reset and resume delays to wait for downstream devices after a bridge reset (Lukas Wunner) - Wait for downstream devices after a DPC-induced bridge reset (Lukas Wunner) * pci/reset: PCI/DPC: Await readiness of secondary bus after reset PCI: Unify delay handling for reset and resume PCI/PM: Observe reset delay irrespective of bridge_d3
2023-02-22Merge branch 'pci/pm'Bjorn Helgaas1-14/+31
- Account for _S0W when deciding whether to put bridges in D3 to avoid missing hotplug events (Rafael J. Wysocki) * pci/pm: PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3()
2023-02-22Merge branch 'pci/p2pdma'Bjorn Helgaas1-3/+5
- Annotate RCU dereference (Logan Gunthorpe) * pci/p2pdma: PCI/P2PDMA: Annotate RCU dereference
2023-02-22Merge branch 'pci/kbuild'Bjorn Helgaas12-12/+0
- Remove MODULE_LICENSE from boolean drivers so they don't look like modules so modprobe will complain about them (Nick Alcock) * pci/kbuild: PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules
2023-02-22Merge branch 'pci/iov'Bjorn Helgaas1-1/+1
- Enlarge virtfn sysfs name buffer to prevent buffer overflow (Alexey V. Vissarionov) * pci/iov: PCI/IOV: Enlarge virtfn sysfs name buffer
2023-02-22Merge branch 'pci/hotplug'Bjorn Helgaas2-30/+15
- Add quirk to work around Qualcomm hardware defect in Command Completed signaling (Manivannan Sadhasivam) - Remove locking to allow devices to be marked as disconnected immediately instead of waiting for concurrent bind/unbind to complete (Lukas Wunner) * pci/hotplug: PCI: hotplug: Allow marking devices as disconnected during bind/unbind PCI: pciehp: Add Qualcomm quirk for Command Completed erratum
2023-02-22Merge branch 'pci/enumeration'Bjorn Helgaas4-37/+62
- Implement portdrv .shutdown() method that calls service driver .remove() methods (which disables interrupt generation as required by .shutdown()), but doesn't disable bus mastering (which hangs on Loongson LS7A because of a hardware defect) (Huacai Chen) - Prevent MRRS increases for devices below Loongson LS7A to avoid hardware limitations (Huacai Chen) - Ignore devices with a firmware (DT/ACPI) node that says the device is disabled (Rob Herring) * pci/enumeration: PCI: Honor firmware's device disabled status PCI: loongson: Add more devices that need MRRS quirk PCI: loongson: Prevent LS7A MRRS increases PCI/portdrv: Prevent LS7A Bus Master clearing on shutdown
2023-02-22PCI: dwc: Add Root Port and Endpoint controller eDMA engine supportSerge Semin4-3/+238
Since the DW eDMA core now supports eDMA controllers embedded in locally accessible DW PCIe Root Ports and Endpoints, register these controllers when possible. To do that the DW PCIe core driver needs to perform some preparations first. First of all, it needs to find the eDMA controller CSRs base address, whether they are accessible over the Port Logic or iATU unrolled space. Afterwards it can try to auto-detect the eDMA controller availability and number of read/write channels. If none are found the procedure silently returns without error. Secondly, the platform is supposed to provide either combined or per-channel IRQ signals. If no valid IRQs set is found, the procedure returns without error to be backward compatible with platforms where DW PCIe controllers have eDMA but lack the IRQ description. Finally, before actually probing the eDMA device we need to allocate LLP items buffers. After that the DW eDMA can be registered. If registration is successful, a message regarding the number of detected Read/Write eDMA channels will be printed to the system as is done for the iATU settings. Note: the DW PCI controller driver (either host or endpoint mode) is currently always built-in, so if the DW eDMA core is built as a module (CONFIG_DW_EDMA=m), eDMA controllers will not be registered even if the dw-edma module is later loaded. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Serge Semin <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Acked-by: Vinod Koul <[email protected]>
2023-02-22PCI: bt1: Set 64-bit DMA maskSerge Semin1-0/+4
The DW PCIe Root Port IP core is synthesized with the 64-bit AXI address bus. Since the device is also equipped with the eDMA engine, explicitly set the device DMA mask so DMA engine clients can allocate data buffers anywhere in the 64-bit memory space. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Serge Semin <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-22PCI: dwc: Restrict only coherent DMA mask for MSI address allocationSerge Semin1-1/+11
The MSI target address must be in the lowest 4GB memory to support PCI peripherals without 64-bit MSI support. Since the allocation is done from DMA coherent memory, set only the coherent DMA mask, leaving the streaming DMA mask alone. Thus streaming DMA operations will work with no artificial limitations. It will be specifically useful for the eDMA-capable controllers so the corresponding DMA engine clients would map the DMA buffers with no need for SWIOTLB for buffers allocated above 4GB. Add a brief comment about the reason allocating the MSI target address below 4GB. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Serge Semin <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Robin Murphy <[email protected]>
2023-02-21Merge tag 'hyperv-next-signed-20230220' of ↵Linus Torvalds1-6/+2
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - allow Linux to run as the nested root partition for Microsoft Hypervisor (Jinank Jain and Nuno Das Neves) - clean up the return type of callback functions (Dawei Li) * tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Fix hv_get/set_register for nested bringup Drivers: hv: Make remove callback of hyperv driver void returned Drivers: hv: Enable vmbus driver for nested root partition x86/hyperv: Add an interface to do nested hypercalls Drivers: hv: Setup synic registers in case of nested root partition x86/hyperv: Add support for detecting nested hypervisor
2023-02-21Merge tag 'rcu.2023.02.10a' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes, perhaps most notably: - Throttling callback invocation based on the number of callbacks that are now ready to invoke instead of on the total number of callbacks - Several patches that suppress false-positive boot-time diagnostics, for example, due to lockdep not yet being initialized - Make expedited RCU CPU stall warnings dump stacks of any tasks that are blocking the stalled grace period. (Normal RCU CPU stall warnings have done this for many years) - Lazy-callback fixes to avoid delays during boot, suspend, and resume. (Note that lazy callbacks must be explicitly enabled, so this should not (yet) affect production use cases) - Make kfree_rcu() and friends take advantage of polled grace periods, thus reducing memory footprint by almost two orders of magnitude, admittedly on a microbenchmark This also begins the transition from kfree_rcu(p) to kfree_rcu_mightsleep(p). This transition was motivated by bugs where kfree_rcu(p), which can block, was typed instead of the intended kfree_rcu(p, rh) - SRCU updates, perhaps most notably fixing a bug that causes SRCU to fail when booted on a system with a non-zero boot CPU. This surprising situation actually happens for kdump kernels on the powerpc architecture This also adds an srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side critical section to be handed off from one task to another - Clean up the now-useless SRCU Kconfig option There are a few more commits that are not yet acked or pulled into maintainer trees, and these will be in a pull request for a later merge window - RCU-tasks updates, perhaps most notably these fixes: - A strange interaction between PID-namespace unshare and the RCU-tasks grace period that results in a low-probability but very real hang - A race between an RCU tasks rude grace period on a single-CPU system and CPU-hotplug addition of the second CPU that can result in a too-short grace period - A race between shrinking RCU tasks down to a single callback list and queuing a new callback to some other CPU, but where that queuing is delayed for more than an RCU grace period. This can result in that callback being stranded on the non-boot CPU - Torture-test updates and fixes - Torture-test scripting updates and fixes - Provide additional RCU CPU stall-warning information in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute timeout limit for expedited RCU CPU stall warnings * tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep() kernel/notifier: Remove CONFIG_SRCU init: Remove "select SRCU" fs/quota: Remove "select SRCU" fs/notify: Remove "select SRCU" fs/btrfs: Remove "select SRCU" fs: Remove CONFIG_SRCU drivers/pci/controller: Remove "select SRCU" drivers/net: Remove "select SRCU" drivers/md: Remove "select SRCU" drivers/hwtracing/stm: Remove "select SRCU" drivers/dax: Remove "select SRCU" drivers/base: Remove CONFIG_SRCU rcu: Disable laziness if lazy-tracking says so rcu: Track laziness during boot and suspend rcu: Remove redundant call to rcu_boost_kthread_setaffinity() rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts rcu: Align the output of RCU CPU stall warning messages rcu: Add RCU stall diagnosis information sched: Add helper nr_context_switches_cpu() ...
2023-02-21PCI/MSI: Clarify usage of pci_msix_free_irq()Reinette Chatre1-2/+2
pci_msix_free_irq() is used to free an interrupt on a PCI/MSI-X interrupt domain. The API description specifies that the interrupt to be freed was allocated via pci_msix_alloc_irq_at(). This description limits the usage of pci_msix_free_irq() since pci_msix_free_irq() can also be used to free MSI-X interrupts allocated with, for example, pci_alloc_irq_vectors(). Remove the text stating that the interrupt to be freed had to be allocated with pci_msix_alloc_irq_at(). The needed struct msi_map need not be from pci_msix_alloc_irq_at() but can be created from scratch using pci_irq_vector() to obtain the Linux IRQ number. Highlight that pci_msix_free_irq() cannot be used to disable MSI-X to guide users that, for example, pci_free_irq_vectors() remains to be needed. Signed-off-by: Reinette Chatre <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Link: https://lore.kernel.org/lkml/87r0xsd8j4.ffs@tglx Link: https://lore.kernel.org/r/4c3e7a50d6e70f408812cd7ab199c6b4b326f9de.1676408572.git.reinette.chatre@intel.com
2023-02-20PCI: Avoid FLR for SolidRun SNET DPU rev 1Alvaro Karsz1-0/+8
This patch fixes a FLR bug on the SNET DPU rev 1 by setting the PCI_DEV_FLAGS_NO_FLR_RESET flag. As there is a quirk to avoid FLR (quirk_no_flr), I added a new quirk to check the rev ID before calling to quirk_no_flr. Without this patch, a SNET DPU rev 1 may hang when FLR is applied. Signed-off-by: Alvaro Karsz <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Message-Id: <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2023-02-17PCI: Remove MODULE_LICENSE so boolean drivers don't look like modulesNick Alcock12-12/+0
Since 8b41fc4454e3 ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, MODULE_LICENSE in non-modules causes modprobe to misidentify the object file as a module when it is not, and modprobe might succeed rather than failing with a suitable error message. For tristate modules that can be either built-in or loaded at runtime, modprobe succeeds in both cases: # modprobe ext4 [exit status zero if CONFIG_EXT4_FS=y or =m] For boolean modules like the Standard Hot Plug Controller driver (shpchp) that cannot be loaded at runtime, modprobe should always fail like this: # modprobe shpchp modprobe: FATAL: Module shpchp not found in directory /lib/modules/... [exit status non-zero regardless of CONFIG_HOTPLUG_PCI_SHPC] but prior to this commit, shpchp_core.c contained MODULE_LICENSE, so "modprobe shpchp" silently succeeded when it should have failed. Remove MODULE_LICENSE in files that cannot be built as modules. [bhelgaas: commit log, squash] Suggested-by: Luis Chamberlain <[email protected]> Link: https://lore.kernel.org/r/[email protected]/ Signed-off-by: Nick Alcock <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]> Cc: Luis Chamberlain <[email protected]> Cc: Hitomi Hasegawa <[email protected]> Cc: Rob Herring <[email protected]> Cc: Lorenzo Pieralisi <[email protected]>
2023-02-16PCI: hv: Drop duplicate PCI_MSI dependencyLukas Bulwahn1-1/+1
Commit a474d3fbe287 ("PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN") removed PCI_MSI_IRQ_DOMAIN and made all previous references to it refer to PCI_MSI instead. PCI_HYPERV_INTERFACE already depended on PCI_MSI && PCI_MSI_IRQ_DOMAIN, so we ended up with a redundant dependency on PCI_MSI && PCI_MSI. Drop the duplicate. No functional change. Just a stylistic clean-up. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Lukas Bulwahn <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-16PCI/P2PDMA: Annotate RCU dereferenceLogan Gunthorpe1-3/+5
A dereference of the __rcu pointer was noticed by sparse: drivers/pci/p2pdma.c:199:44: sparse: sparse: dereference of noderef expression Dereference the __rcu pointer using rcu_dereference_protected() instead of accessing it directly. It's safe to use rcu_dereference_protected() because a reference is held on the pgmap's percpu reference counter and thus it cannot disappear. Link: https://lore.kernel.org/r/[email protected] Reported-by: kernel test robot <[email protected]> Signed-off-by: Logan Gunthorpe <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Chaitanya Kulkarni <[email protected]>
2023-02-16PCI/sysfs: Constify struct kobj_type pci_slot_ktypeThomas Weißschuh1-1/+1
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definition to prevent modification at runtime. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Thomas Weißschuh <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-15PCI: hotplug: Allow marking devices as disconnected during bind/unbindLukas Wunner1-30/+13
On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: a6bd101b8f84 ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: [email protected] # v4.20+ Cc: Keith Busch <[email protected]>
2023-02-14PCI: pciehp: Add Qualcomm quirk for Command Completed erratumManivannan Sadhasivam1-0/+2
The Qualcomm PCI bridge device (Device ID 0x010e) found in chipsets such as SC8280XP used in Lenovo Thinkpad X13s, does not set the Command Completed bit unless writes to the Slot Command register change "Control" bits. This results in timeouts like below during boot and resume from suspend: pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago) ... pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x13f1 (issued 107724 msec ago) Add the device to the Command Completed quirk to mark commands "completed" immediately unless they change the "Control" bits. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-14PCI: qcom: Add IPQ8074 Gen3 port supportRobert Marko1-0/+1
IPQ8074 has one Gen2 and one Gen3 port, with Gen2 port already supported. Add compatible for Gen3 port which uses the same controller as IPQ6018. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Robert Marko <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-14PCI: qcom: Fix host-init error handlingJohan Hovold1-1/+12
Implement the new host_deinit() callback so that the PHY is powered off and regulators and clocks are disabled also on late host-init errors. Link: https://lore.kernel.org/r/[email protected] Fixes: 82a823833f4e ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Manivannan Sadhasivam <[email protected]>
2023-02-14PCI: qcom: Add SM8350 supportDmitry Baryshkov1-0/+1
Add support for the PCIe host on Qualcomm SM8350 platform. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Baryshkov <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]>
2023-02-13PCI: Add ACS quirk for Wangxun NICsMengyuan Lou1-0/+22
Wangxun has verified there is no peer-to-peer between functions for the below selection of SFxxx, RP1000 and RP2000 NICS. They may be multi-function devices, but the hardware does not advertise ACS capability. Add an ACS quirk for these devices so the functions can be in independent IOMMU groups. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mengyuan Lou <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-13PCI: Fix dropping valid root bus resources with .end = zeroGeert Uytterhoeven1-1/+1
On r8a7791/koelsch: kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) # cat /sys/kernel/debug/kmemleak unreferenced object 0xc3a34e00 (size 64): comm "swapper/0", pid 1, jiffies 4294937460 (age 199.080s) hex dump (first 32 bytes): b4 5d 81 f0 b4 5d 81 f0 c0 b0 a2 c3 00 00 00 00 .]...].......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<fe3aa979>] __kmalloc+0xf0/0x140 [<34bd6bc0>] resource_list_create_entry+0x18/0x38 [<767046bc>] pci_add_resource_offset+0x20/0x68 [<b3f3edf2>] devm_of_pci_get_host_bridge_resources.constprop.0+0xb0/0x390 When coalescing two resources for a contiguous aperture, the second resource is enlarged to cover the full contiguous range, while the first resource is marked invalid. This invalidation is done by clearing the flags, start, and end members. When adding the initial resources to the bus later, invalid resources are skipped. Unfortunately, the check for an invalid resource considers only the end member, causing false positives. E.g. on r8a7791/koelsch, root bus resource 0 ("bus 00") is skipped, and no longer registered with pci_bus_insert_busn_res() (causing the memory leak), nor printed: pci-rcar-gen2 ee090000.pci: host bridge /soc/pci@ee090000 ranges: pci-rcar-gen2 ee090000.pci: MEM 0x00ee080000..0x00ee08ffff -> 0x00ee080000 pci-rcar-gen2 ee090000.pci: PCI: revision 11 pci-rcar-gen2 ee090000.pci: PCI host bridge to bus 0000:00 -pci_bus 0000:00: root bus resource [bus 00] pci_bus 0000:00: root bus resource [mem 0xee080000-0xee08ffff] Fix this by only skipping resources where all of the flags, start, and end members are zero. Fixes: 7c3855c423b17f6c ("PCI: Coalesce host bridge contiguous apertures") Link: https://lore.kernel.org/r/da0fcd5e86c74239be79c7cb03651c0fce31b515.1676036673.git.geert+renesas@glider.be Tested-by: Niklas Schnelle <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Kai-Heng Feng <[email protected]>
2023-02-14PCI: endpoint: Use link_up() callback in place of LINK_UP notifierManivannan Sadhasivam2-25/+20
As a part of the transition towards callback mechanism for signalling the events from EPC to EPF, let's use the link_up() callback in the place of the LINK_UP notifier. This also removes the notifier support completely from the PCI endpoint framework. Link: https://lore.kernel.org/linux-pci/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Krzysztof Wilczyński <[email protected]> Acked-by: Kishon Vijay Abraham I <[email protected]>
2023-02-14PCI: endpoint: Use callback mechanism for passing events from EPC to EPFManivannan Sadhasivam2-8/+16
Instead of using the notifiers for passing the events from EPC to EPF, let's introduce a callback based mechanism where the EPF drivers can populate relevant callbacks for EPC events they want to subscribe. The use of notifiers in kernel is not recommended if there is a real link between the sender and receiver, like in this case. Also, the existing atomic notifier forces the notification functions to be in atomic context while the caller may be in non-atomic context. For instance, the two in-kernel users of the notifiers, pcie-qcom and pcie-tegra194, both are calling the notifier functions in non-atomic context (from threaded IRQ handlers). This creates a sleeping in atomic context issue with the existing EPF_TEST driver that calls the EPC APIs that may sleep. For all these reasons, let's get rid of the notifier chains and use the simple callback mechanism for signalling the events from EPC to EPF drivers. This preserves the context of the caller and avoids the latency of going through a separate interface for triggering the notifications. As a first step of the transition, the core_init() callback is introduced in this commit, that'll replace the existing CORE_INIT notifier used for signalling the init complete event from EPC. During the occurrence of the event, EPC will go over the list of EPF drivers attached to it and will call the core_init() callback if available. Link: https://lore.kernel.org/linux-pci/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Krzysztof Wilczyński <[email protected]> Acked-by: Kishon Vijay Abraham I <[email protected]>
2023-02-14PCI: endpoint: Use a separate lock for protecting epc->pci_epf listManivannan Sadhasivam1-4/+5
The EPC controller maintains a list of EPF drivers added to it. For protecting this list against the concurrent accesses, the epc->lock (used for protecting epc_ops) has been used so far. Since there were no users trying to use epc_ops and modify the pci_epf list simultaneously, this was not an issue. But with the addition of callback mechanism for passing the events, this will be a problem. Because the pci_epf list needs to be iterated first for getting hold of the EPF driver and then the relevant event specific callback needs to be called for the driver. If the same epc->lock is used, then it will result in a deadlock scenario. For instance, ... mutex_lock(&epc->lock); list_for_each_entry(epf, &epc->pci_epf, list) { epf->event_ops->core_init(epf); | |-> pci_epc_set_bar(); | |-> mutex_lock(&epc->lock) # DEADLOCK ... So to fix this issue, use a separate lock called "list_lock" for protecting the pci_epf list against the concurrent accesses. This lock will also be used by the callback mechanism. Link: https://lore.kernel.org/linux-pci/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Krzysztof Wilczyński <[email protected]> Acked-by: Kishon Vijay Abraham I <[email protected]>
2023-02-14PCI: tegra194: Move dw_pcie_ep_linkup() to threaded IRQ handlerManivannan Sadhasivam1-2/+7
dw_pcie_ep_linkup() may take more time to execute depending on the EPF driver implementation. Calling this API in the hard IRQ handler is not encouraged since the hard IRQ handlers are supposed to complete quickly. So move the dw_pcie_ep_linkup() call to threaded IRQ handler. Link: https://lore.kernel.org/linux-pci/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Krzysztof Wilczyński <[email protected]> Reviewed-by: Vidya Sagar <[email protected]>
2023-02-14PCI: dra7xx: Use threaded IRQ handler for "dra7xx-pcie-main" IRQManivannan Sadhasivam1-1/+1
The "dra7xx-pcie-main" hard IRQ handler is just printing the IRQ status and calling the dw_pcie_ep_linkup() API if LINK_UP status is set. But the execution of dw_pcie_ep_linkup() depends on the EPF driver and may take more time depending on the EPF implementation. In general, hard IRQ handlers are supposed to return quickly and not block for so long. Moreover, there is no real need of the current IRQ handler to be a hard IRQ handler. So switch to the threaded IRQ handler for the "dra7xx-pcie-main" IRQ. Link: https://lore.kernel.org/linux-pci/[email protected] Signed-off-by: Manivannan Sadhasivam <[email protected]> Signed-off-by: Krzysztof Wilczyński <[email protected]> Acked-by: Kishon Vijay Abraham I <[email protected]>
2023-02-13PCI: Honor firmware's device disabled statusRob Herring1-0/+2
If a device has a firmware node (DT/ACPI), and the device is marked disabled, that is currently ignored. Add a check for this condition and bail out creating the pci_dev. This assumes the config space for the device can still be accessed because they already have by this point in order to identify the device. Link: https://lore.kernel.org/r/[email protected] Tested-by: Binbin Zhou <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: Liu Peibao <[email protected]> Cc: Huacai Chen <[email protected]>
2023-02-13PCI: loongson: Add more devices that need MRRS quirkHuacai Chen1-9/+24
Loongson-2K SOC and LS7A2000 chipset add new PCI IDs that need MRRS quirk. Add them. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Huacai Chen <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-10Merge tag 'pci-v6.2-fixes-2' of ↵Linus Torvalds3-88/+34
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Bjorn Helgaas: - Move to a shared PCI git tree (Bjorn Helgaas) - Add Krzysztof Wilczyński as another PCI maintainer (Lorenzo Pieralisi) - Revert a couple ASPM patches to fix suspend/resume regressions (Bjorn Helgaas) * tag 'pci-v6.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming" Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume" MAINTAINERS: Promote Krzysztof to PCI controller maintainer MAINTAINERS: Move to shared PCI tree
2023-02-10Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming"Bjorn Helgaas1-40/+34
This reverts commit 5e85eba6f50dc288c22083a7e213152bcc4b8208. Thomas Witt reported that 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") broke suspend/resume on a Tuxedo Infinitybook S 14 v5, which seems to use a Clevo L140CU Mainboard. The main symptom is: iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible and the machine is only partially usable after resume. It can't run dmesg and can't do a clean reboot. This happens on every suspend/resume cycle. Revert 5e85eba6f50d until we can figure out the root cause. Fixes: 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Reported-by: Thomas Witt <[email protected]> Tested-by: Thomas Witt <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: [email protected] # v6.1+ Cc: Vidya Sagar <[email protected]>
2023-02-10Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"Bjorn Helgaas3-48/+0
This reverts commit 4ff116d0d5fd8a025604b0802d93a2d5f4e465d1. Tasev Nikola and Mark Enriquez reported that resume from suspend was broken in v6.1-rc1. Tasev bisected to a47126ec29f5 ("PCI/PTM: Cache PTM Capability offset"), but we can't figure out how that could be related. Mark saw the same symptoms and bisected to 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"), which does have a connection: it restores L1 Substates configuration while ASPM L1 may be enabled: pci_restore_state pci_restore_aspm_l1ss_state aspm_program_l1ss pci_write_config_dword(PCI_L1SS_CTL1, ctl1) # L1SS restore pci_restore_pcie_state pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++]) # L1 restore which is a problem because PCIe r6.0, sec 5.5.4, requires that: If setting either or both of the enable bits for ASPM L1 PM Substates, both ports must be configured as described in this section while ASPM L1 is disabled. Separately, Thomas Witt reported that 5e85eba6f50d ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") broke suspend/resume, and it depends on 4ff116d0d5fd. Revert 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") to fix the resume issue and enable revert of 5e85eba6f50d to fix the issue Thomas reported. Note that reverting 4ff116d0d5fd means L1 Substates config may be lost on suspend/resume. As far as we know the system will use more power but will still *work* correctly. Fixes: 4ff116d0d5fd ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Reported-by: Tasev Nikola <[email protected]> Reported-by: Mark Enriquez <[email protected]> Reported-by: Thomas Witt <[email protected]> Tested-by: Mark Enriquez <[email protected]> Tested-by: Thomas Witt <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Cc: [email protected] # v6.1+ Cc: Vidya Sagar <[email protected]>
2023-02-09PCI/DPC: Await readiness of secondary bus after resetLukas Wunner3-5/+8
pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus Reset, but not after a DPC-induced Hot Reset. As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not observed and devices on the secondary bus may be accessed before they're ready. One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a PCIe switch whose upstream port is not immediately ready after reset. Because its config space is restored too early, it remains in D0uninitialized, its subordinate devices remain inaccessible and DPC recovery fails with messages such as: i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible) intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible) pcieport 0000:89:02.0: AER: device recovery failed Fix it. Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Cc: [email protected]
2023-02-09PCI: mvebu: Mark driver as BROKENPali Rohár1-0/+1
People are reporting that pci-mvebu.c driver does not work with recent mainline kernel. There are more bugs which prevents its for daily usage. So lets mark it as broken for now, until somebody would be able to fix it in mainline kernel. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Pali Rohár <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-02-07Merge branch 'for-6.3/cxl' into cxl/nextDan Williams1-0/+1
Merge the general CXL updates with fixes targeting v6.2-rc for v6.3. Resolve a conflict with the fix and move of cxl_report_and_clear() from pci.c to core/pci.c.
2023-02-07PCI: Unify delay handling for reset and resumeLukas Wunner3-32/+34
Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait for devices on the secondary bus to become accessible after reset: Although it does call pci_dev_wait(), it erroneously passes the bridge's pci_dev rather than that of a child. The bridge of course is always accessible while its secondary bus is reset, so pci_dev_wait() returns immediately. Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait() function which is called from pci_bridge_secondary_bus_reset(): https://lore.kernel.org/linux-pci/[email protected]/ However we already have pci_bridge_wait_for_secondary_bus() which does almost exactly what we need. So far it's only called on resume from D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8). Re-using it for Secondary Bus Resets is a leaner and more rational approach than introducing a new function. That only requires a few minor tweaks: - Amend pci_bridge_wait_for_secondary_bus() to await accessibility of the first device on the secondary bus by calling pci_dev_wait() after performing the prescribed delays. pci_dev_wait() needs two parameters, a reset reason and a timeout, which callers must now pass to pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit 821cdad5c46c ("PCI: Wait up to 60 seconds for device to become ready after FLR")). Introduce a PCI_RESET_WAIT macro for the 1 sec timeout. - Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset(). - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which is now performed by pci_bridge_wait_for_secondary_bus(). A static delay this long is only necessary for Conventional PCI, so modern PCIe systems benefit from shorter reset times as a side effect. Fixes: 6b2f1351af56 ("PCI: Wait for device to become ready after secondary bus reset") Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de Reported-by: Sheng Bi <[email protected]> Tested-by: Ravi Kishore Koppuravuri <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Cc: [email protected] # v4.17+
2023-02-07PCI/PM: Observe reset delay irrespective of bridge_d3Lukas Wunner1-1/+1
If a PCI bridge is suspended to D3cold upon entering system sleep, resuming it entails a Fundamental Reset per PCIe r6.0 sec 5.8. The delay prescribed after a Fundamental Reset in PCIe r6.0 sec 6.6.1 is sought to be observed by: pci_pm_resume_noirq() pci_pm_bridge_power_up_actions() pci_bridge_wait_for_secondary_bus() However, pci_bridge_wait_for_secondary_bus() bails out if the bridge_d3 flag is not set. That flag indicates whether a bridge is allowed to suspend to D3cold at *runtime*. Hence *no* delay is observed on resume from system sleep if runtime D3cold is forbidden. That doesn't make any sense, so drop the bridge_d3 check from pci_bridge_wait_for_secondary_bus(). The purpose of the bridge_d3 check was probably to avoid delays if a bridge remained in D0 during suspend. However the sole caller of pci_bridge_wait_for_secondary_bus(), pci_pm_bridge_power_up_actions(), is only invoked if the previous power state was D3cold. Hence the additional bridge_d3 check seems superfluous. Fixes: ad9001f2f411 ("PCI/PM: Add missing link delays required by the PCIe spec") Link: https://lore.kernel.org/r/eb37fa345285ec8bacabbf06b020b803f77bdd3d.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Mika Westerberg <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Cc: [email protected] # v5.5+
2023-02-07PCI: Distribute available resources for root buses, tooMika Westerberg1-1/+56
Previously we distributed spare resources only upon hot-add, so if the initial root bus scan found devices that had not been fully configured by the BIOS, we allocated only enough resources to cover what was then present. If some of those devices were hotplug bridges, we did not leave any additional resource space for future expansion. Distribute the available resources for root buses, too, to make this work the same way as the normal hotplug case. A previous commit to do this was reverted due to a regression reported by Jonathan Cameron: e96e27fc6f79 ("PCI: Distribute available resources for root buses, too") 5632e2beaf9d ("Revert "PCI: Distribute available resources for root buses, too"") This commit changes pci_bridge_resources_not_assigned() to work with bridges that do not have all the resource windows programmed by the boot firmware (previously we expected all I/O, memory and prefetchable memory were programmed). Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000 Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Reported-by: Chris Chiu <[email protected]> Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-07PCI: Take other bus devices into account when distributing resourcesMika Westerberg1-70/+106
A PCI bridge may reside on a bus with other devices as well. The resource distribution code does not take this into account and therefore it expands the bridge resource windows too much, not leaving space for the other devices (or functions of a multifunction device). This leads to an issue that Jonathan reported when running QEMU with the following topology (QEMU parameters): -device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \ -device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \ -device e1000,bus=root_port13,addr=0.1 \ -device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \ -device e1000,bus=fun1 The first e1000 NIC here is another function in the switch upstream port. This leads to following errors: pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04] pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04] pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000] e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0] Fix this by taking into account bridge windows, device BARs and SR-IOV PF BARs on the bus (PF BARs include space for VF BARS so only account PF BARs), including the ones belonging to bridges themselves if it has any. Link: https://lore.kernel.org/linux-pci/[email protected]/ Link: https://lore.kernel.org/linux-pci/[email protected]/ Link: https://lore.kernel.org/r/[email protected] Reported-by: Jonathan Cameron <[email protected]> Reported-by: Alexander Motin <[email protected]> Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-07PCI: Align extra resources for hotplug bridges properlyMika Westerberg1-6/+19
After division the extra resource space per hotplug bridge may not be aligned according to the window alignment, so align it before passing it down for further distribution. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2023-02-03PCI: mt7621: Delay phy ports initializationSergio Paracuellos1-0/+2
Some devices like ZBT WE1326 and ZBT WF3526-P and some Netgear models need to delay phy port initialization after calling the mt7621_pcie_init_port() driver function to get into reliable boots for both warm and hard resets. The delay required to detect the ports seems to be in the range [75-100] milliseconds. If the ports are not detected the controller is not functional. There is no datasheet or something similar to really understand why this extra delay is needed only for these devices and it is not for most of the boards that are built on mt7621 SoC. This issue has been reported by openWRT community and the complete discussion is in [0]. The 100 milliseconds delay has been tested in all devices to validate it. Add the extra 100 milliseconds delay to fix the issue. [0]: https://github.com/openwrt/openwrt/pull/11220 Link: https://lore.kernel.org/r/[email protected] Fixes: 2bdd5238e756 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver") Signed-off-by: Sergio Paracuellos <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2023-02-03PCI: tegra: Convert to devm_of_phy_optional_get()Geert Uytterhoeven1-4/+1
Use the new devm_of_phy_optional_get() helper instead of open-coding the same operation. Signed-off-by: Geert Uytterhoeven <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Link: https://lore.kernel.org/r/56508eeadf7fa8692877e872871f10294d48c49d.1674584626.git.geert+renesas@glider.be Signed-off-by: Vinod Koul <[email protected]>
2023-02-02drivers/pci/controller: Remove "select SRCU"Paul E. McKenney1-1/+1
Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <[email protected]> Cc: Lorenzo Pieralisi <[email protected]> Cc: Rob Herring <[email protected]> Cc: "Krzysztof Wilczyński" <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: <[email protected]> Acked-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: John Ogness <[email protected]>
2023-02-02PCI: vmd: Add quirk to configure PCIe ASPM and LTRDavid E. Box1-1/+54
PCIe ports reserved for VMD use are not visible to BIOS and therefore not configured to enable PCIe ASPM or LTR values (which BIOS will configure if they are not set). Lack of this programming results in high power consumption on laptops as reported in bugzilla. For affected products use pci_enable_link_state to set the allowed link states for devices on the root ports. Also set the LTR value to the maximum value needed for the SoC. This is a workaround for products from Rocket Lake through Alder Lake. Raptor Lake, the latest product at this time, has already implemented LTR configuring in BIOS. Future products will move ASPM configuration back to BIOS as well. As this solution is intended for laptops, support is not added for hotplug or for devices downstream of a switch on the root port. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212355 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215063 Link: https://bugzilla.kernel.org/show_bug.cgi?id=213717 Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Michael Bottini <[email protected]> Signed-off-by: David E. Box <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Jon Derrick <[email protected]> Reviewed-by: Nirmal Patel <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-02-02PCI: vmd: Create feature grouping for client productsDavid E. Box1-18/+10
Simplify the device ID list by creating a grouping of features shared by client products. Suggested-by: Jon Derrick <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: David E. Box <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
2023-02-02PCI: vmd: Use PCI_VDEVICE in device listDavid E. Box1-8/+8
Use PCI_VDEVICE to simplify the device table. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: David E. Box <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Jon Derrick <[email protected]> Reviewed-by: Nirmal Patel <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>