path: root/drivers/pci
Age | Commit message | Author | Files | Lines
2022-12-12 | Merge tag 'irq-core-2022-12-10' of ↵ | Linus Torvalds | 11 | -814/+1316
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:

"Updates for the interrupt core and driver subsystem:

The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device.

IMS (Interrupt Message Store) is a new specification which allows device manufacturers to provide implementation defined storage for MSI messages (as opposed to PCI/MSI and PCI/MSI-X, which have a specified message store that is uniform across all devices). The PCI/MSI[-X] uniformity allowed us to get away with "global" PCI/MSI domains.

IMS not only allows overcoming the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device.

There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background.

When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just provided shared data structures and interfaces so that drivers could be written in an architecture agnostic way.

The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific.
This model is still supported today to keep museum architectures and notorious stragglers alive.

In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation.

At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller.

This resulted in a fundamental redesign of interrupt management and provided the now prevailing concept of hierarchical interrupt domains. This made it possible to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way.

The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this:

                                      |--- device 1
  [Vector]---[Remapping]---[PCI/MSI]--|...
                                      |--- device N

where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relationship in the components of the hierarchy.

While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X].
Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strictly a per PCI device entity.

Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive.

A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack an IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation.

In the course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse.

Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in a works-by-chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them.
But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems.

Last but not least the "global" PCI/MSI-X domain approach prevents utilizing PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is no longer providing a uniform storage and configuration model.

The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this:

                           |--- [PCI/MSI] device 1
  [Vector]---[Remapping]---|...
                           |--- [PCI/MSI] device N

which in turn allows providing support for multiple domains per device:

                           |--- [PCI/MSI] device 1
                           |--- [PCI/IMS] device 1
  [Vector]---[Remapping]---|...
                           |--- [PCI/MSI] device N
                           |--- [PCI/IMS] device N

This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver.

There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well.
Drivers:

- Updates for the LoongArch interrupt chip drivers
- Support for MTK CIRQv2
- The usual small fixes and updates all over the place"

* tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits)
  irqchip/ti-sci-inta: Fix kernel doc
  irqchip/gic-v2m: Mark a few functions __init
  irqchip/gic-v2m: Include arm-gic-common.h
  irqchip/irq-mvebu-icu: Fix works by chance pointer assignment
  iommu/amd: Enable PCI/IMS
  iommu/vt-d: Enable PCI/IMS
  x86/apic/msi: Enable PCI/IMS
  PCI/MSI: Provide pci_ims_alloc/free_irq()
  PCI/MSI: Provide IMS (Interrupt Message Store) support
  genirq/msi: Provide constants for PCI/IMS support
  x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN
  PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X
  PCI/MSI: Provide prepare_desc() MSI domain op
  PCI/MSI: Split MSI-X descriptor setup
  genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN
  genirq/msi: Provide msi_domain_alloc_irq_at()
  genirq/msi: Provide msi_domain_ops::prepare_desc()
  genirq/msi: Provide msi_desc::msi_data
  genirq/msi: Provide struct msi_map
  x86/apic/msi: Remove arch_create_remap_msi_irq_domain()
  ...
2022-12-10 | Merge branch 'pci/kbuild' | Bjorn Helgaas | 16 | -16/+6
- Remove unnecessary <linux/of_irq.h> includes (Bjorn Helgaas) * pci/kbuild: PCI: Drop of_match_ptr() to avoid unused variables PCI: Remove unnecessary <linux/of_irq.h> includes PCI: xgene-msi: Include <linux/irqdomain.h> explicitly PCI: mvebu: Include <linux/irqdomain.h> explicitly PCI: microchip: Include <linux/irqdomain.h> explicitly PCI: altera-msi: Include <linux/irqdomain.h> explicitly # Conflicts: # drivers/pci/controller/pci-mvebu.c
2022-12-10 | Merge branch 'pci/ctrl/xilinx' | Bjorn Helgaas | 1 | -4/+3
- Fix whitespace issues (Michal Simek) * pci/ctrl/xilinx: PCI: xilinx-nwl: Fix coding style violations
2022-12-10 | Merge branch 'pci/ctrl/mvebu' | Bjorn Helgaas | 1 | -34/+17
- Switch to the gpiod API so we can make of_get_named_gpio_flags() private (Dmitry Torokhov) * pci/ctrl/mvebu: PCI: mvebu: Switch to using gpiod API
2022-12-10 | Merge branch 'pci/ctrl/aardvark' | Bjorn Helgaas | 1 | -12/+10
- Switch to using devm_gpiod_get_optional() so we can stop exporting devm_gpiod_get_from_of_node() (Dmitry Torokhov) * pci/ctrl/aardvark: PCI: aardvark: Switch to using devm_gpiod_get_optional()
2022-12-10 | Merge branch 'remotes/lorenzo/pci/misc' | Bjorn Helgaas | 2 | -10/+10
- Register notifier if core_init_notifier is enabled in pci-epf-test (Kunihiko Hayashi) - Fixup Kconfig indentation (Shunsuke Mie) * remotes/lorenzo/pci/misc: PCI: endpoint: Fix Kconfig indent style PCI: pci-epf-test: Register notifier if only core_init_notifier is enabled
2022-12-10 | Merge branch 'remotes/lorenzo/pci/vmd' | Bjorn Helgaas | 1 | -2/+25
- Restore MSI remapping configuration during resume because the configuration is cleared out by firmware when suspending (Nirmal Patel) - Reset the hierarchy below VMD when probing the VMD; we attempted this before, but with the wrong device, so it didn't work (Francisco Munoz) * remotes/lorenzo/pci/vmd: PCI: vmd: Fix secondary bus reset for Intel bridges PCI: vmd: Disable MSI remapping after suspend
2022-12-10 | Merge branch 'remotes/lorenzo/pci/tegra' | Bjorn Helgaas | 1 | -4/+5
- Switch from devm_gpiod_get_from_of_node() to devm_fwnode_gpiod_get() (Dmitry Torokhov) * remotes/lorenzo/pci/tegra: PCI: tegra: Switch to using devm_fwnode_gpiod_get
2022-12-10 | Merge branch 'remotes/lorenzo/pci/qcom' | Bjorn Helgaas | 1 | -0/+76
- Add DT and driver support for SC8280XP/SA8540P basic interconnects where interconnect bandwidth must be requested before enabling interconnect clocks (Johan Hovold) - Add 'dma-coherent' property (Johan Hovold) * remotes/lorenzo/pci/qcom: dt-bindings: PCI: qcom: Allow 'dma-coherent' property PCI: qcom: Add basic interconnect support dt-bindings: PCI: qcom: Add SC8280XP/SA8540P interconnects
2022-12-10 | Merge branch 'remotes/lorenzo/pci/mt7621' | Bjorn Helgaas | 1 | -1/+2
- Add sentinel to mt7621_pcie_quirks_match[] to prevent oops when parsing the table (John Thomson) * remotes/lorenzo/pci/mt7621: PCI: mt7621: Add sentinel to quirks table
2022-12-10 | Merge branch 'remotes/lorenzo/pci/endpoint' | Bjorn Helgaas | 2 | -65/+92
- Add a .release() callback for the Endpoint Controller library so an Endpoint driver is removable (Yoshihiro Shimoda) - Fix pci-epf-vntb kernel-doc and whitespace (Frank Li) - Fix pci-epf-vntb error path usage of pci_epc_mem_free_addr() (Frank Li) - Remove pci-epf-vntb unused epf_db_phy (Frank Li) - Fix pci-epf-vntb sparse warnings (Frank Li) * remotes/lorenzo/pci/endpoint: PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32) PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path PCI: endpoint: pci-epf-vntb: Fix struct epf_ntb_ctrl indentation PCI: endpoint: pci-epf-vntb: Clean up kernel_doc warning PCI: endpoint: Fix WARN() when an endpoint driver is removed
2022-12-10 | Merge branch 'remotes/lorenzo/pci/dwc' | Bjorn Helgaas | 10 | -122/+1009
- Fix n_fts[] array overrun (Vidya Sagar) - Don't advertise PTM Responder role for Endpoints (Vidya Sagar) - Fix qcom "reset assert" error message (Manivannan Sadhasivam) - Downgrade "link didn't come up" message to dev_info (Vidya Sagar) - Initialize PHY before deasserting core reset so the link comes up on boards where the PHY provides the reference clock (this was a regression in v6.0) (Sascha Hauer) - Switch histb to the gpiod API (Dmitry Torokhov) - Fix imx6sx and imx8mq clock names in DT binding (Serge Semin) - Fix visconti MSI interrupt in DT binding (Serge Semin) - Consolidate reset-gpio, cdm, windows info in common DT shared by both Root Port and Endpoint bindings (Serge Semin) - Remove bus node from DT examples (Serge Semin) - Add common phys, phy-names to DT (Serge Semin) - Add default max-link-speed of Gen5 to DT (Serge Semin) - Apply generic schema for generic device (Serge Semin) - Add default max-functions of 32 to DT (Serge Semin) - Add common interrupts, interrupt-names to DT (Serge Semin) - Add common regs, reg-names to DT (Serge Semin) - Add common clocks, resets to DT (Serge Semin) - Add dma-coherent to DT (Serge Semin) - Apply common schema to Rockchip DT (Serge Semin) - Add Baikal-T1 DT bindings (Serge Semin) - Add dma-ranges support in DesignWare core (Serge Semin) - Add dw_pcie_cap_is() for testing controller capabilities (Serge Semin) - Add generic resources getter to DesignWare core (Serge Semin) - Combine iATU detection procedures (Serge Semin) - Add generic clock and reset names to DesignWare core (Serge Semin) - Add Baikal-T1 PCIe controller driver (Serge Semin) * remotes/lorenzo/pci/dwc: PCI: dwc: Add Baikal-T1 PCIe controller support PCI: dwc: Introduce generic platform clocks and resets PCI: dwc: Combine iATU detection procedures PCI: dwc: Introduce generic resources getter PCI: dwc: Introduce generic controller capabilities interface PCI: dwc: Introduce dma-ranges property support for RC-host dt-bindings: PCI: dwc: Add Baikal-T1 
PCIe Root Port bindings dt-bindings: PCI: dwc: Apply common schema to Rockchip DW PCIe nodes dt-bindings: PCI: dwc: Add dma-coherent property dt-bindings: PCI: dwc: Add clocks/resets common properties dt-bindings: PCI: dwc: Add reg/reg-names common properties dt-bindings: PCI: dwc: Add interrupts/interrupt-names common properties dt-bindings: PCI: dwc: Add max-functions EP property dt-bindings: PCI: dwc: Apply generic schema for generic device only dt-bindings: PCI: dwc: Add max-link-speed common property dt-bindings: PCI: dwc: Add phys/phy-names common properties dt-bindings: PCI: dwc: Remove bus node from the examples dt-bindings: PCI: dwc: Detach common RP/EP DT bindings dt-bindings: visconti-pcie: Fix interrupts array max constraints dt-bindings: imx6q-pcie: Fix clock names for imx6sx and imx8mq PCI: histb: Switch to using gpiod API PCI: imx6: Initialize PHY before deasserting core reset PCI: dwc: Use dev_info for PCIe link down event logging PCI: qcom: Fix error message for reset_control_assert() PCI: designware-ep: Disable PTM capabilities for EP mode PCI: Add PCI_PTM_CAP_RES macro PCI: dwc: Fix n_fts[] array overrun
2022-12-10 | Merge branch 'remotes/lorenzo/pci/brcmstb' | Bjorn Helgaas | 1 | -37/+48
- Enable Multi-MSI (Jim Quinlan) - Wait for 100ms after PERST# deassert for power and clocks to stabilize (Jim Quinlan) - Use readl_poll_timeout_atomic() instead of hand-rolled timeout loop (Jim Quinlan) - Drop needless "inline" annotations (Jim Quinlan) - Set RCB_MPS mode bit so data for reads up to MPS are returned in a single completion (Jim Quinlan) * remotes/lorenzo/pci/brcmstb: PCI: brcmstb: Set RCB_{MPS,64B}_MODE bits PCI: brcmstb: Drop needless 'inline' annotations PCI: brcmstb: Replace status loops with read_poll_timeout_atomic() PCI: brcmstb: Wait for 100ms following PERST# deassert PCI: brcmstb: Enable Multi-MSI
2022-12-10 | Merge branch 'pci/sysfs' | Bjorn Helgaas | 1 | -4/+9
- Fix a double free in the error path of creating sysfs "resource%d" attributes (Sascha Hauer) * pci/sysfs: PCI/sysfs: Fix double free in error path
2022-12-10 | Merge branch 'pci/resource' | Bjorn Helgaas | 1 | -0/+4
- Remove EfiMemoryMappedIO regions from the E820 map to allow PCI core to allocate BARs from them. The only purpose of EfiMemoryMappedIO is to tell the OS to map things needed by EFI runtime services, so it's often used for PCI host bridge apertures. If we can't allocate from those apertures, we can't hot-add devices (Bjorn Helgaas) * pci/resource: x86/PCI: Use pr_info() when possible x86/PCI: Fix log message typo x86/PCI: Tidy E820 removal messages PCI: Skip allocate_resource() if too little space available efi/x86: Remove EfiMemoryMappedIO from E820 map
2022-12-10 | Merge branch 'pci/portdrv' | Bjorn Helgaas | 4 | -284/+256
- Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier to find things (Bjorn Helgaas) - Allow AER service only for Root Ports & RCECs so portdrv can successfully bind to other devices that have AER but lack MSI (which they don't need for AER), which allows power management for those devices (Bjorn Helgaas) * pci/portdrv: PCI/portdrv: Allow AER service only for Root Ports & RCECs PCI/portdrv: Unexport pcie_port_service_register(), pcie_port_service_unregister() PCI/portdrv: Move private things to portdrv.c PCI/portdrv: Squash into portdrv.c
2022-12-10 | Merge branch 'pci/pm' | Bjorn Helgaas | 1 | -4/+4
- Remove unused 'state' parameter to pci_legacy_suspend_late() (Bjorn Helgaas) * pci/pm: PCI/PM: Remove unused 'state' parameter to pci_legacy_suspend_late()
2022-12-10 | Merge branch 'pci/misc' | Bjorn Helgaas | 1 | -1/+1
- Use METHOD_NAME__UID instead of plain string to make it easier to find all uses (Yipeng Zou) * pci/misc: PCI/ACPI: Use METHOD_NAME__UID instead of plain string
2022-12-10 | Merge branch 'pci/hotplug' | Bjorn Helgaas | 8 | -25/+22
- Enable pciehp by default if USB4 is enabled because USB4/Thunderbolt tunneling depends on native PCIe hotplug (Albert Zhou) - Make sure pciehp binds only to Downstream Ports, not Upstream Ports (Rafael J. Wysocki) - Remove unused get_mode1_ECC_cap callback in shpchp (Ian Cowan) - Enable pciehp Command Completed Interrupt only if supported to reduce confusion when looking at lspci output (Pali Rohár) * pci/hotplug: PCI: pciehp: Enable Command Completed Interrupt only if supported PCI: shpchp: Remove unused get_mode1_ECC_cap callback PCI: acpiphp: Avoid setting is_hotplug_bridge for PCIe Upstream Ports PCI/portdrv: Set PCIE_PORT_SERVICE_HP for Root and Downstream Ports only PCI: pciehp: Enable by default if USB4 enabled
2022-12-10 | Merge branch 'pci/enumeration' | Bjorn Helgaas | 6 | -47/+85
- Only read/write PCIe Link 2 registers for devices with Links and PCIe Capability version >= 2 (Maciej W. Rozycki) - Revert a patch that cleared PCI_STATUS during enumeration because it broke Linux guests on Apple's virtualization framework (Bjorn Helgaas) - Assign PCI domain IDs using IDAs so IDs can be easily reused after loading/unloading host bridge drivers (Pali Rohár) - Fix pci_device_is_present(), which previously always returned "false" for VFs because their vendor ID is always 0xffff (Michael S. Tsirkin) - Check for alloc failure in pci_request_irq() (Zeng Heng) * pci/enumeration: PCI: Check for alloc failure in pci_request_irq() PCI: Fix pci_device_is_present() for VFs by checking PF PCI: Assign PCI domain IDs by ida_alloc() Revert "PCI: Clear PCI_STATUS when setting up device" PCI: Access Link 2 registers only for devices with Links
2022-12-10 | PCI: Skip allocate_resource() if too little space available | Bjorn Helgaas | 1 | -0/+4
pci_bus_alloc_from_region() allocates MMIO space by iterating through all the resources available on the bus. The available resource might be reduced if the caller requires 32-bit space or we're avoiding BIOS or E820 areas. Don't bother calling allocate_resource() if we need more space than is available in this resource. This prevents some pointless and annoying messages about avoided areas. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Hans de Goede <[email protected]>
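The early-exit check described in this commit can be sketched in userspace C. This is a simplified model under stated assumptions, not the kernel code: `struct range`, `clamp_range()` and `worth_trying()` are hypothetical names introduced only for illustration, and the real function works on `struct resource` and bus regions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an address window with inclusive bounds. */
struct range {
	uint64_t start;
	uint64_t end;
};

/*
 * Shrink 'avail' to the part that also lies inside 'limit', for example
 * the 32-bit addressable region. Returns false if nothing is left.
 */
static bool clamp_range(struct range *avail, const struct range *limit)
{
	if (avail->start < limit->start)
		avail->start = limit->start;
	if (avail->end > limit->end)
		avail->end = limit->end;
	return avail->start <= avail->end;
}

/*
 * Only attempt an allocation if 'size' bytes can fit in what is left.
 * Comparing size - 1 against end - start avoids overflow for a window
 * that spans the whole 64-bit space.
 */
static bool worth_trying(const struct range *avail, uint64_t size)
{
	return size != 0 && size - 1 <= avail->end - avail->start;
}
```

In this model a caller would skip the window (and avoid any log messages the allocator would otherwise emit) whenever `worth_trying()` returns false.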
2022-12-10 | PCI/portdrv: Allow AER service only for Root Ports & RCECs | Bjorn Helgaas | 1 | -1/+3
Previously portdrv allowed the AER service for any device with an AER capability (assuming Linux had control of AER) even though the AER service driver only attaches to Root Port and RCECs. Because get_port_device_capability() included AER for non-RP, non-RCEC devices, we tried to initialize the AER IRQ even though these devices don't generate AER interrupts. Intel DG1 and DG2 discrete graphics cards contain a switch leading to a GPU. The switch supports AER but not MSI, so initializing an AER IRQ failed, and portdrv failed to claim the switch port at all. The GPU itself could be suspended, but the switch could not be put in a low-power state because it had no driver. Don't allow the AER service on non-Root Port, non-Root Complex Event Collector devices. This means we won't enable Bus Mastering if the device doesn't require MSI, the AER service will not appear in sysfs, and the AER service driver will not bind to the device. Link: https://lore.kernel.org/r/[email protected] Link: https://lore.kernel.org/r/[email protected] Based-on-patch-by: Mika Westerberg <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]>
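The rule this commit introduces can be illustrated with a small userspace model. The port type values match the PCI Express Capabilities register encodings; `struct port_model` and `aer_service_allowed()` are hypothetical names for this sketch and do not mirror the layout of get_port_device_capability().

```c
#include <stdbool.h>

/* Device/Port Type encodings from the PCI Express Capabilities register. */
#define PCI_EXP_TYPE_ROOT_PORT	0x4	/* Root Port */
#define PCI_EXP_TYPE_UPSTREAM	0x5	/* Switch Upstream Port */
#define PCI_EXP_TYPE_DOWNSTREAM	0x6	/* Switch Downstream Port */
#define PCI_EXP_TYPE_RC_EC	0xa	/* Root Complex Event Collector */

/* Hypothetical, simplified description of a PCIe port. */
struct port_model {
	int  pcie_type;		/* one of the PCI_EXP_TYPE_* values */
	bool has_aer_cap;	/* device exposes an AER capability */
	bool os_owns_aer;	/* the OS, not firmware, controls AER */
};

/* Offer the AER service only where the AER driver can actually attach. */
static bool aer_service_allowed(const struct port_model *port)
{
	if (!port->has_aer_cap || !port->os_owns_aer)
		return false;

	return port->pcie_type == PCI_EXP_TYPE_ROOT_PORT ||
	       port->pcie_type == PCI_EXP_TYPE_RC_EC;
}
```

Under this model a switch port with AER but without MSI, like the one on the DG1/DG2 cards mentioned above, no longer requests an AER IRQ, so the missing MSI support cannot make the port driver fail to bind.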
2022-12-08 | PCI: xilinx-nwl: Fix coding style violations | Michal Simek | 1 | -4/+3
Fix code alignments and remove additional newline. Link: https://lore.kernel.org/r/17c75e7003bb8c43a0f45ae3d7c45cac230ef852.1670503129.git.michal.simek@amd.com Signed-off-by: Michal Simek <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2022-12-07 | PCI: mvebu: Switch to using gpiod API | Dmitry Torokhov | 1 | -34/+17
Switch the driver away from legacy gpio/of_gpio API to gpiod API, and remove use of of_get_named_gpio_flags() which I want to make private to gpiolib. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]>
2022-12-07 | PCI: pciehp: Enable Command Completed Interrupt only if supported | Pali Rohár | 1 | -1/+3
The No Command Completed Support bit in the Slot Capabilities register indicates whether Command Completed Interrupt Enable is unsupported. We already check whether No Command Completed Support bit is set in pcie_wait_cmd(), and do not wait in this case. Don't enable this Command Completed Interrupt at all if NCCS is set, so that when users dump configuration space from userspace, the dump does not confuse them by saying that Command Completed Interrupt is not supported, but it is enabled. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Pali Rohár <[email protected]> Signed-off-by: Marek Behún <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Lukas Wunner <[email protected]>
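The bit logic behind this change can be sketched as follows. The three register bit values are the standard Slot Capabilities and Slot Control definitions; the helper name `slot_irq_enable_bits()` and the decision to always set the Hot-Plug Interrupt Enable bit are assumptions made for this illustration, not a copy of the pciehp code.

```c
#include <stdint.h>

#define PCI_EXP_SLTCAP_NCCS	0x00040000	/* No Command Completed Support */
#define PCI_EXP_SLTCTL_CCIE	0x0010		/* Command Completed Interrupt Enable */
#define PCI_EXP_SLTCTL_HPIE	0x0020		/* Hot-Plug Interrupt Enable */

/*
 * Hypothetical helper: compute the interrupt enable bits to write to
 * Slot Control, given the value read from Slot Capabilities.
 */
static uint16_t slot_irq_enable_bits(uint32_t slot_cap)
{
	uint16_t cmd = PCI_EXP_SLTCTL_HPIE;

	/* Only request Command Completed interrupts if the slot has them. */
	if (!(slot_cap & PCI_EXP_SLTCAP_NCCS))
		cmd |= PCI_EXP_SLTCTL_CCIE;

	return cmd;
}
```

With NCCS set, the resulting Slot Control value leaves CCIE clear, so a configuration space dump no longer shows an interrupt as enabled on a slot that reports it as unsupported.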
2022-12-07 | PCI: aardvark: Switch to using devm_gpiod_get_optional() | Dmitry Torokhov | 1 | -12/+10
Switch the driver to the generic version of gpiod API (and away from OF-specific variant), so that we can stop exporting devm_gpiod_get_from_of_node(). Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Dmitry Torokhov <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Acked-by: Pali Rohár <[email protected]>
2022-12-06 | PCI: mt7621: Add sentinel to quirks table | John Thomson | 1 | -1/+2
Current driver is missing a sentinel in the struct soc_device_attribute array, which causes an oops when assessed by the soc_device_match(mt7621_pcie_quirks_match) call. This was only exposed once the CONFIG_SOC_MT7621 mt7621 soc_dev_attr was fixed to register the SOC as a device, in: commit 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early") Fix it by adding the required sentinel. Link: https://lore.kernel.org/lkml/[email protected] Link: https://lore.kernel.org/r/[email protected] Fixes: b483b4e4d3f6 ("staging: mt7621-pci: add quirks for 'E2' revision using 'soc_device_attribute'") Signed-off-by: John Thomson <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Sergio Paracuellos <[email protected]>
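Why the sentinel matters can be shown with a self-contained model of a sentinel-terminated match table. `struct soc_attr_model`, `quirks_model[]` and `match_model()` are hypothetical stand-ins for struct soc_device_attribute, the driver's quirks table and soc_device_match(); the table contents are illustrative only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of a SoC attribute entry. */
struct soc_attr_model {
	const char *soc_id;
	const char *revision;
};

static const struct soc_attr_model quirks_model[] = {
	{ .soc_id = "mt7621", .revision = "E2" },
	{ .soc_id = NULL }	/* sentinel: all fields NULL ends the table */
};

/*
 * Walk the table until the all-NULL sentinel. The loop has no length
 * argument, so without the sentinel it would read past the array.
 */
static const struct soc_attr_model *
match_model(const struct soc_attr_model *table, const char *soc_id,
	    const char *revision)
{
	for (; table->soc_id || table->revision; table++) {
		if (table->soc_id && strcmp(table->soc_id, soc_id))
			continue;
		if (table->revision && strcmp(table->revision, revision))
			continue;
		return table;
	}
	return NULL;
}
```

A lookup that matches no entry terminates cleanly at the sentinel and returns NULL, which is exactly the path that ran off the end of the array before the fix.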
2022-12-06 | PCI: vmd: Fix secondary bus reset for Intel bridges | Francisco Munoz | 1 | -2/+20
The reset was never applied in the current implementation because Intel Bridges owned by VMD are parentless. Internally, pci_reset_bus() applies a reset to the parent of the PCI device supplied as argument, but in this case it failed because there wasn't a parent. In more detail, this change allows the VMD driver to enumerate NVMe devices in pass-through configurations when guest reboots are performed. There was an earlier attempt to fix this, but later we discovered that the code inside pci_reset_bus() wasn't triggering secondary bus resets. Therefore, we updated the parameters passed to it, and now NVMe SSDs attached to VMD bridges are properly enumerated in VT-d pass-through scenarios. Link: https://lore.kernel.org/r/[email protected] Fixes: 6aab5622296b ("PCI: vmd: Clean up domain before enumeration") Signed-off-by: Francisco Munoz <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Reviewed-by: Nirmal Patel <[email protected]> Reviewed-by: Jonathan Derrick <[email protected]>
2022-12-05 | PCI/MSI: Provide pci_ims_alloc/free_irq() | Thomas Gleixner | 1 | -0/+50
Single vector allocation which allocates the next free index in the IMS space. The free function releases it. All allocated vectors are also released via pci_free_irq_vectors(), which releases MSI/MSI-X vectors as well. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-12-05 | PCI/MSI: Provide IMS (Interrupt Message Store) support | Thomas Gleixner | 1 | -0/+59
IMS (Interrupt Message Store) is a new specification which allows implementation specific storage of MSI messages contrary to the strict standard specified MSI and MSI-X message stores.

This requires new device specific interrupt domains to handle the implementation defined storage which can be an array in device memory or host/guest memory which is shared with hardware queues.

Add a function to create IMS domains for PCI devices. IMS domains are using the new per device domain mechanism and are configured by the device driver via a template. IMS domains are created as secondary device domains so they work side by side with MSI[-X] on the same device.

The IMS domains have a few constraints:

  - The index space is managed by the core code.

    Device memory based IMS provides a storage array with a fixed size which obviously requires an index. But there is no association between index and functionality so the core can randomly allocate an index in the array.

    System memory based IMS does not have the concept of an index as the storage is somewhere in memory. In that case the index is purely software based to keep track of the allocations.

  - There is no requirement for consecutive index ranges.

    This is currently a limitation of the MSI core and can be implemented if there is a justified use case by changing the internal storage from xarray to maple_tree. For now it's single vector allocation.

  - The interrupt chip must provide the following callbacks:

      - irq_mask()
      - irq_unmask()
      - irq_write_msi_msg()

  - The interrupt chip must provide the following optional callbacks when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks cannot operate directly on hardware, e.g. in the case that the interrupt message store is in queue memory:

      - irq_bus_lock()
      - irq_bus_unlock()

    These callbacks are invoked from preemptible task context and are allowed to sleep. In this case the mandatory callbacks above just store the information. The irq_bus_unlock() callback is supposed to make the change effective before returning.

  - Interrupt affinity setting is handled by the underlying parent interrupt domain and communicated to the IMS domain via irq_write_msi_msg().

    IMS domains cannot have an irq_set_affinity() callback. That's a reasonable restriction similar to the PCI/MSI device domain implementations.

The domain is automatically destroyed when the PCI device is removed.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
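The "store now, apply on irq_bus_unlock()" pattern described in the IMS constraints can be modelled in plain C. Everything here is a hypothetical userspace sketch: `struct ims_slot_model` and the `model_*()` functions only imitate the roles of the irq_mask(), irq_unmask(), irq_write_msi_msg() and irq_bus_unlock() callbacks, and the `hw_*` fields stand in for message storage that cannot be written directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one IMS entry whose storage is slow to reach. */
struct ims_slot_model {
	/* Shadow copy updated by the mandatory callbacks. */
	uint32_t shadow_data;
	bool shadow_masked;
	bool dirty;
	/* State as the device would see it. */
	uint32_t hw_data;
	bool hw_masked;
};

/* The mandatory callbacks only record the requested state. */
static void model_mask(struct ims_slot_model *s)
{
	s->shadow_masked = true;
	s->dirty = true;
}

static void model_unmask(struct ims_slot_model *s)
{
	s->shadow_masked = false;
	s->dirty = true;
}

static void model_write_msg(struct ims_slot_model *s, uint32_t data)
{
	s->shadow_data = data;
	s->dirty = true;
}

/*
 * Counterpart of irq_bus_unlock(): runs in a context that may sleep and
 * makes the recorded changes effective before returning.
 */
static void model_bus_unlock(struct ims_slot_model *s)
{
	if (!s->dirty)
		return;
	s->hw_data = s->shadow_data;
	s->hw_masked = s->shadow_masked;
	s->dirty = false;
}
```

The point of the split is that the recording functions stay cheap and non-blocking, while the potentially slow update of the real message store is batched into a single step at unlock time.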
2022-12-05 | PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X | Thomas Gleixner | 2 | -1/+69
MSI-X vectors can be allocated after the initial MSI-X enablement, but this needs explicit support of the underlying interrupt domains. Provide a function to query the ability and functions to allocate/free individual vectors post-enable. The allocation can either request a specific index in the MSI-X table or, with the index argument MSI_ANY_INDEX, allocate the next free vector. The return value is a struct msi_map which on success contains both index and the Linux interrupt number. In case of failure index is negative and the Linux interrupt number is 0. The allocation function is for a single MSI-X index at a time as that's sufficient for the most urgent use case, VFIO, to get rid of the 'disable MSI-X, reallocate, enable MSI-X' cycle which is prone to lost interrupts and redirections to the legacy and obviously unhandled INTx. Single index allocation is also sufficient for the use case Jason Gunthorpe pointed out: allocation of an MSI-X or IMS vector for a network queue. See Link below. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/all/[email protected] Link: https://lore.kernel.org/r/[email protected]
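The allocation semantics described in this commit (a specific index or "any free index", with a map returned that carries a negative index and interrupt number 0 on failure) can be modelled in userspace. All names below are hypothetical: `struct msi_map_model` imitates struct msi_map, `MODEL_ANY_INDEX` stands in for MSI_ANY_INDEX, and the fixed table size and interrupt number base are arbitrary choices for the sketch.

```c
#include <limits.h>
#include <stdbool.h>

#define MODEL_ANY_INDEX		UINT_MAX	/* stands in for MSI_ANY_INDEX */
#define MODEL_TABLE_SIZE	8		/* arbitrary MSI-X table size */
#define MODEL_VIRQ_BASE		100		/* arbitrary first interrupt number */

/* Imitates struct msi_map: index < 0 and virq == 0 signal failure. */
struct msi_map_model {
	int index;
	int virq;
};

struct msix_table_model {
	bool used[MODEL_TABLE_SIZE];
};

/* Allocate one vector at 'index', or at the first free slot. */
static struct msi_map_model
model_alloc_irq_at(struct msix_table_model *t, unsigned int index)
{
	struct msi_map_model map = { .index = -1, .virq = 0 };
	unsigned int i;

	if (index == MODEL_ANY_INDEX) {
		for (i = 0; i < MODEL_TABLE_SIZE && t->used[i]; i++)
			;
		index = i;
	}

	if (index >= MODEL_TABLE_SIZE || t->used[index])
		return map;

	t->used[index] = true;
	map.index = (int)index;
	map.virq = MODEL_VIRQ_BASE + (int)index;
	return map;
}

/* Release a vector previously returned by model_alloc_irq_at(). */
static void model_free_irq(struct msix_table_model *t, struct msi_map_model map)
{
	if (map.index >= 0 && map.index < MODEL_TABLE_SIZE)
		t->used[map.index] = false;
}
```

Because each call touches exactly one table slot, vectors that are already in use are never disturbed, which is the property the 'disable MSI-X, reallocate, enable MSI-X' cycle lacks.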
2022-12-05 | PCI/MSI: Provide prepare_desc() MSI domain op | Thomas Gleixner | 1 | -0/+9
The setup of MSI descriptors for PCI/MSI-X interrupts depends partially on the MSI index for which the descriptor is initialized. Dynamic MSI-X vector allocation post MSI-X enablement allows allocating vectors at a given index or at any free index in the available table range. The latter requires that the descriptor is initialized after the MSI core has chosen an index. Implement the prepare_desc() op in the PCI/MSI-X specific msi_domain_ops which is invoked before the core interrupt descriptor and the associated Linux interrupt number is allocated. That callback is also provided for the upcoming PCI/IMS implementations so the implementation specific interrupt domain can do its domain specific initialization of the MSI descriptors. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-12-05 | PCI/MSI: Split MSI-X descriptor setup | Thomas Gleixner | 2 | -27/+47
The upcoming mechanism to allocate MSI-X vectors after enabling MSI-X needs to share some of the MSI-X descriptor setup.

The regular descriptor setup on enable has the following code flow:

  1) Allocate descriptor
  2) Setup descriptor with PCI specific data
  3) Insert descriptor
  4) Allocate interrupts which in turn scans the inserted descriptors

This cannot be easily changed because the PCI/MSI code needs to handle the legacy architecture specific allocation model and the irq domain model where quite some domains have the assumption that the above flow is how it works.

Ideally the code flow should look like this:

  1) Invoke allocation at the MSI core
  2) MSI core allocates descriptor
  3) MSI core calls back into the irq domain which fills in the domain specific parts

This could be done for underlying parent MSI domains which support post-enable allocation/free but that would create significantly different code paths for MSI/MSI-X enable.

Though for dynamic allocation which wants to share the allocation code with the upcoming PCI/IMS support it's the right thing to do.

Split the MSI-X descriptor setup into the preallocation part which just sets the index and fills in the horrible hack of virtual IRQs and the real PCI specific MSI-X setup part which solely depends on the index in the descriptor. This allows to provide a common dynamic allocation interface at the MSI core level for both PCI/MSI-X and PCI/IMS.

Signed-off-by: Thomas Gleixner <[email protected]>
Reviewed-by: Kevin Tian <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Acked-by: Marc Zyngier <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
2022-12-05PCI/MSI: Remove unused pci_dev_has_special_msi_domain()Thomas Gleixner1-21/+0
The check for special MSI domains like VMD which prevents the interrupt remapping code from overwriting device::msi::domain is no longer required and has been replaced by an x86 specific version which is aware of MSI parent domains. Remove it. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-12-05PCI/MSI: Add support for per device MSI[X] domainsThomas Gleixner3-5/+201
Provide a template and the necessary callbacks to create PCI/MSI and PCI/MSI-X domains. The domains are created when MSI or MSI-X is enabled. The domain's lifetime is normally the device lifetime. If, e.g., MSI-X was tried first and failed, the MSI-X domain is removed and an MSI domain is created, as both are mutually exclusive and reside in the default domain ID slot of the per device domain pointer array. Also expand pci_msi_domain_supports() to handle feature checks correctly even in the case that the per device domain was not yet created by checking the features supported by the MSI parent. Add the necessary setup calls into the MSI and MSI-X enable code path. These setup calls are backwards compatible. They return success when there is no parent domain found, which means the existing global domains or the legacy allocation path keep just working. Co-developed-by: Ahmed S. Darwish <[email protected]> Signed-off-by: Ahmed S. Darwish <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
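The slot sharing and the backwards compatible return value can be sketched in userspace C as below. The types and functions (`demo_dev`, `demo_setup_domain()`, `demo_enable()`) are hypothetical and model only the behaviour stated in the message: one slot holds either an MSI or an MSI-X domain, a failed MSI-X attempt is replaced by MSI, and a device without an MSI parent domain gets "success" without any domain being created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_domain_type { DEMO_NONE, DEMO_MSI, DEMO_MSIX };

struct demo_dev {
	bool has_parent;            /* is there an MSI parent domain? */
	enum demo_domain_type slot; /* the single default domain slot */
};

/* Returns true on success. Without a parent domain there is nothing to
 * create, and success is reported so the older paths keep working. */
static bool demo_setup_domain(struct demo_dev *dev, enum demo_domain_type want)
{
	assert(dev && want != DEMO_NONE);

	if (!dev->has_parent)
		return true;
	if (dev->slot == want)
		return true; /* matching domain already in place */

	/* MSI and MSI-X are mutually exclusive: replace the slot content. */
	dev->slot = want;
	return true;
}

/* Try MSI-X first and fall back to MSI. msix_works is a stand-in for
 * whatever makes the MSI-X attempt succeed or fail. */
static enum demo_domain_type demo_enable(struct demo_dev *dev, bool msix_works)
{
	if (demo_setup_domain(dev, DEMO_MSIX) && msix_works)
		return DEMO_MSIX;
	if (demo_setup_domain(dev, DEMO_MSI))
		return DEMO_MSI;
	return DEMO_NONE;
}
```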
2022-12-05PCI/MSI: Split __pci_write_msi_msg()Thomas Gleixner1-50/+54
The upcoming per device MSI domains will create different domains for MSI and MSI-X. Split the write message function into MSI and MSI-X helpers so they can be used by those new domain functions separately. Signed-off-by: Ahmed S. Darwish <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
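As an illustration of why two helpers are convenient, here is a small userspace model. It is a sketch under assumptions: `demo_msg`, `demo_msi_cap`, `demo_write_msg_msi()` and `demo_write_msg_msix()` are made-up names, plain memory stands in for config space and the MSI-X table, and masking is ignored. It relies only on the general layout that an MSI message sits in a capability structure with a 16-bit data field, while an MSI-X table entry is four 32-bit words (address low, address high, data, vector control).

```c
#include <assert.h>
#include <stdint.h>

struct demo_msg {
	uint32_t address_lo;
	uint32_t address_hi;
	uint32_t data;
};

/* MSI: the message lives in the device's capability structure. */
struct demo_msi_cap {
	uint32_t address_lo;
	uint32_t address_hi;
	uint16_t data;
};

static void demo_write_msg_msi(struct demo_msi_cap *cap,
			       const struct demo_msg *msg)
{
	assert(cap && msg);
	cap->address_lo = msg->address_lo;
	cap->address_hi = msg->address_hi;
	cap->data = (uint16_t)msg->data; /* MSI data is 16 bits wide */
}

/* MSI-X: the message lives in a table entry selected by index. */
static void demo_write_msg_msix(uint32_t *table, unsigned int index,
				const struct demo_msg *msg)
{
	uint32_t *entry = table + index * 4u; /* 4 words per entry */

	entry[0] = msg->address_lo;
	entry[1] = msg->address_hi;
	entry[2] = msg->data;
	/* entry[3], vector control, is left untouched in this sketch */
}
```

With separate helpers, an MSI domain and an MSI-X domain can each reference only the writer they need.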
2022-12-05Merge branch 'for-6.2/cxl-aer' into for-6.2/cxlDan Williams1-1/+7
Pick up CXL AER handling and correctable error extensions. Resolve conflicts with cxl_pmem_wq reworks and RCH support.
2022-12-05PCI/MSI: Use msi_domain_alloc/free_irqs_all_locked()Thomas Gleixner1-2/+2
Switch to the new domain id aware interfaces to phase out the previous ones. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-12-05genirq/msi: Rename msi_add_msi_desc() to msi_insert_msi_desc()Thomas Gleixner1-2/+2
This reflects the functionality better. No functional change. Signed-off-by: Thomas Gleixner <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Reviewed-by: Kevin Tian <[email protected]> Acked-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected]
2022-12-05PCI/MSI: Use bullet lists in kernel-doc comments of api.cBagas Sanjaya1-14/+19
Use bullet-list RST syntax for kernel-doc parameters' flags and interrupt mode descriptions. Otherwise Sphinx produces "Unexpected indentation" errors and warnings. Fixes: 5c0997dc33ac24 ("PCI/MSI: Move pci_alloc_irq_vectors() to api.c") Fixes: 017239c8db2093 ("PCI/MSI: Move pci_irq_vector() to api.c") Fixes: be37b8428b7b77 ("PCI/MSI: Move pci_irq_get_affinity() to api.c") Reported-by: Stephen Rothwell <[email protected]> Suggested-by: Ahmed S. Darwish <[email protected]> Signed-off-by: Bagas Sanjaya <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Acked-by: Ahmed S. Darwish <[email protected]> Link: https://lore.kernel.org/r/[email protected]
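For readers unfamiliar with the convention, a kernel-doc comment using RST bullet lists might look like the sketch below. The function `demo_pick_mode()` and the `DEMO_IRQ_*` flags are invented for illustration and are not taken from api.c; only the comment layout (blank comment line before the list, one `*` bullet per item) is the point.

```c
#include <assert.h>

#define DEMO_IRQ_MSIX 0x1u /* hypothetical flag values */
#define DEMO_IRQ_MSI  0x2u

/**
 * demo_pick_mode() - Pick an interrupt mode from a flags mask
 * @flags: One or more of:
 *
 *         * %DEMO_IRQ_MSIX  Allow trying MSI-X
 *         * %DEMO_IRQ_MSI   Allow trying MSI
 *
 * Return: the first permitted mode, checked in this order:
 *
 * * %DEMO_IRQ_MSIX if allowed by @flags
 * * %DEMO_IRQ_MSI if allowed by @flags
 * * 0 if neither is allowed
 */
static unsigned int demo_pick_mode(unsigned int flags)
{
	if (flags & DEMO_IRQ_MSIX)
		return DEMO_IRQ_MSIX;
	if (flags & DEMO_IRQ_MSI)
		return DEMO_IRQ_MSI;
	return 0;
}
```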
2022-12-03PCI/AER: Add optional logging callback for correctable errorDave Jiang1-1/+7
Some new devices such as CXL devices may want to record additional error information on a corrected error. Add a callback to allow the PCI device driver to do additional logging such as providing additional stats for user space RAS monitoring. For CXL devices this is actually required, because CXL needs to write to the CXL RAS capability structure correctable error status register in order to clear the unmasked correctable errors. See CXL spec rev3.0 8.2.4.16. Suggested-by: Jonathan Cameron <[email protected]> Reviewed-by: Kuppuswamy Sathyanarayanan <[email protected]> Reviewed-by: Jonathan Cameron <[email protected]> Acked-by: Bjorn Helgaas <[email protected]> Signed-off-by: Dave Jiang <[email protected]> Link: https://lore.kernel.org/r/166984619233.2804404.3966368388544312674.stgit@djiang5-desk3.ch.intel.com Signed-off-by: Dan Williams <[email protected]>
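The optional-callback pattern described here can be sketched in plain C. Everything below is a hypothetical model (`demo_pci_dev`, `demo_err_handlers`, `cor_error_logged`, `demo_cxl_log()`), not the kernel's structures: the core path always does its own accounting and calls into the driver only when a handler was supplied.

```c
#include <assert.h>
#include <stddef.h>

struct demo_pci_dev;

struct demo_err_handlers {
	/* Optional: drivers that need no extra logging leave this NULL. */
	void (*cor_error_logged)(struct demo_pci_dev *dev);
};

struct demo_pci_dev {
	const struct demo_err_handlers *handlers; /* may be NULL */
	unsigned int cor_errors_seen;  /* counted by the core path */
	unsigned int driver_log_calls; /* counted by the driver callback */
};

/* Core path for a corrected error: account for it, then give the
 * driver a chance to log or clear device specific status. */
static void demo_handle_correctable(struct demo_pci_dev *dev)
{
	assert(dev);
	dev->cor_errors_seen++;
	if (dev->handlers && dev->handlers->cor_error_logged)
		dev->handlers->cor_error_logged(dev);
}

/* Stand-in for a CXL-like driver callback, which in a real driver
 * would read and clear the device's correctable error status. */
static void demo_cxl_log(struct demo_pci_dev *dev)
{
	dev->driver_log_calls++;
}
```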
2022-12-02Merge tag 'v6.1-rc7' into iommufd.git for-nextJason Gunthorpe1-20/+90
Resolve conflicts in drivers/vfio/vfio_main.c by using the iommufd version. The rc fix was done a different way when iommufd patches reworked this code. Signed-off-by: Jason Gunthorpe <[email protected]>
2022-11-28PCI: hv: update comment in x86 specific hv_arch_irq_unmaskOlaf Hering1-3/+3
The function hv_set_affinity was removed in commit 831c1ae7 ("PCI: hv: Make the code arch neutral by adding arch specific interfaces"). Signed-off-by: Olaf Hering <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Wei Liu <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warningFrank Li1-4/+4
pci-epf-vntb.c:1128:33: sparse: expected void [noderef] __iomem *base pci-epf-vntb.c:1128:33: sparse: got struct epf_ntb_ctrl *reg Add __iomem type cast in vntb_epf_peer_spad_read() and vntb_epf_peer_spad_write(). Link: https://lore.kernel.org/r/[email protected] Reported-by: kernel test robot <[email protected]> Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Manivannan Sadhasivam <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_dbFrank Li1-6/+4
Use epf_db[i] dereference instead of readl() because epf_db is in memory allocated by dma_alloc_coherent(), not I/O. Remove useless/duplicated readl() in the process. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32)Frank Li1-12/+12
NTB spad entry item size is sizeof(u32), replace hardcoded 4 with it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Manivannan Sadhasivam <[email protected]>
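A tiny sketch of the idea, assuming a scratchpad area laid out as consecutive 32-bit entries; `demo_spad_offset()` and its parameters are hypothetical and not the driver's code. Tying the stride to the entry type keeps the arithmetic self-documenting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32; /* userspace stand-in for the kernel's u32 */

/* Byte offset of scratchpad entry idx, given the byte offset of the
 * scratchpad area. The stride follows the entry type, not a literal 4. */
static size_t demo_spad_offset(size_t spad_base, unsigned int idx)
{
	return spad_base + (size_t)idx * sizeof(u32);
}
```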
2022-11-23PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct memberFrank Li1-1/+0
epf_db_phy member in struct epf_ntb is not used, remove it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Manivannan Sadhasivam <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error pathFrank Li1-1/+1
Replace pci_epc_mem_free_addr() with pci_epf_free_space() in the error handling path to match pci_epf_alloc_space(). Link: https://lore.kernel.org/r/[email protected] Fixes: e35f56bb0330 ("PCI: endpoint: Support NTB transfer between RC and EP") Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Fix struct epf_ntb_ctrl indentationFrank Li1-14/+14
Align the indentation of struct epf_ntb_ctrl with other structs in the driver. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]>
2022-11-23PCI: endpoint: pci-epf-vntb: Clean up kernel_doc warningFrank Li1-29/+54
Clean up warnings found by scripts/kernel-doc. Consolidate terms: - host, host1 to HOST - vhost, vHost, Vhost, VHOST2 to VHOST Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Frank Li <[email protected]> Signed-off-by: Lorenzo Pieralisi <[email protected]> Acked-by: Manivannan Sadhasivam <[email protected]>