Age | Commit message (Collapse) | Author | Files | Lines |
|
- Make ibmphp read-only arrays static instead of putting them on the stack
(Colin Ian King)
* pci/hotplug:
PCI: ibmphp: Make read-only arrays static
|
|
Add support for voting interconnect (ICC) bandwidth based
on the link speed and width.
This commit is inspired from the basic interconnect support added
to pcie-qcom driver in commit c4860af88d0c ("PCI: qcom: Add basic
interconnect support").
The interconnect support is kept optional to be backward compatible
with legacy device trees.
[kwilczynski: add missing kernel-doc for the icc_mem variable]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Krishna chaitanya chundru <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
|
|
Sometimes, the Qcom PCIe EP controller can receive some interrupts
unknown to the driver, like safety interrupts in newer SoCs. In those
cases, if the driver doesn't clear the interrupts, it will end up in an
interrupt storm. However, the users will not know about it because the
log is treated as a debug message.
So let's treat the unknown event log as an error so that it at least
makes the user aware, thereby getting fixed eventually.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Add missing kernel-doc for pci_epc_mem_init() API.
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
|
|
For transfers below 4K, let's use iATU since using eDMA for such small
transfers is inefficient.
This is mainly because setting up an eDMA transfer and waiting for
completion adds some latency. This latency is negligible for large
transfers but not for the smaller ones.
With using iATU, there is an increase in ~50Mbps throughput on both MHI
UL (Uplink) and DL (Downlink) channels.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Add support for Qualcomm Snapdragon SM8450 SoC to the EPF driver. SM8450
has the dedicated PID (0x0306) and supports eDMA. Currently, it has no
fixed PCI class, so it is being advertised as "PCI_CLASS_OTHERS".
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Add support for Embedded DMA (eDMA) available in the DesignWare PCIe IP
to transfer the MHI buffers between the host and the endpoint. The eDMA
use helps achieve greater throughput as the transfers are offloaded from
CPUs.
For differentiating the iATU and eDMA APIs, the pci_epf_mhi_{read/write}
APIs are renamed to pci_epf_mhi_iatu_{read/write} and separate eDMA
specific APIs pci_epf_mhi_edma_{read/write} are introduced.
Platforms that require eDMA support can pass the MHI_EPF_USE_DMA flag
through pci_epf_mhi_ep_info.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Qualcomm PCIe Endpoint controllers have the in-built Embedded DMA (eDMA)
peripheral for offloading the data transfer between the PCIe bus and
memory.
Let's add support for it by enabling the eDMA IRQ in the driver. The
eDMA DMA Engine driver will handle the rest of the functionality.
Since the eDMA on Qualcomm platforms only uses a single IRQ for all
channels, use 1 for edma.nr_irqs.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Instead of hardcoding the alignment restriction in the EPF_MHI driver, make
use of the info available from the EPF core that reflects the alignment
restriction of the endpoint controller.
For this purpose, let's introduce the get_align_offset() static function.
[kwilczynski: update get_align_offset() to avoid issues on 32-bit architectures]
Link: https://lore.kernel.org/linux-pci/[email protected]
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
For a device with no Power Management Capability, pci_power_up() previously
returned 0 (success) if the platform was able to put the device in D0,
which led to pci_set_full_power_state() trying to read PCI_PM_CTRL, even
though it doesn't exist.
Since dev->pm_cap == 0 in this case, pci_set_full_power_state() actually
read the wrong register, interpreted it as PCI_PM_CTRL, and corrupted
dev->current_state. This led to messages like this in some cases:
pci 0000:01:00.0: Refused to change power state from D3hot to D0
To prevent this, make pci_power_up() always return a negative failure code
if the device lacks a Power Management Capability, even if non-PCI platform
power management has been able to put the device in D0. The failure will
prevent pci_set_full_power_state() from trying to access PCI_PM_CTRL.
Fixes: e200904b275c ("PCI/PM: Split pci_power_up()")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Feiyang Chen <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: "Rafael J. Wysocki" <[email protected]>
Cc: [email protected] # v5.19+
|
|
Add support for sa8775p SoC that uses controller version 5.90
reusing the 1.9.0 config.
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Mrinmay Sarkar <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
Reviewed-by: Bjorn Andersson <[email protected]>
Reviewed-by: Manivannan Sadhasivam <[email protected]>
|
|
Qcom PCIe EP controllers have 4K alignment restriction for the outbound
window address. Hence, pass this info to the EPF core so that the EPF
drivers can make use of this info.
Link: https://lore.kernel.org/linux-pci/[email protected]
Signed-off-by: Manivannan Sadhasivam <[email protected]>
Signed-off-by: Krzysztof Wilczyński <[email protected]>
|
|
Return early for errors in pcie_capability_clear_and_set_word_unlocked()
and pcie_capability_clear_and_set_dword() to simplify the control flow.
No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Update config space save/restore debug messages so they line up better.
Previously:
nvme 0000:05:00.0: saving config space at offset 0x4 (reading 0x20100006)
nvme 0000:05:00.0: saving config space at offset 0x8 (reading 0x1080200)
nvme 0000:05:00.0: saving config space at offset 0xc (reading 0x0)
nvme 0000:05:00.0: restoring config space at offset 0x4 (was 0x0, writing 0x20100006)
Now:
nvme 0000:05:00.0: save config 0x04: 0x20100006
nvme 0000:05:00.0: save config 0x08: 0x01080200
nvme 0000:05:00.0: save config 0x0c: 0x00000000
nvme 0000:05:00.0: restore config 0x04: 0x00000000 -> 0x20100006
No functional change intended. Enable these messages by setting
CONFIG_DYNAMIC_DEBUG=y and adding 'dyndbg="file drivers/pci/* +p"'
to kernel parameters.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
|
|
Remove unnecessary "return;" in void functions and format consistently.
No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Fix typos in docs and comments.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Randy Dunlap <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Fix typos in the pci_bus_resetable() and pci_slot_resetable() function
names. No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Simplify pci_dev_driver() by removing the "else". The "if" case always
returns, so the "else" is superfluous. No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Simplify pci_pio_to_address() by removing an unnecessary local variable.
No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
ACPI Platform Error Interfaces (APEI) convey error information to the OS.
If the APEI GHES driver receives information about PCI errors, it queues it
in aer_recover_ring for processing by the PCI AER code.
AER_RECOVER_RING_SIZE is the size of the aer_recover_ring FIFO and is
arbitrary, with no direct connection to hardware.
AER_RECOVER_RING_ORDER was only used to compute AER_RECOVER_RING_SIZE.
Remove it and define AER_RECOVER_RING_SIZE directly. No functional change
intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
We used u8, u16, and u32 for get_user() pointer types, but "unsigned char",
"unsigned short", and "unsigned int" for put_user().
Use u8, u16, and u32 for put_user() for consistency. No functional change
intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
Previously we used "%#08x" to print a 32-bit value. This fills an
8-character field with "0x...", but of course many 32-bit values require a
10-character field "0x12345678" for this format. Fix the formats to avoid
confusion.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
We always assign "fields" immediately, so remove the unnecessary
initializations. No functional change intended.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
pcie_port_bus_type is used only in pci-driver.c and pcie/portdrv_core.c and
pcie/portdrv_pci.c. None of these can be built as modules, so
pcie_port_bus_type doesn't need to be exported. Unexport it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
The busn member of struct mvebu_pcie is unused, so drop it.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Pali Rohár <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
Cc: Thomas Petazzoni <[email protected]>
|
|
The following declarations have never been implemented since the beginning
of git history, so remove them:
u8 acpiphp_get_attention_status(struct acpiphp_slot *slot);
u8 cpci_get_latch_status(struct slot *slot);
u8 cpci_get_adapter_status(struct slot *slot);
int ibmphp_get_total_hp_slots(void);
void ibmphp_free_ibm_slot(struct slot *);
void pdev_enable_device(struct pci_dev *);
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Yue Haibing <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Fix typos, rewrap to fill 78 columns, convert to conventional multi-line
style.
[bhelgaas: squash and add more fixes]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
A comment says that Multi-MSI is not supported by the driver.
A past commit [1] added this feature, so the comment is
incorrect and is removed.
[1] commit 198acab1772f22f2 ("PCI: brcmstb: Enable Multi-MSI")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jim Quinlan <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Florian Fainelli <[email protected]>
|
|
The current PCIe driver assumes PERST# is asserted when probe() is invoked.
Some older versions of the 2711/RPi bootloader left PERST# unasserted, as
the Raspian OS does assert PERST# on probe(). For this reason, we assert
PERST# for BCM2711 SOCs (i.e. RPi).
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jim Quinlan <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
|
|
Add PME_Turn_off/PME_TO_Ack handshake sequence for ls1028a platform.
Implemented on top of common dwc dw_pcie_suspend(resume)_noirq()
functions to handle system enter/exit suspend states.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Hou Zhiqiang <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Manivannan Sadhasivam <[email protected]>
|
|
Introduce an helper function (dw_pcie_get_ltssm()) to retrieve
SMLH_LTSS_STATE.
Add common dw_pcie_suspend(resume)_noirq() API to implement the DWC
controller generic suspend/resume functionality.
Add a controller specific callback to send the PME_Turn_Off message
(ie .pme_turn_off) for controller platform specific PME handling.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Frank Li <[email protected]>
[[email protected]: commit log]
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Manivannan Sadhasivam <[email protected]>
|
|
Add the PCIE_PME_TO_L2_TIMEOUT_US macro to define the L2 ready timeout
as described in the PCI specifications.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Manivannan Sadhasivam <[email protected]>
|
|
The endpoint controller loses the Maximum Link Width and Supported Link Speed
value from the Link Capabilities Register - initially configured by the Reset
Configuration Word (RCW) - during a link-down or hot reset event.
Address this issue in the endpoint event handler.
Link: https://lore.kernel.org/r/[email protected]
Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support")
Signed-off-by: Xiaowei Bao <[email protected]>
Signed-off-by: Hou Zhiqiang <[email protected]>
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Manivannan Sadhasivam <[email protected]>
|
|
Add support to pass link-down notification to the endpoint function
driver so that it can process the LINK_DOWN event.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Frank Li <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Acked-by: Manivannan Sadhasivam <[email protected]>
|
|
Reorganize vga_client_register() to avoid the goto and the need to save the
return value. Update the kernel-doc to reflect -ENODEV on failure. No
functional change intended.
[bhelgaas: drop "ret" variable, commit log]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
In vga_arbiter_notify_clients(), "new_state" was computed during every loop
iteration even though it doesn't depend on anything that changes during the
loop. Move the computation outside the loop.
[bhelgaas: drop renames that obscure the purpose, commit log]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Previously vga_update_device_decodes() took "int new_decodes", but the
callers pass "unsigned int new_decodes". Correct the
vga_update_device_decodes() parameter type to "unsigned int" to match.
In vga_arbiter_notify_clients(), the return from vgadev->set_decode() is
"unsigned int" but was stored as "uint32_t new_decodes". Correct the
new_decodes type to "unsigned int".
[bhelgaas: use correct type for ->set_decode() return, commit log]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Previously vga_str_to_iostate() took "int *io_state", but vga_arb_write()
is the only caller and it passes "unsigned int *". Make the
vga_str_to_iostate() parameter type "unsigned int *" to match.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Sui Jingfeng <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Andi Shyti <[email protected]>
Reviewed-by: Ilpo Järvinen <[email protected]>
|
|
The iMSI-RX module of the DW PCIe controller provides multiple sets of
MSI_CTRL_INT_i_* registers, and each set is capable of handling 32 MSI
interrupts. However, the fu740 PCIe controller driver only enabled one set
of MSI_CTRL_INT_i_* registers, as the total number of supported interrupts
was not specified.
Set the supported number of MSI vectors to enable all the MSI_CTRL_INT_i_*
registers on the fu740 PCIe core, allowing the system to fully utilize the
available MSI interrupts.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Yong-Xuan Wang <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: Serge Semin <[email protected]>
|
|
pci_dt_testdrv is bound to QEMU PCI Test Device. It reads
overlay_pci_node fdt fragment and apply it to Test Device. Then it
calls of_platform_default_populate() to populate the platform
devices.
Tested-by: Herve Codina <[email protected]>
Signed-off-by: Lizhi Hou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
The Xilinx Alveo U50 PCI card exposes multiple hardware peripherals on
its PCI BAR. The card firmware provides a flattened device tree to
describe the hardware peripherals on its BARs. This allows U50 driver to
load the flattened device tree and generate the device tree node for
hardware peripherals underneath.
To generate device tree node for U50 card, add PCI quirks to call
of_pci_make_dev_node() for U50.
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Lizhi Hou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
The PCI endpoint device such as Xilinx Alveo PCI card maps the register
spaces from multiple hardware peripherals to its PCI BAR. Normally,
the PCI core discovers devices and BARs using the PCI enumeration process.
There is no infrastructure to discover the hardware peripherals that are
present in a PCI device, and which can be accessed through the PCI BARs.
Apparently, the device tree framework requires a device tree node for the
PCI device. Thus, it can generate the device tree nodes for hardware
peripherals underneath. Because PCI is self discoverable bus, there might
not be a device tree node created for PCI devices. Furthermore, if the PCI
device is hot pluggable, when it is plugged in, the device tree nodes for
its parent bridges are required. Add support to generate device tree node
for PCI bridges.
Add an of_pci_make_dev_node() interface that can be used to create device
tree node for PCI devices.
Add a PCI_DYNAMIC_OF_NODES config option. When the option is turned on,
the kernel will generate device tree nodes for PCI bridges unconditionally.
Initially, add the basic properties for the dynamically generated device
tree nodes which include #address-cells, #size-cells, device_type,
compatible, ranges, reg.
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Lizhi Hou <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Rob Herring <[email protected]>
|
|
When a Linux VM with an assigned PCI device runs on Hyper-V, if the PCI
device driver is not loaded yet (i.e. MSI-X/MSI is not enabled on the
device yet), doing a VM hibernation triggers a panic in
hv_pci_restore_msi_msg() -> msi_lock_descs(&pdev->dev), because
pdev->dev.msi.data is still NULL.
Avoid the panic by checking if MSI-X/MSI is enabled.
Link: https://lore.kernel.org/r/[email protected]
Fixes: dc2b453290c4 ("PCI: hv: Rework MSI handling")
Signed-off-by: Dexuan Cui <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
Reviewed-by: [email protected]
Reviewed-by: Michael Kelley <[email protected]>
Cc: [email protected]
|
|
During domain reset process vmd_domain_reset() clears PCI
configuration space of VMD root ports. But certain platform
has observed following errors and failed to boot.
...
DMAR: VT-d detected Invalidation Queue Error: Reason f
DMAR: VT-d detected Invalidation Time-out Error: SID ffff
DMAR: VT-d detected Invalidation Completion Error: SID ffff
DMAR: QI HEAD: UNKNOWN qw0 = 0x0, qw1 = 0x0
DMAR: QI PRIOR: UNKNOWN qw0 = 0x0, qw1 = 0x0
DMAR: Invalidation Time-out Error (ITE) cleared
The root cause is that memset_io() clears prefetchable memory base/limit
registers and prefetchable base/limit 32 bits registers sequentially.
This seems to be enabling prefetchable memory if the device disabled
prefetchable memory originally.
Here is an example (before memset_io()):
PCI configuration space for 10000:00:00.0:
86 80 30 20 06 00 10 00 04 00 04 06 00 00 01 00
00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 20
00 00 00 00 01 00 01 00 ff ff ff ff 75 05 00 00
...
So, prefetchable memory is ffffffff00000000-575000fffff, which is
disabled. When memset_io() clears prefetchable base 32 bits register,
the prefetchable memory becomes 0000000000000000-575000fffff, which is
enabled and incorrect.
Here is the quote from section 7.5.1.3.9 of PCI Express Base 6.0 spec:
The Prefetchable Memory Limit register must be programmed to a smaller
value than the Prefetchable Memory Base register if there is no
prefetchable memory on the secondary side of the bridge.
This is believed to be the reason for the failure and in addition the
sequence of operation in vmd_domain_reset() is not following the PCIe
specs.
Disable the bridge window by executing a sequence of operations
borrowed from pci_disable_bridge_window() and pci_setup_bridge_io(),
that comply with the PCI specifications.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Nirmal Patel <[email protected]>
Signed-off-by: Lorenzo Pieralisi <[email protected]>
|
|
When certain PHB HW failure causes pHyp to recover PHB, it marks the PE
state as temporarily unavailable until recovery is complete. This also
triggers an EEH handler in Linux which needs to notify drivers, and perform
recovery. But before notifying the driver about the PCI error it uses
get_adapter_status()->rpaphp_get_sensor_state()->rtas_call(get-sensor-state)
operation of the hotplug_slot to determine if the slot contains a device or
not. If the slot is empty, the recovery is skipped entirely.
eeh_event_handler()
->eeh_handle_normal_event()
->eeh_slot_presence_check()
->get_adapter_status()
->rpaphp_get_sensor_state()
->rtas_get_sensor()
->rtas_call(get-sensor-state)
However on certain PHB failures, the RTAS call rtas_call(get-sensor-state)
returns extended busy error (9902) until PHB is recovered by pHyp. Once PHB
is recovered, the rtas_call(get-sensor-state) returns success with correct
presence status. The RTAS call interface rtas_get_sensor() loops over the
RTAS call on extended delay return code (9902) until the return value is
either success (0) or error (-1). This causes the EEH handler to get stuck
for ~6 seconds before it could notify that the PCI error has been detected
and stop any active operations. Hence with running I/O traffic, during this
6 seconds, the network driver continues its operation and hits a timeout
(netdev watchdog).
------------
[52732.244731] DEBUG: ibm_read_slot_reset_state2()
[52732.244762] DEBUG: ret = 0, rets[0]=5, rets[1]=1, rets[2]=4000, rets[3]=>
[52732.244798] DEBUG: in eeh_slot_presence_check
[52732.244804] DEBUG: error state check
[52732.244807] DEBUG: Is slot hotpluggable
[52732.244810] DEBUG: hotpluggable ops ?
[52732.244953] DEBUG: Calling ops->get_adapter_status
[52732.244958] DEBUG: calling rpaphp_get_sensor_state
[52736.564262] ------------[ cut here ]------------
[52736.564299] NETDEV WATCHDOG: enP64p1s0f3 (tg3): transmit queue 0 timed o>
[52736.564324] WARNING: CPU: 1442 PID: 0 at net/sched/sch_generic.c:478 dev>
[...]
[52736.564505] NIP [c000000000c32368] dev_watchdog+0x438/0x440
[52736.564513] LR [c000000000c32364] dev_watchdog+0x434/0x440
------------
On timeouts, network driver starts dumping debug information to console
(e.g bnx2 driver calls bnx2x_panic_dump()), and go into recovery path while
pHyp is still recovering the PHB. As part of recovery, the driver tries to
reset the device and it keeps failing since every PCI read/write returns
ff's. And when EEH recovery kicks-in, the driver is unable to recover the
device. This impacts the ssh connection and leads to the system being
inaccessible. To get the NIC working again it needs a reboot or re-assign
the I/O adapter from HMC.
[ 9531.168587] EEH: Beginning: 'slot_reset'
[ 9531.168601] PCI 0013:01:00.0#10000: EEH: Invoking bnx2x->slot_reset()
[...]
[ 9614.110094] bnx2x: [bnx2x_func_stop:9129(enP19p1s0f0)]FUNC_STOP ramrod failed. Running a dry transaction
[ 9614.110300] bnx2x: [bnx2x_igu_int_disable:902(enP19p1s0f0)]BUG! Proper val not read from IGU!
[ 9629.178067] bnx2x: [bnx2x_fw_command:3055(enP19p1s0f0)]FW failed to respond!
[ 9629.178085] bnx2x 0013:01:00.0 enP19p1s0f0: bc 7.10.4
[ 9629.178091] bnx2x: [bnx2x_fw_dump_lvl:789(enP19p1s0f0)]Cannot dump MCP info while in PCI error
[ 9644.241813] bnx2x: [bnx2x_io_slot_reset:14245(enP19p1s0f0)]IO slot reset --> driver unload
[...]
[ 9644.241819] PCI 0013:01:00.0#10000: EEH: bnx2x driver reports: 'disconnect'
[ 9644.241823] PCI 0013:01:00.1#10000: EEH: Invoking bnx2x->slot_reset()
[ 9644.241827] bnx2x: [bnx2x_io_slot_reset:14229(enP19p1s0f1)]IO slot reset initializing...
[ 9644.241916] bnx2x 0013:01:00.1: enabling device (0140 -> 0142)
[ 9644.258604] bnx2x: [bnx2x_io_slot_reset:14245(enP19p1s0f1)]IO slot reset --> driver unload
[ 9644.258612] PCI 0013:01:00.1#10000: EEH: bnx2x driver reports: 'disconnect'
[ 9644.258615] EEH: Finished:'slot_reset' with aggregate recovery state:'disconnect'
[ 9644.258620] EEH: Unable to recover from failure from PHB#13-PE#10000.
[ 9644.261811] EEH: Beginning: 'error_detected(permanent failure)'
[...]
[ 9644.261823] EEH: Finished:'error_detected(permanent failure)'
Hence, it becomes important to inform driver about the PCI error detection
as early as possible, so that driver is aware of PCI error and waits for
EEH handler's next action for successful recovery.
Current implementation uses rtas_get_sensor() API which blocks the slot
check state until RTAS call returns success. To avoid this, fix the PCI
hotplug driver (rpaphp) to return an error (-EBUSY) if the slot presence
state can not be detected immediately while PE is in EEH recovery state.
Change rpaphp_get_sensor_state() to invoke rtas_call(get-sensor-state)
directly only if the respective PE is in EEH recovery state, and take
actions based on RTAS return status. This way EEH handler will not be
blocked on rpaphp_get_sensor_state() and can immediately notify driver
about the PCI error and stop any active operations.
In normal cases (non-EEH case) rpaphp_get_sensor_state() will continue to
invoke rtas_get_sensor() as it was earlier with no change in existing
behavior.
Signed-off-by: Mahesh Salgaonkar <[email protected]>
Reviewed-by: Nathan Lynch <[email protected]>
Acked-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Michael Ellerman <[email protected]>
Link: https://msgid.link/169235815601.193557.13989873835811325343.stgit@jupiter
|
|
When we have a struct pci_dev *, use pci_dev_id() instead of manually
composing the ID from dev->bus->number and dev->devfn.
[bhelgaas: commit log]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Zheng Zengkai <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Reviewed-by: Logan Gunthorpe <[email protected]>
|
|
Testing that a device is not currently in a low power state provides no
guarantees that the device is not imminently transitioning to such a state.
Increment the PM usage counter before accessing the device. Since we don't
wish to wake the device for PME polling, do so only if the device is
already active by using pm_runtime_get_if_active().
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
Unlike default access to config space through sysfs, the VPD read and write
functions don't actively manage the runtime power management state of the
device during access. Since commit 7ab5e10eda02 ("vfio/pci: Move the
unused device into low power state with runtime PM"), the vfio-pci driver
will use runtime power management and release unused devices to make use of
low power states. Attempting to access VPD information in D3cold can
result in incorrect information or kernel crashes depending on the system
behavior.
Wrap the VPD read/write bin attribute handlers in runtime PM and take into
account the potential quirk to select the correct device to wake.
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
[bhelgaas: tweak pci_dev_put() test to match the pci_get_func0_dev() test]
Signed-off-by: Bjorn Helgaas <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:
- Add Manivannan Sadhasivam as DesignWare PCIe driver co-maintainer
(Krzysztof Wilczyński)
- Revert "PCI: dwc: Wait for link up only if link is started" to fix a
regression on Qualcomm platforms that don't reach interconnect sync
state if the slot is empty (Johan Hovold)
- Revert "PCI: mvebu: Mark driver as BROKEN" so people can use
pci-mvebu even though some others report problems (Bjorn Helgaas)
- Avoid a NULL pointer dereference when using acpiphp for root bus
hotplug to fix a regression added in v6.5-rc1 (Igor Mammedov)
* tag 'pci-v6.5-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus
Revert "PCI: mvebu: Mark driver as BROKEN"
Revert "PCI: dwc: Wait for link up only if link is started"
MAINTAINERS: Add Manivannan Sadhasivam as DesignWare PCIe driver maintainer
|
|
Don't assume that the device is fully under the control of ASPM and use RMW
capability accessors which do proper locking to avoid losing concurrent
updates to the register values.
If configuration fails in pcie_aspm_configure_common_clock(), the
function attempts to restore the old PCI_EXP_LNKCTL_CCC settings. Store
only the old PCI_EXP_LNKCTL_CCC bit for the relevant devices rather
than the content of the whole LNKCTL registers. It aligns better with
how pcie_lnkctl_clear_and_set() expects its parameter and makes the
code more obvious to understand.
Suggested-by: Lukas Wunner <[email protected]>
Fixes: 2a42d9dba784 ("PCIe: ASPM: Break out of endless loop waiting for PCI config bits to switch")
Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Ilpo Järvinen <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
Acked-by: "Rafael J. Wysocki" <[email protected]>
|