Age | Commit message (Collapse) | Author | Files | Lines |
|
priv->cbs is an array of priv->info->num_cbs_shapers elements of type
struct sja1105_cbs_entry which only get allocated if CONFIG_NET_SCH_CBS
is enabled.
However, sja1105_reload_cbs() is called from sja1105_static_config_reload()
which in turn is called for any of the items in sja1105_reset_reasons,
therefore during the normal runtime of the driver and not just from a
code path which can be triggered by the tc-cbs offload.
The sja1105_reload_cbs() function does not contain a check whether the
priv->cbs array is NULL or not, it just assumes it isn't and proceeds to
iterate through the credit-based shaper elements. This leads to a NULL
pointer dereference.
The solution is to return success if the priv->cbs array has not been
allocated, since sja1105_reload_cbs() has nothing to do.
Fixes: 4d7525085a9b ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Simply get a pointer to the data in the register payload instead of
copying it to a temporary buffer.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:
====================
pull-request: ieee802154 for net 2021-06-24
An update from ieee802154 for your *net* tree.
This time we only have fixes for ieee802154 hwsim driver.
Sparked from some syzcaller reports We got a potential
crash fix from Eric Dumazet and two memory leak fixes from
Dongliang Mu.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
https://patchwork.kernel.org/project/netdevbpf/list/?series=506637&state=*
- Remove unused variable
- Use correct integer type for string formatting.
- Remove `inline` in C files
Fixes: 9c1a59a2f4bc ("gve: DQO: Add ring allocation and initialization")
Fixes: a57e5de476be ("gve: DQO: Add TX path")
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Complete to commit def4ec6dce393e ("e1000e: PCIm function state support")
Check the PCIm state only on CSME systems. There is no point to do this
check on non CSME systems.
This patch fixes a generation a false-positive warning:
"Error in exiting dmoff"
Fixes: def4ec6dce39 ("e1000e: PCIm function state support")
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Dvora Fuxbrumer <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Mention support for the SJA1110 in menuconfig.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2021-06-24
This series contains updates to i40e driver only.
Dinghao Liu corrects error handling for failed i40e_vsi_request_irq()
call.
Mateusz allows for disabling of autonegotiation for all BaseT media.
Jesse corrects the multiplier being used on 5Gb speeds for PTP.
Jan adds locking in paths calling i40e_setup_pf_switch() that were
missing it.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The RX queue has an array of `gve_rx_buf_state_dqo` objects. All
allocated pages have an associated buf_state object. When a buffer is
posted on the RX buffer queue, the buffer ID will be the buf_state's
index into the RX queue's array.
On packet reception, the RX queue will have one descriptor for each
buffer associated with a received packet. Each RX descriptor will have
a buffer_id that was posted on the buffer queue.
Notable mentions:
- We use a default buffer size of 2048 bytes. Based on page size, we
may post separate sections of a single page as separate buffers.
- The driver holds an extra reference on pages passed up the receive
path with an skb and keeps these pages on a list. When posting new
buffers to the NIC, we check if any of these pages has only our
reference, or another buffer sized segment of the page has no
references. If so, it is free to reuse. This page recycling approach
is a common netdev optimization that reduces page alloc/free calls.
- Pages in the free list have a page_count bias in order to avoid an
atomic increment of pagecount every time we attempt to reuse a page.
# references = page_count() - bias
- In order to track when a page is safe to reuse, we keep track of the
last offset which had a single SKB reference. When this occurs, it
implies that every single other offset is reusable. Otherwise, we
don't know if offsets can be safely reused.
- We maintain two free lists of pages. List #1 (recycled_buf_states)
contains pages we know can be reused right away. List #2
(used_buf_states) contains pages which cannot be used right away. We
only attempt to get pages from list #2 when list #1 is empty. We only
attempt to use a small fixed number pages from list #2 before giving
up and allocating a new page. Both lists are FIFOs in hope that by the
time we attempt to reuse a page, the references were dropped.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
TX SKBs will have their buffers DMA mapped with the device. Each buffer
will have at least one TX descriptor associated. Each SKB will also have
a metadata descriptor.
Each TX queue maintains an array of `gve_tx_pending_packet_dqo` objects.
Every TX SKB will have an associated pending_packet object. A TX SKB's
descriptors will use its pending_packet's index as the completion tag,
which will be returned on the TX completion queue.
The device implements a "flow-miss model". Most packets will simply
receive a packet completion. The flow-miss system may choose to process
a packet based on its contents. A TX packet which experiences a flow
miss would receive a miss completion followed by a later reinjection
completion. The miss-completion is received when the packet starts to be
processed by the flow-miss system and the reinjection completion is
received when the flow-miss system completes processing the packet and
sends it on the wire.
Notable mentions:
- Buffers may be freed after receiving the miss-completion, but in order
to avoid packet reordering, we do not complete the SKB until receiving
the reinjection completion.
- The driver must robustly handle the unlikely scenario where a miss
completion does not have an associated reinjection completion. This is
accomplished by maintaining a list of packets which have a pending
reinjection completion. After a short timeout (5 seconds), the
SKB and buffers are released and the pending_packet is moved to a
second list which has a longer timeout (60 seconds), where the
pending_packet will not be reused. When the longer timeout elapses,
the driver may assume the reinjection completion would never be
received and the pending_packet may be reused.
- Completion handling is triggered by an interrupt and is done in the
NAPI poll function. Because the TX path and completion exist in
different threading contexts they maintain their own lists for free
pending_packet objects. The TX path uses a lock-free approach to steal
the list from the completion path.
- Both the TSO context and general context descriptors have metadata
bytes. The device requires that if multiple descriptors contain the
same field, each descriptor must have the same value set for that
field.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When interrupts are first enabled, we also set the ratelimits, which
will be static for the entire usage of the device.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Allocate the buffer and completion ring structures. Do not populate the
rings yet. That will happen in the respective rx and tx datapath
follow-on patches
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add napi netdev device registration, interrupt handling and initial tx
and rx polling stubs. The stubs will be filled in follow-on patches.
Also:
- LRO feature advertisement and handling
- Also update ethtool logic
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
DQO queue creation requires additional parameters:
- TX completion/RX buffer queue size
- TX completion/RX buffer queue address
- TX/RX queue size
- RX buffer size
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
- Add new DQO datapath structures:
- `gve_rx_buf_queue_dqo`
- `gve_rx_compl_queue_dqo`
- `gve_rx_buf_state_dqo`
- `gve_tx_desc_dqo`
- `gve_tx_pending_packet_dqo`
- Incorporate these into the existing ring data structures:
- `gve_rx_ring`
- `gve_tx_ring`
Noteworthy mentions:
- `gve_rx_buf_state` represents an RX buffer which was posted to HW.
Each RX queue has an array of these objects and the index into the
array is used as the buffer_id when posted to HW.
- `gve_tx_pending_packet_dqo` is treated similarly for TX queues. The
completion_tag is the index into the array.
- These two structures have links for linked lists which are represented
by 16b indexes into a contiguous array of these structures.
This reduces memory footprint compared to 64b pointers.
- We use unions for the writeable datapath structures to reduce cache
footprint. GQI specific members will renamed like DQO members in a
future patch.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
General description of rings and descriptors:
TX ring is used for sending TX packet buffers to the NIC. It has the
following descriptors:
- `gve_tx_pkt_desc_dqo` - Data buffer descriptor
- `gve_tx_tso_context_desc_dqo` - TSO context descriptor
- `gve_tx_general_context_desc_dqo` - Generic metadata descriptor
Metadata is a collection of 12 bytes. We define `gve_tx_metadata_dqo`
which represents the logical interpetation of the metadata bytes. It's
helpful to define this structure because the metadata bytes exist in
multiple descriptor types (including `gve_tx_tso_context_desc_dqo`),
and the device requires same field has the same value in all
descriptors.
The TX completion ring is used to receive completions from the NIC.
Having a separate ring allows for completions to be out of order. The
completion descriptor `gve_tx_compl_desc` has several different types,
most important are packet and descriptor completions. Descriptor
completions are used to notify the driver when descriptors sent on the
TX ring are done being consumed. The descriptor completion is only used
to signal that space is cleared in the TX ring. A packet completion will
be received when a packet transmitted on the TX queue is done being
transmitted.
In addition there are "miss" and "reinjection" completions. The device
implements a "flow-miss model". Most packets will simply receive a
packet completion. The flow-miss system may choose to process a packet
based on its contents. A TX packet which experiences a flow miss would
receive a miss completion followed by a later reinjection completion.
The miss-completion is received when the packet starts to be processed
by the flow-miss system and the reinjection completion is received when
the flow-miss system completes processing the packet and sends it on the
wire.
The RX buffer ring is used to send buffers to HW via the
`gve_rx_desc_dqo` descriptor.
Received packets are put into the RX queue by the device, which
populates the `gve_rx_compl_desc_dqo` descriptor. The RX descriptors
refer to buffers posted by the buffer queue. Received buffers may be
returned out of order, such as when HW LRO is enabled.
Important concepts:
- "TX" and "RX buffer" queues, which send descriptors to the device, use
MMIO doorbells to notify the device of new descriptors.
- "RX" and "TX completion" queues, which receive descriptors from the
device, use a "generation bit" to know when a descriptor was populated
by the device. The driver initializes all bits with the "current
generation". The device will populate received descriptors with the
"next generation" which is inverted from the current generation. When
the ring wraps, the current/next generation are swapped.
- It's the driver's responsibility to ensure that the RX and TX
completion queues are not overrun. This can be accomplished by
limiting the number of descriptors posted to HW.
- TX packets have a 16 bit completion_tag and RX buffers have a 16 bit
buffer_id. These will be returned on the TX completion and RX queues
respectively to let the driver know which packet/buffer was completed.
Bitfields are used to describe descriptor fields. This notation is more
concise and readable than shift-and-mask. It is possible because the
driver is restricted to little endian platforms.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Unlike GQI, DQO RX descriptors do not contain the L3 and L4 type of the
packet. L3 and L4 types are necessary in order to set the hash and csum
on RX SKBs correctly.
DQO RX descriptors instead contain a 10 bit PTYPE index. The PTYPE map
enables the device to tell the driver how to map from PTYPE index to
L3/L4 type.
The device doesn't provide any guarantees about the range of possible
PTYPEs, so we just use a 1024 entry array to implement a fast mapping
structure.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
- In addition to TX and RX queues, DQO has TX completion and RX buffer
queues.
- TX completions are received when the device has completed sending a
packet on the wire.
- RX buffers are posted on a separate queue form the RX completions.
- DQO descriptor rings are allowed to be smaller than PAGE_SIZE.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The currently supported queue formats are:
- GQI_RDA - GQI with raw DMA addressing
- GQI_QPL - GQI with queue page list
- DQO_RDA - DQO with raw DMA addressing
The old `gve_priv.raw_addressing` value is only used for GQI_RDA, so we
remove it in favor of just checking against GQI_RDA
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The current model uses an integer ID and a fixed size struct for the
parameters of each device option.
The new model allows the device option structs to grow in size over
time. A driver may assume that changes to device option structs will
always be appended.
New device options will also generally have a
`supported_features_mask` so that the driver knows which fields within a
particular device option are enabled.
`gve_device_option.feat_mask` is changed to `required_features_mask`,
and it is a bitmask which must match the value expected by the driver.
This gives the device the ability to break backwards compatibility with
old drivers for certain features by blocking the old drivers from trying
to use the feature.
We maintain ABI compatibility with the old model for
GVE_DEV_OPT_ID_RAW_ADDRESSING in case a driver is using a device which
does not support the new model.
This patch introduces some new terminology:
RDA - Raw DMA Addressing - Buffers associated with SKBs are directly DMA
mapped and read/updated by the device.
QPL - Queue Page Lists - Driver uses bounce buffers which are DMA mapped
with the device for read/write and data is copied from/to SKBs.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Using `page_offset` like a boolean means a page may only be split into
two sections. With page sizes larger than 4k, this can be very wasteful.
Future commits in this patchset use `struct gve_rx_slot_page_info` in a
way which supports a fixed buffer size and a variable page size.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Future use cases will have a different padding value.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
These functions will be shared by the GQI and DQO variants of the GVNIC
driver as of follow-up patches in this series.
Signed-off-by: Bailey Forrest <[email protected]>
Reviewed-by: Willem de Bruijn <[email protected]>
Reviewed-by: Catherine Sullivan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The key length used to store the macsec key was set to MACSEC_KEYID_LEN
(16), which is an issue as:
- This was never meant to be the key length.
- The key length can be > 16.
Fix this by using MACSEC_MAX_KEY_LEN instead (the max length accepted in
uAPI).
Fixes: 27736563ce32 ("net: atlantic: MACSec egress offload implementation")
Fixes: 9ff40a751a6f ("net: atlantic: MACSec ingress offload implementation")
Reported-by: Lior Nahmanson <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The key length used to store the macsec key was set to MACSEC_KEYID_LEN
(16), which is an issue as:
- This was never meant to be the key length.
- The key length can be > 16.
Fix this by using MACSEC_MAX_KEY_LEN instead (the max length accepted in
uAPI).
Fixes: 28c5107aa904 ("net: phy: mscc: macsec support")
Reported-by: Lior Nahmanson <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The key length used when offloading macsec to Ethernet or PHY drivers
was set to MACSEC_KEYID_LEN (16), which is an issue as:
- This was never meant to be the key length.
- The key length can be > 16.
Fix this by using MACSEC_MAX_KEY_LEN to store the key (the max length
accepted in uAPI) and secy->key_len to copy it.
Fixes: 3cf3227a21d1 ("net: macsec: hardware offloading infrastructure")
Reported-by: Lior Nahmanson <[email protected]>
Signed-off-by: Antoine Tenart <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Modify the netdev_dbg content from int to char * in usbnet_defer_kevent(),
this looks more readable.
Signed-off-by: Yajun Deng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds statistic counters for the network interfaces provided
by the driver. It also adds CPU port counters (which are not
exposed by ethtool).
This also adds support for configuring the network interface
parameters via ethtool: speed, duplex, aneg etc.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This configures the Sparx5 calendars according to the bandwidth
requested in the Device Tree nodes.
It also checks if the total requested bandwidth is within the
specs of the detected Sparx5 models limits.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds SwitchDev support by hardware offloading the
software bridge.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds Sparx5 VLAN support.
Sparx5 has more VLAN features than provided here, but these will be added
in later series. For now we only add the basic L2 features.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds the Sparx5 MAC tables: listening for MAC table updates and
updating on request.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This add configuration of the Sparx5 port module instances.
Sparx5 has in total 65 logical ports (denoted D0 to D64) and 33
physical SerDes connections (S0 to S32). The 65th port (D64) is fixed
allocated to SerDes0 (S0). The remaining 64 ports can in various
multiplexing scenarios be connected to the remaining 32 SerDes using
QSGMII, or USGMII or USXGMII extenders. 32 of the ports can have a 1:1
mapping to the 32 SerDes.
Some additional ports (D65 to D69) are internal to the device and do not
connect to port modules or SerDes macros. For example, internal ports are
used for frame injection and extraction to the CPU queues.
The 65 logical ports are split up into the following blocks.
- 13 x 5G ports (D0-D11, D64)
- 32 x 2G5 ports (D16-D47)
- 12 x 10G ports (D12-D15, D48-D55)
- 8 x 25G ports (D56-D63)
Each logical port supports different line speeds, and depending on the
speeds supported, different port modules (MAC+PCS) are needed. A port
supporting 5 Gbps, 10 Gbps, or 25 Gbps as maximum line speed, will have a
DEV5G, DEV10G, or DEV25G module to support the 5 Gbps, 10 Gbps (incl 5
Gbps), or 25 Gbps (including 10 Gbps and 5 Gbps) speeds. As well as, it
will have a shadow DEV2G5 port module to support the lower speeds
(10/100/1000/2500Mbps). When a port needs to operate at lower speed and the
shadow DEV2G5 needs to be connected to its corresponding SerDes
Not all interface modes are supported in this series, but will be added at
a later stage.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds netdevs and phylink support for the ports in the switch.
It also adds register based injection and extraction for these ports.
Frame DMA support for injection and extraction will be added in a later
series.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This adds the Sparx5 basic SwitchDev driver framework with IO range
mapping, switch device detection and core clock configuration.
Support for ports, phylink, netdev, mactable etc. are in the following
patches.
Signed-off-by: Steen Hegelund <[email protected]>
Signed-off-by: Bjarni Jonasson <[email protected]>
Signed-off-by: Lars Povlsen <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2021-06-24
this is a pull request of 2 patches for net/master.
The first patch is by Norbert Slusarek and prevent allocation of
filter for optlen == 0 in the j1939 CAN protocol.
The last patch is by Stephane Grosjean and fixes a potential
starvation in the TX path of the peak_pciefd driver.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Parenthesize a check to be more explicit and to fix a sparse warning
seen on some distros.
Fixes: 91dc5d2553fbf ("ibmvnic: fix miscellaneous checks")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Free tx_pool and clear it, if allocation of tso_pool fails.
release_tx_pools() assumes we have both tx and tso_pools if ->tx_pool is
non-NULL. If allocation of tso_pool fails in init_tx_pools(), the assumption
will not be true and we would end up dereferencing ->tx_buff, ->free_map
fields from a NULL pointer.
Fixes: 3205306c6b8d ("ibmvnic: Update TX pool initialization routine")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
free_long_term_buff() checks ltb->buff to decide whether we have a long
term buffer to free. So set ltb->buff to NULL afer freeing. While here,
also clear ->map_id, fix up some coding style and log an error.
Fixes: 9c4eaabd1bb39 ("Check CRQ command return codes")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This fixes a crash in replenish_rx_pool() when called from ibmvnic_poll()
after a previous call to replenish_rx_pool() encountered an error when
allocating a socket buffer.
Thanks to Rick Lindsley and Dany Madden for helping debug the crash.
Fixes: 4f0b6812e9b9 ("ibmvnic: Introduce batched RX buffer descriptor transmission")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We batch subordinate command response queue (scrq) descriptors that we
need to send to the VIOS using an "indirect" buffer. If after we queue
one or more scrqs in the indirect buffer encounter an error (say fail
to allocate an skb), we leave the queued scrq descriptors in the
indirect buffer until the next call to ibmvnic_xmit().
On the next call to ibmvnic_xmit(), it is possible that the adapter is
going through a reset and it is possible that the long term buffers
have been unmapped on the VIOS side. If we proceed to flush (send) the
packets that are in the indirect buffer, we will end up using the old
map ids and this can cause the VIOS to trigger an unnecessary FATAL
error reset.
Instead of flushing packets remaining on the indirect_buff, discard
(clean) them instead.
Fixes: 0d973388185d4 ("ibmvnic: Introduce xmit_more support using batched subCRQ hcalls")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts commit 7c451f3ef676c805a4b77a743a01a5c21a250a73.
When a vnic interface is taken down and then up, connectivity is not
restored. We bisected it to this commit. Reverting this commit until
we can fully investigate the issue/benefit of the change.
Fixes: 7c451f3ef676 ("ibmvnic: remove duplicate napi_schedule call in open function")
Reported-by: Cristobal Forno <[email protected]>
Reported-by: Abdul Haleem <[email protected]>
Signed-off-by: Dany Madden <[email protected]>
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts commit 1c7d45e7b2c29080bf6c8cd0e213cc3cbb62a054.
We tried to optimize the number of hcalls we send and skipped sending
the REQUEST_MAP calls for some maps. However during resets, we need to
resend all the maps to the VIOS since the VIOS does not remember the
old values. In fact we may have failed over to a new VIOS which will
not have any of the mappings.
When we send packets with map ids the VIOS does not know about, it
triggers a FATAL reset. While the client does recover from the FATAL
error reset, we are seeing a large number of such resets. Handling
FATAL resets is lot more unnecessary work than issuing a few more
hcalls so revert the commit and resend the maps to the VIOS.
Fixes: 1c7d45e7b2c ("ibmvnic: simplify reset_long_term_buff function")
Signed-off-by: Sukadev Bhattiprolu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
A recent change that made i40e use new udp_tunnel infrastructure
uses a method that expects to be called under rtnl lock.
However, not all codepaths do the lock prior to calling
i40e_setup_pf_switch.
Fix that by adding additional rtnl locking and unlocking.
Fixes: 40a98cb6f01f ("i40e: convert to new udp_tunnel infrastructure")
Signed-off-by: Jan Sokolowski <[email protected]>
Signed-off-by: Mateusz Palczewski <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
As reported by Alex Sergeev, the i40e driver is incrementing the PTP
clock at 40Gb speeds when linked at 5Gb. Fix this bug by making
sure that the right multiplier is selected when linked at 5Gb.
Fixes: 3dbdd6c2f70a ("i40e: Add support for 5Gbps cards")
Cc: [email protected]
Reported-by: Alex Sergeev <[email protected]>
Suggested-by: Alex Sergeev <[email protected]>
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Tony Brelinski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
|
|
The cpsw driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
program invocations. However, the actual lifetime of the objects referred
by the XDP program invocation is longer, all the way through to the call to
xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
turns out to be harmless because it all happens in a single NAPI poll
cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
misleading.
Rather than extend the scope of the rcu_read_lock(), just get rid of it
entirely. With the addition of RCU annotations to the XDP_REDIRECT map
types that take bh execution into account, lockdep even understands this to
be safe, so there's really no reason to keep it around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Grygorii Strashko <[email protected]>
Reviewed-by: Grygorii Strashko <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The stmmac driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
program invocations. However, the actual lifetime of the objects referred
by the XDP program invocation is longer, all the way through to the call to
xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
turns out to be harmless because it all happens in a single NAPI poll
cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
misleading.
Rather than extend the scope of the rcu_read_lock(), just get rid of it
entirely. With the addition of RCU annotations to the XDP_REDIRECT map
types that take bh execution into account, lockdep even understands this to
be safe, so there's really no reason to keep it around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Giuseppe Cavallaro <[email protected]>
Cc: Alexandre Torgue <[email protected]>
Cc: Jose Abreu <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The netsec driver has a rcu_read_lock()/rcu_read_unlock() pair around the
full RX loop, covering everything up to and including xdp_do_flush(). This
is actually the correct behaviour, but because it all happens in a single
NAPI poll cycle (and thus under local_bh_disable()), it is also technically
redundant.
With the addition of RCU annotations to the XDP_REDIRECT map types that
take bh execution into account, lockdep even understands this to be safe,
so there's really no reason to keep the rcu_read_lock() around anymore, so
let's just remove it.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Ilias Apalodimas <[email protected]>
Cc: Jassi Brar <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The sfc driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
program invocations. However, the actual lifetime of the objects referred
by the XDP program invocation is longer, all the way through to the call to
xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
turns out to be harmless because it all happens in a single NAPI poll
cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
misleading.
Rather than extend the scope of the rcu_read_lock(), just get rid of it
entirely. With the addition of RCU annotations to the XDP_REDIRECT map
types that take bh execution into account, lockdep even understands this to
be safe, so there's really no reason to keep it around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Edward Cree <[email protected]>
Cc: Martin Habets <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The qede driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
program invocations. However, the actual lifetime of the objects referred
by the XDP program invocation is longer, all the way through to the call to
xdp_do_flush(), making the scope of the rcu_read_lock() too small. This
turns out to be harmless because it all happens in a single NAPI poll
cycle (and thus under local_bh_disable()), but it makes the rcu_read_lock()
misleading.
Rather than extend the scope of the rcu_read_lock(), just get rid of it
entirely. With the addition of RCU annotations to the XDP_REDIRECT map
types that take bh execution into account, lockdep even understands this to
be safe, so there's really no reason to keep it around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Cc: Ariel Elior <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The nfp driver has rcu_read_lock()/rcu_read_unlock() pairs around XDP
program invocations. However, the actual lifetime of the objects referred
by the XDP program invocation is longer, all the way through to the call to
xdp_do_flush(), making the scope of the rcu_read_lock() too small.
While this is not actually an issue for the nfp driver because it doesn't
support XDP_REDIRECT (and thus doesn't call xdp_do_flush()), the
rcu_read_lock() is still unneeded. And With the addition of RCU annotations
to the XDP_REDIRECT map types that take bh execution into account, lockdep
even understands this to be safe, so there's really no reason to keep it
around.
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/bpf/[email protected]
|