Age | Commit message (Collapse) | Author | Files | Lines |
|
ocelot_init() allocates memory, resets the switch and polls for a status
register, things which can fail. Stop probing the driver in that case,
and propagate the error result.
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Tested-by: Alexandre Belloni <[email protected]>
Reviewed-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Do not proceed probing if we couldn't allocate memory for the ports
array, just error out.
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Tested-by: Alexandre Belloni <[email protected]>
Reviewed-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The VSC9953 Seville switch has 2 megabits of buffer split into 4360
words of 60 bytes each.
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Tested-by: Alexandre Belloni <[email protected]>
Reviewed-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ocelot_port->ts_id is used to:
(a) populate skb->cb[0] for matching the TX timestamp in the PTP IRQ
with an skb.
(b) populate the REW_OP from the injection header of the ongoing skb.
Only then is ocelot_port->ts_id incremented.
This is a problem because, at least theoretically, another timestampable
skb might use the same ocelot_port->ts_id before that is incremented.
Normally all transmit calls are serialized by the netdev transmit
spinlock, but in this case, ocelot_port_add_txtstamp_skb() is also
called by DSA, which has started declaring the NETIF_F_LLTX feature
since commit 2b86cb829976 ("net: dsa: declare lockless TX feature for
slave ports"). So the logic of using and incrementing the timestamp id
should be atomic per port.
The solution is to use the global ocelot_port->ts_id only while
protected by the associated ocelot_port->ts_id_lock. That's where we
populate skb->cb[0]. Note that for ocelot, ocelot_port_add_txtstamp_skb
is called for the actual skb, but for felix, it is called for the skb's
clone. That is something which will also be changed in the future.
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Tested-by: Alexandre Belloni <[email protected]>
Reviewed-by: Alexandre Belloni <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The TX-timestampable skb is added late to the ocelot_port->tx_skbs. It
is in a race with the TX timestamp IRQ, which checks that queue trying
to match the timestamp with the skb by the ts_id. The skb should be
added to the queue before the IRQ can fire.
Fixes: 4e3b0468e6d7 ("net: mscc: PTP Hardware Clock (PHC) support")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Horatiu Vultur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In order to unify the tx status path, the hw 802.11 encapsulation flag
needs to survive the trip to the tx status call.
Since we don't have any free bits in info->flags, we need to move one.
IEEE80211_TX_INTFL_NEED_TXPROCESSING is only used internally in mac80211,
and only before the call into the driver.
Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
The current API (which lets the driver turn on/off per vif directly) has a
number of limitations:
- it does not deal with AP_VLAN
- conditions for enabling (no tkip, no monitor) are only checked at
add_interface time
- no way to indicate 4-addr support
In order to address this, store offload flags in struct ieee80211_vif
(easy to extend for decap offload later). mac80211 initially sets the enable
flag, but gives the driver a chance to modify it before its settings are
applied. In addition to the .add_interface op, a .update_vif_offload op is
introduced, which can be used for runtime changes.
If a driver can't disable encap offload at runtime, or if it has some extra
limitations, it can simply override the flags within those ops.
Support for encap offload with 4-address mode interfaces can be enabled
by setting a flag from .add_interface or .update_vif_offload.
Signed-off-by: Felix Fietkau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[resolved conflict with commit aa2092a9bab3 ("ath11k: add raw mode and
software crypto support")]
Signed-off-by: Johannes Berg <[email protected]>
|
|
>> ld.lld: error: undefined symbol: net_dim_get_rx_moderation
>>> referenced by ionic_lif.c:52 (drivers/net/ethernet/pensando/ionic/ionic_lif.c:52)
>>> net/ethernet/pensando/ionic/ionic_lif.o:(ionic_dim_work) in archive drivers/built-in.a
>> ld.lld: error: undefined symbol: net_dim
>>> referenced by ionic_txrx.c:456 (drivers/net/ethernet/pensando/ionic/ionic_txrx.c:456)
>>> net/ethernet/pensando/ionic/ionic_txrx.o:(ionic_dim_update) in archive drivers/built-in.a
v2: removed sketchy dashes in commit message
Fixes: 04a834592bf5 ("ionic: dynamic interrupt moderation")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
struct ethtool_fecparam carries bitmasks not bit numbers.
We want to return 1 (NONE), not 0.
Fixes: 0d0870938337 ("nfp: implement ethtool FEC mode settings")
Signed-off-by: Jakub Kicinski <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Jesse Brandeburg <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This driver makes use of devm_mdiobus_alloc_size. To ensure this is
available select MDIO_DEVRES which provides it.
Signed-off-by: Andrew Lunn <[email protected]>
Reported-by: Randy Dunlap <[email protected]>
Acked-by: Randy Dunlap <[email protected]> # build-tested
Signed-off-by: David S. Miller <[email protected]>
|
|
Signed-off-by: Qinglang Miao <[email protected]>
Reviewed-by: Ioana Ciornei <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In rx_request_irq(), it will just return what irq_set_affinity_hint()
returns. If it is failed, the napi and irq requested are not freed
properly. So add exits for failures to handle these.
Signed-off-by: Wei Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are two interfaces to configure ETS: qdiscs and DCB. Historically,
DCB ETS configuration was projected to ingress as well, and configured port
buffers. Qdisc was not.
Keep qdiscs behaving this way, and if an offloaded qdisc is configured on a
port, move this port's headroom to a manual mode, thus allowing
configuration of port buffers through dcbnl_setbuffer.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add dcbnl_setbuffer, which bounces requests if a headroom is in DCB mode.
Implement dcbnl_getbuffer such that it can always be used to determine
port-buffer configuration, regardless of headroom mode.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are two interfaces to configure ETS: qdiscs and DCB. Historically,
DCB ETS configuration was projected to ingress as well, and configured port
buffers. Qdisc was not.
So as not to break clients that today use DCB ETS and PFC and rely on
getting a reasonable ingress buffer priomap, keep the ETS mirroring in
effect.
Since qdiscs have not done this mirroring historically, it is reasonable
not to introduce it, but rather permit manual ingress configuration through
dcbnl_setbuffer only in the qdisc mode.
This will require a toggle to indicate whether buffer sizes should be
autocomputed or taken from dcbnl_setbuffer, and likewise for priomaps.
Introduce such and initialize it, and guard port buffer size configuration
as appropriate. The toggle is currently left in the DCB position. In a
following patch, qdisc code will switch it.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When phy_is_started() was added to catch incorrect PHY states,
phy_stop() would not be qualified against PHY_DOWN. It is possible to
reach that state when the PHY driver has been unbound and the network
device is then brought down.
Fixes: 2b3e88ea6528 ("net: phy: improve phy state checking")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If we have unbound the PHY driver prior to calling phy_detach() (often
via phy_disconnect()) then we can cause a NULL pointer de-reference
accessing the driver owner member. The steps to reproduce are:
echo unimac-mdio-0:01 > /sys/class/net/eth0/phydev/driver/unbind
ip link set eth0 down
Fixes: cafe8df8b9bc ("net: phy: Fix lack of reference count on PHY driver")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch sets skb->protocol before transmitting frames on the HDLC
device, so that a user listening on the HDLC device with an AF_PACKET
socket will see outgoing frames' sll_protocol field correctly set and
consistent with that of incoming frames.
1. Control frames in hdlc_cisco and hdlc_ppp
When these drivers send control frames, skb->protocol is not set.
This value should be set to htons(ETH_P_HDLC), because when receiving
control frames, their skb->protocol is set to htons(ETH_P_HDLC).
When receiving, hdlc_type_trans in hdlc.h is called, which then calls
cisco_type_trans or ppp_type_trans. The skb->protocol of control frames
is set to htons(ETH_P_HDLC) so that the control frames can be received
by hdlc_rcv in hdlc.c, which calls cisco_rx or ppp_rx to process the
control frames.
2. hdlc_fr
When this driver sends control frames, skb->protocol is set to internal
values used in this driver.
When this driver sends data frames (from upper stacked PVC devices),
skb->protocol is the same as that of the user data packet being sent on
the upper PVC device (for normal PVC devices), or is htons(ETH_P_802_3)
(for Ethernet-emulating PVC devices).
However, skb->protocol for both control frames and data frames should be
set to htons(ETH_P_HDLC), because when receiving, all frames received on
the HDLC device will have their skb->protocol set to htons(ETH_P_HDLC).
When receiving, hdlc_type_trans in hdlc.h is called, and because this
driver doesn't provide a type_trans function in struct hdlc_proto,
all frames will have their skb->protocol set to htons(ETH_P_HDLC).
The frames are then received by hdlc_rcv in hdlc.c, which calls fr_rx
to process the frames (control frames are consumed and data frames
are re-received on upper PVC devices).
Cc: Krzysztof Halasa <[email protected]>
Signed-off-by: Xie He <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This driver is a virtual driver stacked on top of Ethernet interfaces.
When this driver transmits data on the Ethernet device, the skb->protocol
setting is inconsistent with the Ethernet header prepended to the skb.
This causes a user listening on the Ethernet interface with an AF_PACKET
socket, to see different sll_protocol values for incoming and outgoing
frames, because incoming frames would have this value set by parsing the
Ethernet header.
This patch changes the skb->protocol value for outgoing Ethernet frames,
making it consistent with the Ethernet header prepended. This makes a
user listening on the Ethernet device with an AF_PACKET socket, to see
the same sll_protocol value for incoming and outgoing frames.
Cc: Martin Schiller <[email protected]>
Signed-off-by: Xie He <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The following features are supported:
- VLAN-aware bridge offloading
- VLAN-unaware bridge offloading
- FDB offloading (learning, ageing)
- Switchport configuration
Currently there are some limitations like:
- Only 1 VLAN-aware bridge instance supported
- FDB ageing timeout parameter is set globally per device
Co-developed-by: Serhiy Boiko <[email protected]>
Signed-off-by: Serhiy Boiko <[email protected]>
Co-developed-by: Serhiy Pshyk <[email protected]>
Signed-off-by: Serhiy Pshyk <[email protected]>
Co-developed-by: Taras Chornyi <[email protected]>
Signed-off-by: Taras Chornyi <[email protected]>
Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ethtool API provides support for the configuration of the following
features: speed and duplex, auto-negotiation, MDI-x, forward error
correction, port media type. The API also provides information about the
port status, hardware and software statistic. The following limitation
exists:
- port media type should be configured before speed setting
- ethtool -m option is not supported
- ethtool -p option is not supported
- ethtool -r option is supported for RJ45 port only
- the following combination of parameters is not supported:
ethtool -s sw1pX port XX autoneg on
- forward error correction feature is supported only on SFP ports, 10G
speed
- auto-negotiation and MDI-x features are not supported on
Copper-to-Fiber SFP module
Co-developed-by: Andrii Savka <[email protected]>
Signed-off-by: Andrii Savka <[email protected]>
Co-developed-by: Serhiy Boiko <[email protected]>
Signed-off-by: Serhiy Boiko <[email protected]>
Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add very basic support for devlink interface:
- driver name
- fw version
- devlink ports
Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add PCI interface driver for Prestera Switch ASICs family devices, which
provides:
- Firmware loading mechanism
- Requests & events handling to/from the firmware
- Access to the firmware on the bus level
The firmware has to be loaded each time the device is reset. The driver
is loading it from:
/lib/firmware/mrvl/prestera/mvsw_prestera_fw-v{MAJOR}.{MINOR}.img
The full firmware image version is located within the internal header
and consists of 3 numbers - MAJOR.MINOR.PATCH. Additionally, driver has
hard-coded minimum supported firmware version which it can work with:
MAJOR - reflects the support on ABI level between driver and loaded
firmware, this number should be the same for driver and loaded
firmware.
MINOR - this is the minimum supported version between driver and the
firmware.
PATCH - indicates only fixes, firmware ABI is not changed.
Firmware image file name contains only MAJOR and MINOR numbers to make
driver be compatible with any PATCH version.
Co-developed-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Marvell Prestera 98DX326x integrates up to 24 ports of 1GbE with 8
ports of 10GbE uplinks or 2 ports of 40Gbps stacking for a largely
wireless SMB deployment.
The current implementation supports only boards designed for the Marvell
Switchdev solution and requires special firmware.
The core Prestera switching logic is implemented in prestera_main.c,
there is an intermediate hw layer between core logic and firmware. It is
implemented in prestera_hw.c, the purpose of it is to encapsulate hw
related logic, in future there is a plan to support more devices with
different HW related configurations.
This patch contains only basic switch initialization and RX/TX support
over SDMA mechanism.
Currently supported devices have DMA access range <= 32bit and require
ZONE_DMA to be enabled, for such cases SDMA driver checks if the skb
allocated in proper range supported by the Prestera device.
Also meanwhile there is no TX interrupt support in current firmware
version so recycling work is scheduled on each xmit.
Port's mac address is generated from the switch base mac which may be
provided via device-tree (static one or as nvme cell), or randomly
generated. This is required by the firmware.
Co-developed-by: Andrii Savka <[email protected]>
Signed-off-by: Andrii Savka <[email protected]>
Co-developed-by: Oleksandr Mazur <[email protected]>
Signed-off-by: Oleksandr Mazur <[email protected]>
Co-developed-by: Serhiy Boiko <[email protected]>
Signed-off-by: Serhiy Boiko <[email protected]>
Co-developed-by: Serhiy Pshyk <[email protected]>
Signed-off-by: Serhiy Pshyk <[email protected]>
Co-developed-by: Taras Chornyi <[email protected]>
Signed-off-by: Taras Chornyi <[email protected]>
Co-developed-by: Volodymyr Mytnyk <[email protected]>
Signed-off-by: Volodymyr Mytnyk <[email protected]>
Signed-off-by: Vadym Kochan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix the memory leak in mps during module unload
path by freeing mps reference entries if the list
adpter->mps_ref is not already empty
Fixes: 28b3870578ef ("cxgb4: Re-work the logic for mps refcounting")
Signed-off-by: Raju Rangoju <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
For additional robustness in the face of Hyper-V errors or malicious
behavior, validate all values that originate from packets that Hyper-V
has sent to the guest in the host-to-guest ring buffer. Ensure that
invalid values cannot cause indexing off the end of an array, or
subvert an existing validation via integer overflow. Ensure that
outgoing packets do not have any leftover guest memory that has not
been zeroed out.
Signed-off-by: Andres Beltran <[email protected]>
Co-developed-by: Andrea Parri (Microsoft) <[email protected]>
Signed-off-by: Andrea Parri (Microsoft) <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: [email protected]
Reviewed-by: Haiyang Zhang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use napi_consume_skb() to batch consuming skb when cleaning
tx desc in NAPI polling.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
writel() can be used to order I/O vs memory by default when
writing portable drivers. Use writel() to replace wmb() +
writel_relaxed(), and writel() is dma_wmb() + writel_relaxed()
for ARM64, so there is an optimization here because dma_wmb()
is a lighter barrier than wmb().
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently HNS3_RING_RX_RING_FBDNUM_REG register is read to determine
how many rx desc can be cleaned. To avoid the register read operation
in the critical data path, use the valid bit in the rx desc to determine
if a specific rx desc can be cleaned.
The hns3 driver clear valid bit in the rx desc before notifying the
rx desc to the hw, and hw will only set the valid bit of the rx desc
after corresponding buffer is filled with packet data and other field
in the rx desc is set accordingly.
Add hns3_rx_ring_move_fw() function to clear the valid bit in the rx
desc before moving rx ring's next_to_clean forward to avoid double
cleaning a rx desc, also add a dma_rmb() barrier in hns3_handle_rx_bd()
to make sure valid bit is set before reading other field in the rx desc.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently HNS3_RING_TX_RING_HEAD_REG register is read to determine
how many tx desc can be cleaned. To avoid the register read operation
in the critical data path, use the valid bit in the tx desc to determine
if a specific tx desc can be cleaned.
The hns3 driver sets valid bit in the tx desc before ringing a doorbell
to the hw, and hw will only clear the valid bit of the tx desc after
corresponding packet is sent out to the wire. And because next_to_use
for tx ring is a changing variable when the driver is filling the tx
desc, so reuse the pull_len for rx ring to record the tx desc that has
notified to the hw, so that hns3_nic_reclaim_desc() can decide how many
tx desc's valid bit need checking when reclaiming tx desc.
And io_err_cnt stat is also removed for it is not used anymore.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use netdev_xmit_more() to defer the tx doorbell operation when
the skb is passed to the driver continuously. By doing this we
can improve the overall xmit performance by avoid some doorbell
operations.
Also, the tx_err_cnt stat is not used, so rename it to tx_more
stat.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Batch the page reference count updates instead of doing them
one at a time. By doing this we can improve the overall receive
performance by avoid some atomic increment operations when the
rx page is reused.
Signed-off-by: Yunsheng Lin <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The KSZ9477 and KSZ8795 use the port_cnt field differently: For the
KSZ9477, it includes the CPU port(s), while for the KSZ8795, it doesn't.
It would be a good cleanup to make the handling of both drivers match,
but as a first step, fix the recently broken assignment of num_ports in
the KSZ8795 driver (which completely broke probing, as the CPU port
index was always failing the num_ports check).
Fixes: af199a1a9cb0 ("net: dsa: microchip: set the correct number of ports")
Signed-off-by: Matthias Schiffer <[email protected]>
Reviewed-by: Codrin Ciubotariu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds transport ports information for route lookup so that
IPsec can select Geneve tunnel traffic to do encryption. This is
needed for OVS/OVN IPsec with encrypted Geneve tunnels.
This can be tested by configuring a host-host VPN using an IKE
daemon and specifying port numbers. For example, for an
Openswan-type configuration, the following parameters should be
configured on both hosts and IPsec set up as-per normal:
$ cat /etc/ipsec.conf
conn in
...
left=$IP1
right=$IP2
...
leftprotoport=udp/6081
rightprotoport=udp
...
conn out
...
left=$IP1
right=$IP2
...
leftprotoport=udp
rightprotoport=udp/6081
...
The tunnel can then be setup using "ip" on both hosts (but
changing the relevant IP addresses):
$ ip link add tun type geneve id 1000 remote $IP2
$ ip addr add 192.168.0.1/24 dev tun
$ ip link set tun up
This can then be tested by pinging from $IP1:
$ ping 192.168.0.2
Without this patch the traffic is unencrypted on the wire.
Fixes: 2d07dc79fe04 ("geneve: add initial netdev driver for GENEVE tunnels")
Signed-off-by: Qiuyu Xiao <[email protected]>
Signed-off-by: Mark Gray <[email protected]>
Reviewed-by: Greg Rose <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use DEFINE_SEQ_ATTRIBUTE macro to simplify the code.
Signed-off-by: Liu Shixin <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix some parameter description or spelling mistakes.
Signed-off-by: Lu Wei <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use the dim library to manage dynamic interrupt
moderation in ionic.
v3: rebase
v2: untangled declarations in ionic_dim_work()
Signed-off-by: Shannon Nelson <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch add support to --show-ring & --set-ring Ethtool functions:
- Adding min, max, power of two check to new ring parameter's value.
- Bring down the network interface before changing the value of ring
parameters.
- Bring up the network interface after changing the value of ring
parameters.
Signed-off-by: Song, Yoong Siang <[email protected]>
Signed-off-by: Voon Weifeng <[email protected]>
Signed-off-by: Ong Boon Leong <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Traffic mirroring modes that are in-chip implemented on egress need an
internal buffer to work. As the only client, the SPAN module was managing
the buffer so far. However logically it belongs to the buffers module. E.g.
buffer size validation needs to take the size of the internal buffer into
account.
Therefore move the related code from SPAN to spectrum_buffers. Move over
the callbacks that determine the minimum buffer size as a function of
maximum speed and MTU. Add a field describing the internal buffer to struct
mlxsw_sp_hdroom. Extend mlxsw_sp_hdroom_bufs_reset_sizes() to take care of
sizing the internal buffer as well. Change the SPAN module to invoke that
function and mlxsw_sp_hdroom_configure() like all the other hdroom clients.
Drop the now-unnecessary mlxsw_sp_span_port_buffer_disable().
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The size of the internal buffer is currently calculated in the SPAN module.
Logically it belongs to the spectrum_buffers module, where it should be
moved. However, that being a chip-specific operation, it needs dynamic
dispatch. There currently is a chip-specific structure for description of
shared buffer values, struct mlxsw_sp_sb_vals. However placing ops into
this structure would be confusing. Therefore introduce a new per-chip
structure, currently empty, and initialize the ops pointer as appropriate.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently mlxsw_sp_port_headroom_init() configures both priomap and buffers
by hand. Additionally, for port buffers, it configures buffer 0 with a size
that it will never again have if PFC configuration is touched.
Rewrite the init code to become a client of the new hdroom code. The only
difference in invocation is that the configuration is forced, so that it is
issued even if the desired configuration happens to match what is contained
in (hitherto not initialized with meaningful values) mlxsw_sp_port->hdroom.
Since now mlxsw_sp_port_headroom_init() initializes all the PG buffers to
meaningful values, mlxsw_sp_hdroom_configure_buffers() can avoid querying
the current configuration, and can fill the whole PBMC itself.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This function is now only used from the buffers module, and is a trivial
field reference. Just inline it and drop the related artifacts.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Move all the headroom code to the spectrum_buffers module, where it
belongs.
Rename mlxsw_sp_pg_buf_threshold_get() and mlxsw_sp_pg_buf_pack() to
..._hdroom_... to match the naming convention of the new headroom code.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ETS handler performs the headroom configuration in three steps: first
it resizes the buffers and adds any new ones. Then it redirects priorities
to the new buffers. And finally it sets the size of the now-unused buffers
to zero. This way no packet drops are introduced.
This sort of careful approach will also be useful for configuring port
buffer sizes and priority map by hand, through dcbnl_setbuffer. Therefore
move the code from the DCB handler to the generic headroom function.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The new hdroom code has certain conventions: iteration over priorities is
done through a variable named `prio', configuration is not pushed unless it
is dirty, but a `force' flag can be used to override this, updated
configuration is written to port. Convert the function
mlxsw_sp_port_pg_prio_map() to use these conventions and rename
appropriately to fit in.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The ETS handler performs the headroom configuration in three steps: first
it resizes the buffers and adds any new ones. Then it redirects priorities
to the new buffers. And finally it sets the size of the now-unused buffers
to zero. This way no packet drops are introduced.
Both of the buffer size configuration operations are simply buffer size
configurations, there is no material difference between setting buffers to
zero and any other value. Therefore simply invoke the same
mlxsw_sp_hdroom_configure(), and drop mlxsw_sp_port_pg_destroy() and
mlxsw_sp_ets_has_pg() which are now unused.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Split mlxsw_sp_port_headroom_set() to three functions.
mlxsw_sp_hdroom_bufs_reset_sizes() changes the sizes of the individual PG
buffers, and mlxsw_sp_hdroom_configure_buffers() will actually apply the
configuration. A third function, mlxsw_sp_hdroom_bufs_fit(), verifies that
the requested buffer configuration matches total headroom size
requirements.
Add wrappers, mlxsw_sp_hdroom_configure() and __..., that will eventually
perform full headroom configuration, but for now, only have them verify the
configured headroom size, and invoke mlxsw_sp_hdroom_configure_buffers().
Have them take the `force` argument to prepare for a later patch, even
though it is currently unused.
Note that the loop in mlxsw_sp_hdroom_configure_buffers() only goes through
DCBX_MAX_BUFFERS. Since there is no logic to configure the control buffer,
it needs to keep the values queried from the FW. Eventually this function
should configure all the PGs.
Note that conversion of __mlxsw_sp_dcbnl_ieee_setets() is not trivial. That
function performs the headroom configuration in three steps: first it
resizes the buffers and adds any new ones. Then it redirects priorities to
the new buffers. And finally it sets the size of the now-unused buffers to
zero. This way no packet drops are introduced.
So after invoking mlxsw_sp_hdroom_bufs_reset_sizes(), tweak the
configuration to keep the old sizes of PG buffers for those buffers whose
size was set to zero.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
So far, port buffers were always autoconfigured. When dcbnl_setbuffer
callback is implemented, it will allow the user to change the buffer size
configuration by hand. The sizes therefore need to be a configuration
parameter, not always deduced, and therefore belong to struct
mlxsw_sp_hdroom, where the configuration routine should take them from.
Update mlxsw_sp_port_headroom_set() to update these sizes. Have the
function update the sizes even for the case that a given buffer is not
used.
Additionally, change the loop iteration end to DCBX_MAX_BUFFERS instead of
IEEE_8021QAZ_MAX_TCS. The value is the same, but the semantics differ.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Client-side configuration has lossiness as an attribute of a priority.
Therefore add a "lossy" attribute to struct mlxsw_sp_hdroom_prio.
To a Spectrum ASIC, lossiness is a feature of a port buffer. Therefore add
struct mlxsw_sp_hdroom_buf, which in the following patches will get more
attributes, but right now only use it to track port buffer lossiness.
Instead of passing around the primary indicators of PFC and pause_en, add a
function mlxsw_sp_hdroom_bufs_reset_lossiness() to compute the buffer
lossiness from the priority map and priority lossiness. Change
mlxsw_sp_port_headroom_set() to take the buffer lossy flag from the
headroom configuration. Have the PFC and pause handlers configure priority
lossiness in mlxsw_sp_hdroom, from where it will propagate.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The mapping from priorities to buffers determines which buffers should be
configured. Lossiness of these priorities combined with the mapping
determines whether a given buffer should be lossy.
Currently this configuration is stored implicitly in DCB ETS, PFC and
ethtool PAUSE configuration. Keeping it together with the rest of the
headroom configuration and deriving it as needed from PFC / ETS / PAUSE
will make things clearer. To that end, add a field "prios" to struct
mlxsw_sp_hdroom.
Previously, __mlxsw_sp_port_headroom_set() took prio_tc as an argument, and
assumed that the same mapping as we use on the egress should be used on
ingress as well. Instead, track this configuration at each priority, so
that it can be adjusted flexibly.
In the following patches, as dcbnl_setbuffer is implemented, it will need
to store its own mapping, and it will also be sometimes necessary to revert
back to the original ETS mapping. Therefore track two buffer indices: the
one for chip configuration (buf_idx), and the source one (ets_buf_idx).
Introduce a function to configure the chip-level buffer index, and for now
have it simply copy the ETS mapping over to the chip mapping.
Update the ETS handler to project prio_tc to the ets_buf_idx and invoke the
buf_idx recomputation.
Now that there is a canonical place to look for this configuration,
mlxsw_sp_port_headroom_set() does not need to invent def_prio_tc to use if
DCB is compiled out.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|