Age | Commit message (Collapse) | Author | Files | Lines |
|
Add steering tables/rules to check if the decrypted traffic is RoCEv2,
if so copy reg_b_metadata to a temp reg and forward it to RDMA_RX domain.
The rules are added once the MACsec device is assigned an IP address
where we verify that the packet ip is for MACsec device and that the temp
reg has MACsec operation and a valid SCI inside.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Add steering table in RDMA_TX domain, to forward MACsec traffic
to MACsec crypto table in NIC domain.
The tables are created in a lazy manner when the first TX SA is
being created, and destroyed upon the destruction of the last SA.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Add MACsec flow steering priorities in RDMA namespaces. This allows
adding tables/rules to forward RoCEv2 traffic to the MACsec crypto
tables in NIC_TX domain, and accept RoCEv2 traffic from NIC_RX domain.
Signed-off-by: Patrisious Haddad <[email protected]>
Reviewed-by: Maor Gottlieb <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Handle MACsec IP ambiguity issue, since mlx5 hw can't support
programming both the MACsec and the physical gid when they have the same
IP address, because it wouldn't know to whom to steer the traffic.
Hence in such case we delete the physical gid from the hw gid table,
which would then cause all traffic sent over it to fail, and we'll only
be able to send traffic over the MACsec gid.
Signed-off-by: Patrisious Haddad <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Reviewed-by: Mark Zhang <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Remove fs_id from the MACsec SA, since it has no real usage there and
instead maintain with the MACsec steering data inside the core.
Downstream patches requires this change to facilitate IB driver accesses
to the fs_ids to avoid RoCE MACsec dependency on EN driver.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Since MACsec steering was moved from ethernet private code to core,
remove the netdevice from the MACsec steering, and use core device
methods for error reporting instead.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
to core
Since now MACsec flow steering (macsec_fs) and MACsec statistics (stats)
are maintained by the core driver, move their data as well to be saved
inside core structures instead of staying part of ethernet MACsec database.
In addition cleanup all MACsec stats functions from the ethernet MACsec
code and move what's needed to be part of macsec_fs instead.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
naming style
Rename MACsec flow steering(macsec_fs) functions and parameters from
ethernet(core/en_accel) naming convention to core naming convention.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Since macsec flow steering was moved to core, it should be independent
of all ethernet code and structures hence we remove all ethernet header
includes and redefine ethernet structs internally for macsec_fs usage
where needed.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Move MACsec flow steering operations(macsec_fs) from core/en_accel to
core/lib, this mandates moving MACsec statistics structure from the
general MACsec code header(en_accel/macsec.h) to macsec_fs header to
remove macsec_fs.h dependency over en_accel/macsec.h.
This to lay the ground for RoCE MACsec by moving all the data
that will need to be accessed by both ethernet MACsec and
RoCE MACsec to be shared at core.
Signed-off-by: Patrisious Haddad <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
Given a macsec net_device add two functions to return the real net_device
for that device, and check if that macsec device is offloaded or not.
This is needed for auxiliary drivers that implement MACsec offload, but
have flows which are triggered over the macsec net_device, this allows
the drivers in such cases to verify if the device is offloaded or not,
and to access the real device of that macsec device, which would
belong to the driver, and would be needed for the offload procedure.
Signed-off-by: Patrisious Haddad <[email protected]>
Reviewed-by: Raed Salem <[email protected]>
Reviewed-by: Mark Zhang <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
|
|
As Simon Horman suggests, update vcap_get_rule() to always
return an ERR_PTR() and update the error detection conditions to
use IS_ERR(), so use IS_ERR() to check the return value.
Signed-off-by: Ruan Jinjie <[email protected]>
Suggested-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
As Simon Horman suggests, update vcap_get_rule() to always
return an ERR_PTR() and update the error detection conditions to
use IS_ERR(), so use IS_ERR() to fix the return value issue.
Fixes: 72df3489fb10 ("net: lan966x: Add ptp trap rules")
Signed-off-by: Ruan Jinjie <[email protected]>
Suggested-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
As Simon Horman suggests, update vcap_get_rule() to always
return an ERR_PTR() and update the error detection conditions to
use IS_ERR(), which would be more cleaner in this case.
Signed-off-by: Ruan Jinjie <[email protected]>
Suggested-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since commit 91a7cda1f4b8 ("net: phy: Fix race condition on link status
change") all the phy_error() method invocations have been causing the
nested-mutex-lock deadlock because it's normally done in the PHY-driver
threaded IRQ handlers which since that change have been called with the
phydev->lock mutex held. Here is the calls thread:
IRQ: phy_interrupt()
+-> mutex_lock(&phydev->lock); <--------------------+
drv->handle_interrupt() | Deadlock due
+-> ERROR: phy_error() + to the nested
+-> phy_process_error() | mutex lock
+-> mutex_lock(&phydev->lock); <-+
phydev->state = PHY_ERROR;
mutex_unlock(&phydev->lock);
mutex_unlock(&phydev->lock);
The problem can be easily reproduced just by calling phy_error() from any
PHY-device threaded interrupt handler. Fix it by dropping the phydev->lock
mutex lock from the phy_process_error() method and printing a nasty error
message to the system log if the mutex isn't held in the caller execution
context.
Note for the fix to work correctly in the PHY-subsystem itself the
phydev->lock mutex locking must be added to the phy_error_precise()
function.
Link: https://lore.kernel.org/netdev/[email protected]
Fixes: 91a7cda1f4b8 ("net: phy: Fix race condition on link status change")
Suggested-by: Andrew Lunn <[email protected]>
Signed-off-by: Serge Semin <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
xgene_mdio_status is declared static, and is only written once by the
driver. It appears to have been this way since the driver was first
added to the kernel tree. No other users can be found, so let's remove
it.
Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
All captured timestamps should be corrected by PHY, MAC and CDC introduced
latency/errors. The CDC correction is already used. Enable MAC propagation delay
correction as well which is available since commit 26cfb838aa00 ("net: stmmac:
correct MAC propagation delay").
Before:
|ptp4l[390.458]: rms 7 max 21 freq +177 +/- 14 delay 357 +/- 1
After:
|ptp4l[620.012]: rms 7 max 20 freq +195 +/- 14 delay 345 +/- 1
Tested on Intel Elkhart Lake.
Signed-off-by: Kurt Kanzenbach <[email protected]>
Reviewed-by: Johannes Zink <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Handle extended compliance code 0x1 (SFF8024_ECC_100G_25GAUI_C2M_AOC)
for active optical cables supporting 25G and 100G speeds.
Since the specification makes no statement about transmitter range, and
as the specific sfp module that had been tested features only 2m fiber -
short-range (SR) modes are selected.
The 100G speed is irrelevant because it would require multiple fibers /
multiple SFP28 modules combined under one netdev.
sfp-bus.c only handles a single module per netdev, so only 25Gbps modes
are selected.
sfp_parse_support already handles SFF8024_ECC_100GBASE_SR4_25GBASE_SR
with compatible properties, however that entry is a contradiction in
itself since with SFP(28) 100GBASE_SR4 is impossible - that would likely
be a mode for qsfp modules only.
Add a case for SFF8024_ECC_100G_25GAUI_C2M_AOC selecting 25gbase-r
interface mode and 25000baseSR link mode.
Also enforce SFP28 bitrate limits on the values read from sfp eeprom as
requested by Russell King.
Tested with fs.com S28-AO02 AOC SFP28 module.
Signed-off-by: Josua Mayer <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Based on the original code semantic in case of Clause 45 MDIO, the address
command is supposed to be followed by the command sending the MMD address,
not the CSR address. The commit 002dd3de097c ("net: mdio: mdio-bitbang:
Separate C22 and C45 transactions") has erroneously broken that. So most
likely due to an unfortunate variable name it switched the code to sending
the CSR address. In our case it caused the protocol malfunction so the
read operation always failed with the turnaround bit always been driven to
one by PHY instead of zero. Fix that by getting back the correct
behaviour: sending MMD address command right after the regular address
command.
Fixes: 002dd3de097c ("net: mdio: mdio-bitbang: Separate C22 and C45 transactions")
Signed-off-by: Serge Semin <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
802.1X PAE frames are link-local frames, therefore they must be trapped to
the CPU port. Currently, the MT753X switches treat 802.1X PAE frames as
regular multicast frames, therefore flooding them to user ports. To fix
this, set 802.1X PAE frames to be trapped to the CPU port(s).
Fixes: b8f126a8d543 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
Signed-off-by: Arınç ÜNAL <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The field 'virtual router' was extended to 12 bits in Spectrum-4.
Therefore, the element 'MLXSW_AFK_ELEMENT_VIRT_ROUTER_MSB' needs 3 bits for
Spectrum < 4 and 4 bits for Spectrum >= 4.
The elements are stored in an internal storage scratchpad. Currently, the
MSB is defined there as 3 bits. It means that for Spectrum-4, only 2K VRFs
can be used for multicast routing, as the highest bit is not really used by
the driver. Fix the definition of 'VIRT_ROUTER_MSB' to use 4 bits. Adjust
the definitions of 'virtual router' field in the blocks accordingly - use
'_avoid_size_check' for Spectrum-2 instead of for Spectrum-4. Fix the mask
in parse function to use 4 bits.
Fixes: 6d5d8ebb881c ("mlxsw: Rename virtual router flex key element")
Signed-off-by: Amit Cohen <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/79bed2b70f6b9ed58d4df02e9798a23da648015b.1692268427.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The two most significant bits of the "local_port" field in the SSPR
register are always cleared since they are overwritten by the deprecated
and overlapping "sub_port" field.
On systems with more than 255 local ports (e.g., Spectrum-4), this
results in the firmware maintaining invalid mappings between system port
and local port. Specifically, two different systems ports (0x1 and
0x101) point to the same local port (0x1), which eventually leads to
firmware errors.
Fix by removing the deprecated "sub_port" field.
Fixes: fd24b29a1b74 ("mlxsw: reg: Align existing registers to use extended local_port field")
Signed-off-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/9b909a3033c8d3d6f67f237306bef4411c5e6ae4.1692268427.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, in Spectrum-2 and above, time stamps are extracted from the CQE
into the time stamp fields in 'struct mlxsw_skb_cb', only when the CQE
time stamp type is UTC. The time stamps are read directly from the CQE and
software can get the time stamp in UTC format using CQEv2.
From Spectrum-4, the time stamps that are read from the CQE are allowed
to be also from MIRROR_UTC type.
Therefore, we get a warning [1] from the driver that the time stamp fields
were not set, when LLDP control packet is sent.
Allow the time stamp type to be MIRROR_UTC and set the time stamp in this
case as well.
[1]
WARNING: CPU: 11 PID: 0 at drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c:1409 mlxsw_sp2_ptp_hwtstamp_fill+0x1f/0x70 [mlxsw_spectrum]
[...]
Call Trace:
<IRQ>
mlxsw_sp2_ptp_receive+0x3c/0x80 [mlxsw_spectrum]
mlxsw_core_skb_receive+0x119/0x190 [mlxsw_core]
mlxsw_pci_cq_tasklet+0x3c9/0x780 [mlxsw_pci]
tasklet_action_common.constprop.0+0x9f/0x110
__do_softirq+0xbb/0x296
irq_exit_rcu+0x79/0xa0
common_interrupt+0x86/0xa0
</IRQ>
<TASK>
Fixes: 4735402173e6 ("mlxsw: spectrum: Extend to support Spectrum-4 ASIC")
Signed-off-by: Danielle Ratson <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/bcef4d044ef608a4e258d33a7ec0ecd91f480db5.1692268427.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit 5d93cfcf7360 ("net: dpaa: Convert to phylink") removed
fman_set_mac_active_pause()/fman_get_pause_cfg() but not declarations.
Commit 48257c4f168e ("Add fs_enet ethernet network driver, for several
embedded platforms.") declared but never implemented
fs_enet_platform_init() and fs_enet_platform_cleanup().
Signed-off-by: Yue Haibing <[email protected]>
Reviewed-by: Sean Anderson <[email protected]>
Acked-by: Madalin Bucur <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
There are two network devices(veth1 and veth3) in ns1, and ipvlan1 with
L3S mode and ipvlan2 with L2 mode are created based on them as
figure (1). In this case, ipvlan_register_nf_hook() will be called to
register nf hook which is needed by ipvlans in L3S mode in ns1 and value
of ipvl_nf_hook_refcnt is set to 1.
(1)
ns1 ns2
------------ ------------
veth1--ipvlan1 (L3S)
veth3--ipvlan2 (L2)
(2)
ns1 ns2
------------ ------------
veth1--ipvlan1 (L3S)
ipvlan2 (L2) veth3
| |
|------->-------->--------->--------
migrate
When veth3 migrates from ns1 to ns2 as figure (2), veth3 will register in
ns2 and calls call_netdevice_notifiers with NETDEV_REGISTER event:
dev_change_net_namespace
call_netdevice_notifiers
ipvlan_device_event
ipvlan_migrate_l3s_hook
ipvlan_register_nf_hook(newnet) (I)
ipvlan_unregister_nf_hook(oldnet) (II)
In function ipvlan_migrate_l3s_hook(), ipvl_nf_hook_refcnt in ns1 is not 0
since veth1 with ipvlan1 still in ns1, (I) and (II) will be called to
register nf_hook in ns2 and unregister nf_hook in ns1. As a result,
ipvl_nf_hook_refcnt in ns1 is decreased incorrectly and this in ns2
is increased incorrectly. When the second net namespace is removed, a
reference count leak warning in ipvlan_ns_exit() will be triggered.
This patch add a check before ipvlan_migrate_l3s_hook() is called. The
warning can be triggered as follows:
$ ip netns add ns1
$ ip netns add ns2
$ ip netns exec ns1 ip link add veth1 type veth peer name veth2
$ ip netns exec ns1 ip link add veth3 type veth peer name veth4
$ ip netns exec ns1 ip link add ipv1 link veth1 type ipvlan mode l3s
$ ip netns exec ns1 ip link add ipv2 link veth3 type ipvlan mode l2
$ ip netns exec ns1 ip link set veth3 netns ns2
$ ip net del ns2
Fixes: 3133822f5ac1 ("ipvlan: use pernet operations and restrict l3s hooks to master netns")
Signed-off-by: Lu Wei <[email protected]>
Reviewed-by: Florian Westphal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a new tx_resets ring counter. This counter will be saved as
tx_total_resets across any reset. Since we currently do a full reset
in bnxt_sched_reset_txr(), the per ring counter will always be cleared
during reset. Only the tx_total_resets count will be meaningful and we
only display this under ethtool -S.
Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The existing driver displays the sum of 4 ring counters under ethtool -S.
These counters are in the array bnxt_sw_func_stats. These counters are
summed at the time of ethtool -S and will be lost when the device is reset.
Replace these counters with the new total ring error counters added in the
last patch. These new counters are saved before reset. ethtool -S will
now display the sum of the saved counters plus the current counters.
Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/
Reviewed-by: Andy Gospodarek <[email protected]>
Reviewed-by: Ajit Khaparde <[email protected]>
Reviewed-by: Somnath Kotur <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, the ring counters are stored in the per ring datastructure.
During reset, all the rings are freed together with the associated
datastructures. As a result, all the ring error counters will be reset
to zero.
Add logic to keep track of the total error counts of all the rings
and save them before reset (including ifdown). The next patch will
display these total ring error counters under ethtool -S.
Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/
Reviewed-by: Ajit Khaparde <[email protected]>
Reviewed-by: Andy Gospodarek <[email protected]>
Reviewed-by: Somnath Kotur <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If we are doing a complete reset with irq_re_init set to true in
bnxt_close_nic(), all the ring structures will be freed. New
structures will be allocated in bnxt_open_nic(). The current code
increments rx_resets counter in bnxt_enable_napi() if bnapi->in_reset
is true. In a complete reset, bnapi->in_reset will never be true
since the structure is just allocated.
Increment the rx_resets counter in bnxt_disable_napi() instead. This
will allow us to save all the ring error counters including the
rx_resets counters in the next patch.
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use the page pool's ability to maintain DMA mappings for us.
This avoids re-mapping of the recycled pages.
Link: https://lore.kernel.org/netdev/[email protected]/
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Convert to use the page pool buffers for the aggregation ring when
running in non-XDP mode. This simplifies the driver and we benefit
from the recycling of pages. Adjust the page pool size to account
for the aggregation ring size.
Signed-off-by: Somnath Kotur <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-08-17 (ice)
This series contains updates to ice driver only.
Jan removes unused functions and refactors code to make, possible,
functions static.
Jake rearranges some functions to be logically grouped.
Marcin removes an unnecessary call to disable VLAN stripping.
Yang Yingliang utilizes list_for_each_entry() helper for a couple list
traversals.
Przemek removes some parameters from ice_aq_alloc_free_res() which were
always the same and reworks ice_aq_wait_for_event() to reduce chance of
race.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: split ice_aq_wait_for_event() func into two
ice: embed &ice_rq_event_info event into struct ice_aq_task
ice: ice_aq_check_events: fix off-by-one check when filling buffer
ice: drop two params from ice_aq_alloc_free_res()
ice: use list_for_each_entry() helper
ice: Remove redundant VSI configuration in eswitch setup
ice: move E810T functions to before device agnostic ones
ice: refactor ice_vsi_is_vlan_pruning_ena
ice: refactor ice_ptp_hw to make functions static
ice: refactor ice_sched to make functions static
ice: Utilize assign_bit() helper
ice: refactor ice_vf_lib to make functions static
ice: refactor ice_lib to make functions static
ice: refactor ice_ddp to make functions static
ice: remove unused methods
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The blamed commit resolved a bug where frames would still get stuck at
egress, even though they're smaller than the maxSDU[tc], because the
driver did not take into account the extra 33 ns that the queue system
needs for scheduling the frame.
It now takes that into account, but the arithmetic that we perform in
vsc9959_tas_remaining_gate_len_ps() is buggy, because we operate on
64-bit unsigned integers, so gate_len_ns - VSC9959_TAS_MIN_GATE_LEN_NS
may become a very large integer if gate_len_ns < 33 ns.
In practice, this means that we've introduced a regression where all
traffic class gates which are permanently closed will not get detected
by the driver, and we won't enable oversize frame dropping for them.
Before:
mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000
mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 1 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 2 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 3 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 4 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 5 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 6 min gate len 0, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS
After:
mscc_felix 0000:00:00.5: port 0: max frame size 1526 needs 12400000 ps, 1152000 ps for mPackets at speed 1000
mscc_felix 0000:00:00.5: port 0 tc 0 min gate len 1000000, sending all frames
mscc_felix 0000:00:00.5: port 0 tc 1 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 2 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 3 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 4 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 5 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 6 min gate length 0 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 1 octets including FCS
mscc_felix 0000:00:00.5: port 0 tc 7 min gate length 5120 ns not enough for max frame size 1526 at 1000 Mbps, dropping frames over 615 octets including FCS
Fixes: 11afdc6526de ("net: dsa: felix: tc-taprio intervals smaller than MTU should send at least one packet")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Return PTR_ERR_OR_ZERO() instead of return 0 or PTR_ERR() to
simplify code.
Signed-off-by: Ruan Jinjie <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Since debugfs_create_dir() returns ERR_PTR, IS_ERR() is enough to
check whether the directory is successfully created. So remove the
redundant NULL check.
Signed-off-by: Ruan Jinjie <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
On SDP interfaces, frame oversize and undersize errors are
observed as driver is not considering packet sizes of all
subscribers of the link before updating the link config.
This patch fixes the same.
Fixes: 9b7dd87ac071 ("octeontx2-af: Support to modify min/max allowed packet lengths")
Signed-off-by: Hariprasad Kelam <[email protected]>
Signed-off-by: Sunil Goutham <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
do_pci_disable_device() disable PCI bus-mastering as following:
static void do_pci_disable_device(struct pci_dev *dev)
{
u16 pci_command;
pci_read_config_word(dev, PCI_COMMAND, &pci_command);
if (pci_command & PCI_COMMAND_MASTER) {
pci_command &= ~PCI_COMMAND_MASTER;
pci_write_config_word(dev, PCI_COMMAND, pci_command);
}
pcibios_disable_device(dev);
}
And pci_disable_device() sets dev->is_busmaster to 0.
pci_enable_device() is called only once before calling to
pci_disable_device() and such pci_clear_master() is not needed. So remove
redundant pci_clear_master().
Also rename goto label 'err_out_clear_master' to 'err_out_disable_device'.
Signed-off-by: Yu Liao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Acked-by: Shannon Nelson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
virtchnl: fix fake 1-elem arrays
Alexander Lobakin says:
6.5-rc1 started spitting warning splats when composing virtchnl
messages, precisely on virtchnl_rss_key and virtchnl_lut:
[ 84.167709] memcpy: detected field-spanning write (size 52) of single
field "vrk->key" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1095
(size 1)
[ 84.169915] WARNING: CPU: 3 PID: 11 at drivers/net/ethernet/intel/
iavf/iavf_virtchnl.c:1095 iavf_set_rss_key+0x123/0x140 [iavf]
...
[ 84.191982] Call Trace:
[ 84.192439] <TASK>
[ 84.192900] ? __warn+0xc9/0x1a0
[ 84.193353] ? iavf_set_rss_key+0x123/0x140 [iavf]
[ 84.193818] ? report_bug+0x12c/0x1b0
[ 84.194266] ? handle_bug+0x42/0x70
[ 84.194714] ? exc_invalid_op+0x1a/0x50
[ 84.195149] ? asm_exc_invalid_op+0x1a/0x20
[ 84.195592] ? iavf_set_rss_key+0x123/0x140 [iavf]
[ 84.196033] iavf_watchdog_task+0xb0c/0xe00 [iavf]
...
[ 84.225476] memcpy: detected field-spanning write (size 64) of single
field "vrl->lut" at drivers/net/ethernet/intel/iavf/iavf_virtchnl.c:1127
(size 1)
[ 84.227190] WARNING: CPU: 27 PID: 1044 at drivers/net/ethernet/intel/
iavf/iavf_virtchnl.c:1127 iavf_set_rss_lut+0x123/0x140 [iavf]
...
[ 84.246601] Call Trace:
[ 84.247228] <TASK>
[ 84.247840] ? __warn+0xc9/0x1a0
[ 84.248263] ? iavf_set_rss_lut+0x123/0x140 [iavf]
[ 84.248698] ? report_bug+0x12c/0x1b0
[ 84.249122] ? handle_bug+0x42/0x70
[ 84.249549] ? exc_invalid_op+0x1a/0x50
[ 84.249970] ? asm_exc_invalid_op+0x1a/0x20
[ 84.250390] ? iavf_set_rss_lut+0x123/0x140 [iavf]
[ 84.250820] iavf_watchdog_task+0xb16/0xe00 [iavf]
Gustavo already tried to fix those back in 2021[0][1]. Unfortunately,
a VM can run a different kernel than the host, meaning that those
structures are sorta ABI.
However, it is possible to have proper flex arrays + struct_size()
calculations and still send the very same messages with the same sizes.
The common rule is:
elem[1] -> elem[]
size = struct_size() + <difference between the old and the new msg size>
The "old" size in the current code is calculated 3 different ways for
10 virtchnl structures total. Each commit addresses one of the ways
cumulatively instead of per-structure.
I was planning to send it to -net initially, but given that virtchnl was
renamed from i40evf and got some fat style cleanup commits in the past,
it's not very straightforward to even pick appropriate SHAs, not
speaking of automatic portability. I may send manual backports for
a couple of the latest supported kernels later on if anyone needs it
at all.
[0] https://lore.kernel.org/all/20210525230912.GA175802@embeddedor
[1] https://lore.kernel.org/all/20210525231851.GA176647@embeddedor
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
virtchnl: fix fake 1-elem arrays for structures allocated as `nents`
virtchnl: fix fake 1-elem arrays in structures allocated as `nents + 1`
virtchnl: fix fake 1-elem arrays in structs allocated as `nents + 1` - 1
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Now that the "budget" is passed into fec_enet_tx_queue(), one
optimization we can do is to use napi_consume_skb() to instead
of dev_kfree_skb_any().
Signed-off-by: Wei Fang <[email protected]>
Suggested-by: Alexander H Duyck <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
./drivers/net/ethernet/sfc/tc_conntrack.c:464:2-3: Unneeded semicolon
Signed-off-by: Yang Li <[email protected]>
Acked-by: Martin Habets <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/sfc/tc.c
fa165e194997 ("sfc: don't unregister flow_indr if it was never registered")
3bf969e88ada ("sfc: add MAE table machinery for conntrack table")
https://lore.kernel.org/all/[email protected]/
No adjacent changes.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
On s390 systems (aka mainframes), it has classic channel devices for
networking and permanent storage that are currently even more common than
PCI devices. Hence it could have a fully functional s390 kernel with
CONFIG_PCI=n, then the relevant iomem mapping functions [including
ioremap(), devm_ioremap(), etc.] are not available.
Here let ALTERA_TSE depend on HAS_IOMEM so that it won't be built to cause
below compiling error if PCI is unset:
------
ERROR: modpost: "devm_ioremap" [drivers/net/ethernet/altera/altera_tse.ko] undefined!
------
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Baoquan He <[email protected]>
Cc: Joyce Ooi <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "mm: ioremap: Convert architectures to take GENERIC_IOREMAP
way", v8.
Motivation and implementation:
==============================
Currently, many architecutres have't taken the standard GENERIC_IOREMAP
way to implement ioremap_prot(), iounmap(), and ioremap_xx(), but make
these functions specifically under each arch's folder. Those cause many
duplicated code of ioremap() and iounmap().
In this patchset, firstly introduce generic_ioremap_prot() and
generic_iounmap() to extract the generic code for GENERIC_IOREMAP. By
taking GENERIC_IOREMAP method, the generic generic_ioremap_prot(),
generic_iounmap(), and their generic wrapper ioremap_prot(), ioremap() and
iounmap() are all visible and available to arch. Arch needs to provide
wrapper functions to override the generic version if there's arch specific
handling in its corresponding ioremap_prot(), ioremap() or iounmap().
With these changes, duplicated ioremap/iounmap() code uder ARCH-es are
removed, and the equivalent functioality is kept as before.
Background info:
================
1) The converting more architectures to take GENERIC_IOREMAP way is
suggested by Christoph in below discussion:
https://lore.kernel.org/all/[email protected]/T/#u
2) In the previous v1 to v3, it's basically further action after arm64
has converted to GENERIC_IOREMAP way in below patchset. It's done by
adding hook ioremap_allowed() and iounmap_allowed() in ARCH to add ARCH
specific handling the middle of ioremap_prot() and iounmap().
[PATCH v5 0/6] arm64: Cleanup ioremap() and support ioremap_prot()
https://lore.kernel.org/all/[email protected]/T/#u
Later, during v3 reviewing, Christophe Leroy suggested to introduce
generic_ioremap_prot() and generic_iounmap() to generic codes, and ARCH
can provide wrapper function ioremap_prot(), ioremap() or iounmap() if
needed. Christophe made a RFC patchset as below to specially demonstrate
his idea. This is what v4 and now v5 is doing.
[RFC PATCH 0/8] mm: ioremap: Convert architectures to take GENERIC_IOREMAP way
https://lore.kernel.org/all/[email protected]/T/#u
Testing:
========
In v8, I only applied this patchset onto the latest linus's tree to build
and run on arm64 and s390.
This patch (of 19):
Let's use '#define ioremap_xx' and "#ifdef ioremap_xx" instead.
To remove defined ARCH_HAS_IOREMAP_xx macros in <asm/io.h> of each ARCH,
the ARCH's own ioremap_wc|wt|np definition need be above "#include
<asm-generic/iomap.h>. Otherwise the redefinition error would be seen
during compiling. So the relevant adjustments are made to avoid compiling
error:
loongarch:
- doesn't include <asm-generic/iomap.h>, defining ARCH_HAS_IOREMAP_WC
is redundant, so simply remove it.
m68k:
- selected GENERIC_IOMAP, <asm-generic/iomap.h> has been added in
<asm-generic/io.h>, and <asm/kmap.h> is included above
<asm-generic/iomap.h>, so simply remove ARCH_HAS_IOREMAP_WT defining.
mips:
- move "#include <asm-generic/iomap.h>" below ioremap_wc definition
in <asm/io.h>
powerpc:
- remove "#include <asm-generic/iomap.h>" in <asm/io.h> because it's
duplicated with the one in <asm-generic/io.h>, let's rely on the
latter.
x86:
- selected GENERIC_IOMAP, remove #include <asm-generic/iomap.h> in
the middle of <asm/io.h>. Let's rely on <asm-generic/io.h>.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Baoquan He <[email protected]>
Acked-by: Geert Uytterhoeven <[email protected]>
Reviewed-by: Mike Rapoport (IBM) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: David Laight <[email protected]>
Cc: Helge Deller <[email protected]>
Cc: John Paul Adrian Glaubitz <[email protected]>
Cc: Kefeng Wang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Niklas Schnelle <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Brian Cain <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Chris Zankel <[email protected]>
Cc: Gerald Schaefer <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: Jonas Bonn <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Rich Felker <[email protected]>
Cc: Stefan Kristiansson <[email protected]>
Cc: Sven Schnelle <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Vineet Gupta <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yoshinori Sato <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The pds-vfio-pci series made a small interface change to
pds_client_register() and pds_client_unregister(), but forgot to update
the function header descriptions. Fix that.
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Fixes: b021d05e106e ("pds_core: Require callers of register/unregister to pass PF drvdata")
Signed-off-by: Shannon Nelson <[email protected]>
Signed-off-by: Brett Creeley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alex Williamson <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from ipsec and netfilter.
No known outstanding regressions.
Fixes to fixes:
- virtio-net: set queues after driver_ok, avoid a potential race
added by recent fix
- Revert "vlan: Fix VLAN 0 memory leak", it may lead to a warning
when VLAN 0 is registered explicitly
- nf_tables:
- fix false-positive lockdep splat in recent fixes
- don't fail inserts if duplicate has expired (fix test failures)
- fix races between garbage collection and netns dismantle
Current release - new code bugs:
- mlx5: Fix mlx5_cmd_update_root_ft() error flow
Previous releases - regressions:
- phy: fix IRQ-based wake-on-lan over hibernate / power off
Previous releases - always broken:
- sock: fix misuse of sk_under_memory_pressure() preventing system
from exiting global TCP memory pressure if a single cgroup is under
pressure
- fix the RTO timer retransmitting skb every 1ms if linear option is
enabled
- af_key: fix sadb_x_filter validation, amment netlink policy
- ipsec: fix slab-use-after-free in decode_session6()
- macb: in ZynqMP resume always configure PS GTR for non-wakeup
source
Misc:
- netfilter: set default timeout to 3 secs for sctp shutdown send and
recv state (from 300ms), align with protocol timers"
* tag 'net-6.5-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (49 commits)
ice: Block switchdev mode when ADQ is active and vice versa
qede: fix firmware halt over suspend and resume
net: do not allow gso_size to be set to GSO_BY_FRAGS
sock: Fix misuse of sk_under_memory_pressure()
sfc: don't fail probe if MAE/TC setup fails
sfc: don't unregister flow_indr if it was never registered
net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset
net/mlx5: Fix mlx5_cmd_update_root_ft() error flow
net/mlx5e: XDP, Fix fifo overrun on XDP_REDIRECT
i40e: fix misleading debug logs
iavf: fix FDIR rule fields masks validation
ipv6: fix indentation of a config attribute
mailmap: add entries for Simon Horman
broadcom: b44: Use b44_writephy() return value
net: openvswitch: reject negative ifindex
team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
net: phy: broadcom: stub c45 read/write for 54810
netfilter: nft_dynset: disallow object maps
netfilter: nf_tables: GC transaction race with netns dismantle
netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
...
|
|
Enable netconsole features to be set at compilation time. Create two
Kconfig options that allow users to set extended logs and release
prepending features at compilation time.
Right now, the user needs to pass command line parameters to netconsole,
such as "+"/"r" to enable extended logs and version prepending features.
With these two options, the user could set the default values for the
features at compile time, and don't need to pass it in the command line
to get them enabled, simplifying the command line.
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
De-duplicate the initialization and allocation code for struct
netconsole_target.
The same allocation and initialization code is duplicated in two
different places in the netconsole subsystem, when the netconsole target
is initialized by command line parameters (alloc_param_target()), and
dynamically by sysfs (make_netconsole_target()).
Create a helper function, and call it from the two different functions.
Suggested-by: Eric Dumazet <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When building with clang 18 I see the following warning:
| drivers/net/mdio/mdio-xgene.c:338:13: warning: cast to smaller integer
| type 'enum xgene_mdio_id' from 'const void *' [-Wvoid-pointer-to-enum-cast]
| 338 | mdio_id = (enum xgene_mdio_id)of_id->data;
This is due to the fact that `of_id->data` is a void* while `enum
xgene_mdio_id` has the size of an int. This leads to truncation and
possible data loss.
Link: https://github.com/ClangBuiltLinux/linux/issues/1910
Reported-by: Nathan Chancellor <[email protected]>
Signed-off-by: Justin Stitt <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
Link: https://lore.kernel.org/r/20230815-void-drivers-net-mdio-mdio-xgene-v1-1-5304342e0659@google.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
PCI core API pci_dev_id() can be used to get the BDF number for a pci
device. We don't need to compose it manually. Use pci_dev_id() to
simplify the code a little bit.
Signed-off-by: Jialin Zhang <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Reviewed-by: Shay Agroskin <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add missing __exit annotations to module exit func tun_cleanup().
Signed-off-by: Ziyang Xuan <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|