Age | Commit message (Collapse) | Author | Files | Lines |
|
There are seven GSI interrupt types that can be signaled by a single
GSI IRQ. These are represented in a bitmask, and the gsi_irq_type_id
enumerated type defines what each bit position represents.
Similarly, the global and general GSI interrupt types each has a set
of conditions it signals, and both types have an enumerated type
that defines which bit that represents each condition.
When used, these enumerated values are passed as an argument to BIT()
in *all* cases. So clean up the code a little bit by defining the
enumerated type values as one-bit masks rather than bit positions.
Rename gsi_general_id to be gsi_general_irq_id for consistency.
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When checking the validity of an IPA register ID, compare it against
all possible ipa_reg_id values.
Rename the function ipa_reg_id_valid() to be specific about what's
being checked.
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Soon IPA v5.0+ will be supported, and when that happens we will be
able to enable support for the SDX65 (IPA v5.0), SM8450 (IPA v5.1),
and SM8550 (IPA v5.5).
Fix the comment about the GSI version used for IPA v3.1.
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The reg_addr field in the IPA structure is set but never used.
Get rid of it.
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Starting at IPA v4.11, the GSI_GENERIC_COMMAND GSI register got a
new PARAMS field. The code that encodes a value into that field
sets it unconditionally, which is wrong.
We currently only provide 0 as the field's value, so this error has
no real effect. Still, it's a bug, so let's fix it.
Fix an (unrelated) incorrect comment as well. Fields in the
ERROR_LOG GSI register actually *are* defined for IPA versions
prior to v3.5.1.
Fixes: fe68c43ce388 ("net: ipa: support enhanced channel flow control")
Signed-off-by: Alex Elder <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add the following TC flower filter keys to lan966x for IS2:
- ipv4_addr (sip and dip)
- ipv6_addr (sip and dip)
- control (IPv4 fragments)
- portnum (tcp and udp port numbers)
- basic (L3 and L4 protocol)
- vlan (outer vlan tag info)
- tcp (tcp flags)
- ip (tos field)
As the parsing of these keys is similar between lan966x and sparx5, move
the code in a separate file to be shared by these 2 chips. And put the
specific parsing outside of the common functions.
Signed-off-by: Horatiu Vultur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit b3973bb40041 ("vmxnet3: set correct hash type based on
rss information") added hashType information into skb. However,
rssType field is populated for eop descriptor. This can lead
to incorrectly reporting of hashType for packets which use
multiple rx descriptors. Multiple rx descriptors are used
for Jumbo frame or LRO packets, which can hit this issue.
This patch moves the RSS codeblock under eop descritor.
Cc: [email protected]
Fixes: b3973bb40041 ("vmxnet3: set correct hash type based on rss information")
Signed-off-by: Ronak Doshi <[email protected]>
Acked-by: Peng Li <[email protected]>
Acked-by: Guolin Yang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Remove unused bulk clocks struct from the miic state and use an already
existing miic variable in miic_config().
Signed-off-by: Clément Léger <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add support for cable diagnostics in lan8841 PHY. It has the same
registers layout as lan8814 PHY, therefore reuse the functionality.
Signed-off-by: Horatiu Vultur <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
IPsec offloading callbacks may be called in atomic context, sleep is
not allowed in the implementation. Now use workqueue mechanism to
avoid this issue.
Extend existing workqueue mechanism for multicast configuration only
to universal use, so that all configuring through mailbox asynchronously
can utilize it.
Fixes: 859a497fe80c ("nfp: implement xfrm callbacks and expose ipsec offload feature to upper layer")
Signed-off-by: Yinjun Zhang <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The mailbox configuration mechanism requires writing several registers,
which shouldn't be interrupted, so need lock to avoid race condition.
The base offset of mailbox configuration registers is not fixed, it
depends on TLV caps read from application firmware.
Fixes: 859a497fe80c ("nfp: implement xfrm callbacks and expose ipsec offload feature to upper layer")
Signed-off-by: Yinjun Zhang <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Code blocks handling BCMA_CHIP_ID_BCM5357 and BCMA_CHIP_ID_BCM53572 were
incorrectly unified. Chip package values are not unique and cannot be
checked independently. They are meaningful only in a context of a given
chip.
Packages BCM5358 and BCM47188 share the same value but then belong to
different chips. Code unification resulted in treating BCM5358 as
BCM47188 and broke its initialization.
Link: https://github.com/openwrt/openwrt/issues/8278
Fixes: cb1b0f90acfe ("net: ethernet: bgmac: unify code of the same family")
Cc: Jon Mason <[email protected]>
Signed-off-by: Rafał Miłecki <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add IPsec offloading support for NFP3800. Include data
plane and control plane.
Data plane: add IPsec packet process flow in NFP3800
datapath (NFDk).
Control plane: add an algorithm support distinction flow
in xfrm hook function xdo_dev_state_add(), as NFP3800 has
a different set of IPsec algorithm support.
This matches existing support for the NFP6000/NFP4000 and
their NFD3 datapath.
In addition, fixup the md_bytes calculation for NFD3 datapath
to make sure the two datapahts are keept in sync.
Signed-off-by: Huanhuan Wang <[email protected]>
Reviewed-by: Niklas Söderlund <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
net/devlink/leftover.c / net/core/devlink.c:
565b4824c39f ("devlink: change port event netdev notifier from per-net to global")
f05bd8ebeb69 ("devlink: move code to a dedicated directory")
687125b5799c ("devlink: split out core code")
https://lore.kernel.org/all/[email protected]/
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Zero-length arrays are deprecated[1]. Replace struct i40e_lump_tracking's
"list" 0-length array with a flexible array. Detected with GCC 13,
using -fstrict-flex-arrays=3:
In function 'i40e_put_lump',
inlined from 'i40e_clear_interrupt_scheme' at drivers/net/ethernet/intel/i40e/i40e_main.c:5145:2:
drivers/net/ethernet/intel/i40e/i40e_main.c:278:27: warning: array subscript <unknown> is outside array bounds of 'u16[0]' {aka 'short unsigned int[]'} [-Warray-bounds=]
278 | pile->list[i] = 0;
| ~~~~~~~~~~^~~
drivers/net/ethernet/intel/i40e/i40e.h: In function 'i40e_clear_interrupt_scheme':
drivers/net/ethernet/intel/i40e/i40e.h:179:13: note: while referencing 'list'
179 | u16 list[0];
| ^~~~
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays
Cc: Jesse Brandeburg <[email protected]>
Cc: Tony Nguyen <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: "Gustavo A. R. Silva" <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Kees Cook <[email protected]>
Reviewed-by: Gustavo A. R. Silva <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In i40e_status removal patches, i40e_status conversion
to strings was removed in order to easily refactor
the code to use standard errornums. This however made it
more difficult for read error logs.
Use %pe formatter to print error messages in human-readable
format.
Signed-off-by: Jan Sokolowski <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
To prepare for removal of i40e_status, change the variables
from i40e_status to int. This eases the transition when values
are changed to return standard int error codes over enum i40e_status.
As such changes often also change variable orders, a cleanup
is also applied here to make variables conform to RCT and
some lines are also reformatted where applicable.
Signed-off-by: Jan Sokolowski <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
Remove the i40e_stat_str() function which prints the string
representation of the i40e_status error code. With upcoming changes
moving away from i40e_status, there will be no need for this function
Signed-off-by: Jan Sokolowski <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
In an effort to remove i40e status codes and replace them
with standard kernel errornums, unused values of i40e_status_code
were removed.
Signed-off-by: Jan Sokolowski <[email protected]>
Tested-by: Gurucharan G <[email protected]> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <[email protected]>
|
|
is used
While running this selftest which usually passes:
~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0
TEST: swp0: Unicast IPv4 to primary MAC address [ OK ]
TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ]
TEST: swp0: Multicast IPv4 to joined group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ]
TEST: swp0: Multicast IPv6 to joined group [ OK ]
TEST: swp0: Multicast IPv6 to unknown group [ OK ]
TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ]
if I start PTP timestamping then run it again (debug prints added by me),
the unknown IPv6 MC traffic is seen by the CPU port even when it should
have been dropped:
~/selftests/drivers/net/dsa# ptp4l -i swp0 -2 -P -m
ptp4l[225.410]: selected /dev/ptp1 as PTP clock
[ 225.445746] mscc_felix 0000:00:00.5: ocelot_l2_ptp_trap_add: port 0 adding L2 PTP trap
[ 225.453815] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_add: port 0 adding IPv4 PTP event trap
[ 225.462703] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_add: port 0 adding IPv4 PTP general trap
[ 225.471768] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_add: port 0 adding IPv6 PTP event trap
[ 225.480651] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_add: port 0 adding IPv6 PTP general trap
ptp4l[225.488]: port 1: INITIALIZING to LISTENING on INIT_COMPLETE
ptp4l[225.488]: port 0: INITIALIZING to LISTENING on INIT_COMPLETE
^C
~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0
TEST: swp0: Unicast IPv4 to primary MAC address [ OK ]
TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ]
TEST: swp0: Multicast IPv4 to joined group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ]
TEST: swp0: Multicast IPv6 to joined group [ OK ]
TEST: swp0: Multicast IPv6 to unknown group [FAIL]
reception succeeded, but should have failed
TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ]
The PGID_MCIPV6 is configured correctly to not flood to the CPU,
I checked that.
Furthermore, when I disable back PTP RX timestamping (ptp4l doesn't do
that when it exists), packets are RX filtered again as they should be:
~/selftests/drivers/net/dsa# hwstamp_ctl -i swp0 -r 0
[ 218.202854] mscc_felix 0000:00:00.5: ocelot_l2_ptp_trap_del: port 0 removing L2 PTP trap
[ 218.212656] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_del: port 0 removing IPv4 PTP event trap
[ 218.222975] mscc_felix 0000:00:00.5: ocelot_ipv4_ptp_trap_del: port 0 removing IPv4 PTP general trap
[ 218.233133] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_del: port 0 removing IPv6 PTP event trap
[ 218.242251] mscc_felix 0000:00:00.5: ocelot_ipv6_ptp_trap_del: port 0 removing IPv6 PTP general trap
current settings:
tx_type 1
rx_filter 12
new settings:
tx_type 1
rx_filter 0
~/selftests/drivers/net/dsa# ./local_termination.sh eno0 swp0
TEST: swp0: Unicast IPv4 to primary MAC address [ OK ]
TEST: swp0: Unicast IPv4 to macvlan MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, promisc [ OK ]
TEST: swp0: Unicast IPv4 to unknown MAC address, allmulti [ OK ]
TEST: swp0: Multicast IPv4 to joined group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv4 to unknown group, allmulti [ OK ]
TEST: swp0: Multicast IPv6 to joined group [ OK ]
TEST: swp0: Multicast IPv6 to unknown group [ OK ]
TEST: swp0: Multicast IPv6 to unknown group, promisc [ OK ]
TEST: swp0: Multicast IPv6 to unknown group, allmulti [ OK ]
So it's clear that something in the PTP RX trapping logic went wrong.
Looking a bit at the code, I can see that there are 4 typos, which
populate "ipv4" VCAP IS2 key filter fields for IPv6 keys.
VCAP IS2 keys of type OCELOT_VCAP_KEY_IPV4 and OCELOT_VCAP_KEY_IPV6 are
handled by is2_entry_set(). OCELOT_VCAP_KEY_IPV4 looks at
&filter->key.ipv4, and OCELOT_VCAP_KEY_IPV6 at &filter->key.ipv6.
Simply put, when we populate the wrong key field, &filter->key.ipv6
fields "proto.mask" and "proto.value" remain all zeroes (or "don't care").
So is2_entry_set() will enter the "else" of this "if" condition:
if (msk == 0xff && (val == IPPROTO_TCP || val == IPPROTO_UDP))
and proceed to ignore the "proto" field. The resulting rule will match
on all IPv6 traffic, trapping it to the CPU.
This is the reason why the local_termination.sh selftest sees it,
because control traps are stronger than the PGID_MCIPV6 used for
flooding (from the forwarding data path).
But the problem is in fact much deeper. We trap all IPv6 traffic to the
CPU, but if we're bridged, we set skb->offload_fwd_mark = 1, so software
forwarding will not take place and IPv6 traffic will never reach its
destination.
The fix is simple - correct the typos.
I was intentionally inaccurate in the commit message about the breakage
occurring when any PTP timestamping is enabled. In fact it only happens
when L4 timestamping is requested (HWTSTAMP_FILTER_PTP_V2_EVENT or
HWTSTAMP_FILTER_PTP_V2_L4_EVENT). But ptp4l requests a larger RX
timestamping filter than it needs for "-2": HWTSTAMP_FILTER_PTP_V2_EVENT.
I wanted people skimming through git logs to not think that the bug
doesn't affect them because they only use ptp4l in L2 mode.
Fixes: 96ca08c05838 ("net: mscc: ocelot: set up traps for PTP packets")
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
Saeed Mahameed says:
====================
mlx5-next-netdev-deadlock
This series from Jiri solves a deadlock when removing a network namespace
with mlx5 devlink instance being in it.
The deadlock is between:
1) mlx5_ib->unregister_netdevice_notifier()
AND
2) mlx5_core->devlink_reload->cleanup_net()
To slove this introduced mlx5 netdev added/removed events to track uplink
netdev to be used for register_netdevice_notifier_dev_net() purposes.
* tag 'mlx5-next-netdev-deadlock' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
RDMA/mlx5: Track netdev to avoid deadlock during netdev notifier unregister
net/mlx5e: Propagate an internal event in case uplink netdev changes
net/mlx5e: Fix trap event handling
net/mlx5: Introduce CQE error syndrome
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
./drivers/net/ethernet/wangxun/libwx/wx_lib.c:683:2-3: Unneeded semicolon
Reported-by: Abaci Robot <[email protected]>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3976
Signed-off-by: Yang Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
drivers/net/ethernet/wangxun/libwx/wx_lib.c:1835 wx_setup_all_rx_resources() warn: inconsistent indenting
Reported-by: Abaci Robot <[email protected]>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3981
Signed-off-by: Yang Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When removing a network namespace with mlx5 devlink instance being in
it, following callchain is performed:
cleanup_net (takes down_read(&pernet_ops_rwsem)
devlink_pernet_pre_exit()
devlink_reload()
mlx5_devlink_reload_down()
mlx5_unload_one_devl_locked()
mlx5_detach_device()
del_adev()
mlx5r_remove()
__mlx5_ib_remove()
mlx5_ib_roce_cleanup()
mlx5_remove_netdev_notifier()
unregister_netdevice_notifier (takes down_write(&pernet_ops_rwsem)
This deadlocks.
Resolve this by converting to register_netdevice_notifier_dev_net()
which does not take pernet_ops_rwsem and moves the notifier block around
according to netdev it takes as arg.
Use previously introduced netdev added/removed events to track uplink
netdev to be used for register_netdevice_notifier_dev_net() purposes.
Signed-off-by: Jiri Pirko <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Whenever uplink netdev is set/cleared, propagate newly introduced event
to inform notifier blocks netdev was added/removed.
Move the set() helper to core.c from header, introduce clear() and
netdev_added_event_replay() helpers. The last one is going to be called
from rdma driver, so export it.
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Current code does not return correct return value from event handler.
Fix it by returning NOTIFY_* and propagate err over newly introduce ctx
structure.
Signed-off-by: Jiri Pirko <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5 fixes 2023-02-07
This series provides bug fixes to mlx5 driver.
* tag 'mlx5-fixes-2023-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: Serialize module cleanup with reload and remove
net/mlx5: fw_tracer, Zero consumer index when reloading the tracer
net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffers
net/mlx5: Expose SF firmware pages counter
net/mlx5: Store page counters in a single array
net/mlx5e: IPoIB, Show unknown speed instead of error
net/mlx5e: Fix crash unsetting rx-vlan-filter in switchdev mode
net/mlx5: Bridge, fix ageing of peer FDB entries
net/mlx5: DR, Fix potential race in dr_rule_create_rule_nic
net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h.
This removes 1 of a few remaining Qualcomm-specific headers into a more
approciate subdirectory under include/.
Suggested-by: Bjorn Andersson <[email protected]>
Signed-off-by: Elliot Berman <[email protected]>
Reviewed-by: Guru Das Srinagesh <[email protected]>
Acked-by: Mukesh Ojha <[email protected]>
Signed-off-by: Bjorn Andersson <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-02-07
1) Minor and trivial code Cleanups
2) Minor fixes for net-next
3) From Shay: dynamic FW trace strings update.
* tag 'mlx5-updates-2023-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: fw_tracer, Add support for unrecognized string
net/mlx5: fw_tracer, Add support for strings DB update event
net/mlx5: fw_tracer, allow 0 size string DBs
net/mlx5: fw_tracer: Fix debug print
net/mlx5: fs, Remove redundant assignment of size
net/mlx5: fs_core, Remove redundant variable err
net/mlx5: Fix memory leak in error flow of port set buffer
net/mlx5e: Remove incorrect debugfs_create_dir NULL check in TLS
net/mlx5e: Remove incorrect debugfs_create_dir NULL check in hairpin
net/mlx5: fs, Remove redundant vport_number assignment
net/mlx5e: Remove redundant code for handling vlan actions
net/mlx5e: Don't listen to remove flows event
net/mlx5: fw reset: Skip device ID check if PCI link up failed
net/mlx5: Remove redundant health work lock
mlx5: reduce stack usage in mlx5_setup_tc
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move xdp_features configuration from efx_pci_probe() to
efx_pci_probe_post_io() since it is where all the other basic netdev
features are initialised.
Signed-off-by: Lorenzo Bianconi <[email protected]>
Link: https://lore.kernel.org/r/9bd31c9a29bcf406ab90a249a28fc328e5578fd1.1675875404.git.lorenzo@kernel.org
Signed-off-by: Martin KaFai Lau <[email protected]>
|
|
Current taprio software implementation is haunted by the shadow of the
igb/igc hardware model. It iterates over child qdiscs in increasing
order of TXQ index, therefore giving higher xmit priority to TXQ 0 and
lower to TXQ N. According to discussions with Vinicius, that is the
default (perhaps even unchangeable) prioritization scheme used for the
NICs that taprio was first written for (igb, igc), and we have a case of
two bugs canceling out, resulting in a functional setup on igb/igc, but
a less sane one on other NICs.
To the best of my understanding, taprio should prioritize based on the
traffic class, so it should really dequeue starting with the highest
traffic class and going down from there. We get to the TXQ using the
tc_to_txq[] netdev property.
TXQs within the same TC have the same (strict) priority, so we should
pick from them as fairly as we can. We can achieve that by implementing
something very similar to q->curband from multiq_dequeue().
Since igb/igc really do have TXQ 0 of higher hardware priority than
TXQ 1 etc, we need to preserve the behavior for them as well. We really
have no choice, because in txtime-assist mode, taprio is essentially a
software scheduler towards offloaded child tc-etf qdiscs, so the TXQ
selection really does matter (not all igb TXQs support ETF/SO_TXTIME,
says Kurt Kanzenbach).
To preserve the behavior, we need a capability bit so that taprio can
determine if it's running on igb/igc, or on something else. Because igb
doesn't offload taprio at all, we can't piggyback on the
qdisc_offload_query_caps() call from taprio_enable_offload(), but
instead we need a separate call which is also made for software
scheduling.
Introduce two static keys to minimize the performance penalty on systems
which only have igb/igc NICs, and on systems which only have other NICs.
For mixed systems, taprio will have to dynamically check whether to
dequeue using one prioritization algorithm or using the other.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The LAN8841 is completely integrated triple-speed (10BASE-T/ 100BASE-TX/
1000BASE-T) Ethernet physical layer transceivers for transmission and
reception of data on standard CAT-5, as well as CAT-5e and CAT-6,
unshielded twisted pair (UTP) cables.
The LAN8841 offers the industry-standard GMII/MII as well as the RGMII.
Some of the features of the PHY are:
- Wake on LAN
- Auto-MDIX
- IEEE 1588-2008 (V2)
- LinkMD Capable diagnosis
Currently the patch offers support only for link configuration.
Signed-off-by: Horatiu Vultur <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add flower filter packet statistics. This will just read the TCAM
counter of the rule, which mention how many packages were hit by this
rule.
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: Horatiu Vultur <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Arınç reports that on his MT7621AT Unielec U7621-06 board and MT7623NI
Bananapi BPI-R2, packets received by the CPU over mt7530 switch port 0
(of which this driver acts as the DSA master) are not processed
correctly by software. More precisely, they arrive without a DSA tag
(in packet or in the hwaccel area - skb_metadata_dst()), so DSA cannot
demux them towards the switch's interface for port 0. Traffic from other
ports receives a skb_metadata_dst() with the correct port and is demuxed
properly.
Looking at mtk_poll_rx(), it becomes apparent that this driver uses the
skb vlan hwaccel area:
union {
u32 vlan_all;
struct {
__be16 vlan_proto;
__u16 vlan_tci;
};
};
as a temporary storage for the VLAN hwaccel tag, or the DSA hwaccel tag.
If this is a DSA master it's a DSA hwaccel tag, and finally clears up
the skb VLAN hwaccel header.
I'm guessing that the problem is the (mis)use of API.
skb_vlan_tag_present() looks like this:
#define skb_vlan_tag_present(__skb) (!!(__skb)->vlan_all)
So if both vlan_proto and vlan_tci are zeroes, skb_vlan_tag_present()
returns precisely false. I don't know for sure what is the format of the
DSA hwaccel tag, but I surely know that lowermost 3 bits of vlan_proto
are 0 when receiving from port 0:
unsigned int port = vlan_proto & GENMASK(2, 0);
If the RX descriptor has no other bits set to non-zero values in
RX_DMA_VTAG, then the call to __vlan_hwaccel_put_tag() will not, in
fact, make the subsequent skb_vlan_tag_present() return true, because
it's implemented like this:
static inline void __vlan_hwaccel_put_tag(struct sk_buff *skb,
__be16 vlan_proto, u16 vlan_tci)
{
skb->vlan_proto = vlan_proto;
skb->vlan_tci = vlan_tci;
}
What we need to do to fix this problem (assuming this is the problem) is
to stop using skb->vlan_all as temporary storage for driver affairs, and
just create some local variables that serve the same purpose, but
hopefully better. Instead of calling skb_vlan_tag_present(), let's look
at a boolean has_hwaccel_tag which we set to true when the RX DMA
descriptors have something. Disambiguate based on netdev_uses_dsa()
whether this is a VLAN or DSA hwaccel tag, and only call
__vlan_hwaccel_put_tag() if we're certain it's a VLAN tag.
Arınç confirms that the treatment works, so this validates the
assumption.
Link: https://lore.kernel.org/netdev/[email protected]/
Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
Reported-by: Arınç ÜNAL <[email protected]>
Tested-by: Arınç ÜNAL <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Felix Fietkau <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Unsupported port speed can be set and cause error. Now fixing it
and return an error if setting unsupported speed.
This fix depends on the following, which was included in v6.2-rc1:
commit a61474c41e8c ("nfp: ethtool: support reporting link modes").
Fixes: 7c698737270f ("nfp: add support for .set_link_ksettings()")
Signed-off-by: Yu Xiao <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Parameters 'queue_index' and 'napi_id' are passed in a swapped order.
Fix it here.
Fixes: 23233e577ef9 ("net: ethernet: mtk_eth_soc: rely on page_pool for single page buffers")
Signed-off-by: Tariq Toukan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The special tag is only enabled when the first MAC uses DSA. However, it
must be enabled when any MAC uses DSA. Change the check accordingly.
This fixes hardware DSA untagging not working on the second MAC of the
MT7621 and MT7623 SoCs, and likely other SoCs too. Therefore, remove the
check that disables hardware DSA untagging for the second MAC of the MT7621
and MT7623 SoCs.
Fixes: a1f47752fd62 ("net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC")
Co-developed-by: Richard van Schagen <[email protected]>
Signed-off-by: Richard van Schagen <[email protected]>
Signed-off-by: Arınç ÜNAL <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-02-06 (ice)
This series contains updates to ice driver only.
Ani removes WQ_MEM_RECLAIM flag from workqueue to resolve
check_flush_dependency warning.
Michal fixes KASAN out-of-bounds warning.
Brett corrects behaviour for port VLAN Rx filters to prevent receiving
of unintended traffic.
Dan Carpenter fixes possible off by one issue.
Zhang Changzhong adjusts error path for switch recipe to prevent memory
leak.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: switch: fix potential memleak in ice_add_adv_recipe()
ice: Fix off by one in ice_tc_forward_to_queue()
ice: Fix disabling Rx VLAN filtering with port VLAN enabled
ice: fix out-of-bounds KASAN warning in virtchnl
ice: Do not use WQ_MEM_RECLAIM flag for workqueue
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
On some platforms, 100/1000/2500 speeds seem to have sometimes problems
reporting false positive tx unit hang during stressful UDP traffic. Likely
other Intel drivers introduce responses to a tx hang. Update the 'tx hang'
comparator with the comparison of the head and tail of ring pointers and
restore the tx_timeout_factor to the previous value (one).
This can be test by using netperf or iperf3 applications.
Example:
iperf3 -s -p 5001
iperf3 -c 192.168.0.2 --udp -p 5001 --time 600 -b 0
netserver -p 16604
netperf -H 192.168.0.2 -l 600 -p 16604 -t UDP_STREAM -- -m 64000
Fixes: b27b8dc77b5e ("igc: Increase timeout value for Speed 100/1000/2500")
Signed-off-by: Sasha Neftin <[email protected]>
Tested-by: Naama Meir <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
ice: various virtualization cleanups
Jacob Keller says:
This series contains a variety of refactors and cleanups in the VF code for
the ice driver. Its primary focus is cleanup and simplification of the VF
operations and addition of a few new operations that will be required by
Scalable IOV, as well as some other refactors needed for the handling of VF
subfunctions.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: remove unnecessary virtchnl_ether_addr struct use
ice: introduce .irq_close VF operation
ice: introduce clear_reset_state operation
ice: convert vf_ops .vsi_rebuild to .create_vsi
ice: introduce ice_vf_init_host_cfg function
ice: add a function to initialize vf entry
ice: Pull common tasks into ice_vf_post_vsi_rebuild
ice: move ice_vf_vsi_release into ice_vf_lib.c
ice: move vsi_type assignment from ice_vsi_alloc to ice_vsi_cfg
ice: refactor VSI setup to use parameter structure
ice: drop unnecessary VF parameter from several VSI functions
ice: fix function comment referring to ice_vsi_alloc
ice: Add more usage of existing function ice_get_vf_vsi(vf)
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
After calling irq_set_affinity_and_hint(), the cpumask pointer is
saved in desc->affinity_hint, and will be used later when reading
/proc/irq/<num>/affinity_hint. So the cpumask variable needs to be
persistent. Otherwise, we are accessing freed memory when reading
the affinity_hint file.
Also, need to clear affinity_hint before free_irq(), otherwise there
is a one-time warning and stack trace during module unloading:
[ 243.948687] WARNING: CPU: 10 PID: 1589 at kernel/irq/manage.c:1913 free_irq+0x318/0x360
...
[ 243.948753] Call Trace:
[ 243.948754] <TASK>
[ 243.948760] mana_gd_remove_irqs+0x78/0xc0 [mana]
[ 243.948767] mana_gd_remove+0x3e/0x80 [mana]
[ 243.948773] pci_device_remove+0x3d/0xb0
[ 243.948778] device_remove+0x46/0x70
[ 243.948782] device_release_driver_internal+0x1fe/0x280
[ 243.948785] driver_detach+0x4e/0xa0
[ 243.948787] bus_remove_driver+0x70/0xf0
[ 243.948789] driver_unregister+0x35/0x60
[ 243.948792] pci_unregister_driver+0x44/0x90
[ 243.948794] mana_driver_exit+0x14/0x3fe [mana]
[ 243.948800] __do_sys_delete_module.constprop.0+0x185/0x2f0
To fix the bug, use the persistent mask, cpumask_of(cpu#), and set
affinity_hint to NULL before freeing the IRQ, as required by free_irq().
Cc: [email protected]
Fixes: 71fa6887eeca ("net: mana: Assign interrupts to CPUs based on NUMA nodes")
Signed-off-by: Haiyang Zhang <[email protected]>
Reviewed-by: Michael Kelley <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Memory allocations in the network transmit path must use GFP_ATOMIC
so they won't sleep.
Reported-by: Paolo Abeni <[email protected]>
Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver")
Cc: [email protected]
Signed-off-by: Michael Kelley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move the nfp_net_get_port_mac_by_hwinfo() check to ahead in the
get/set_eeprom() functions to in order to check for a VF netdev, which
this function does not support.
It is debatable if this is a fix or an enhancement, and we have chosen
to go for the latter. It does address a problem introduced by
commit 74b4f1739d4e ("nfp: flower: change get/set_eeprom logic and enable for flower reps").
However, the ethtool->len == 0 check avoids the problem manifesting as a
run-time bug (NULL pointer dereference of app).
Signed-off-by: James Hershaw <[email protected]>
Reviewed-by: Louis Peens <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Reviewed-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Recent changes made it possible to register the devlink instance before
its sub-objects and under the instance lock. Among other things, it
allows us to avoid warnings such as this one [1]. The warning is
generated because a buggy firmware is generating a health event during
driver initialization, before the devlink instance is registered.
Move the registration of the devlink instance to the beginning of the
initialization flow to avoid such problems.
A similar change was implemented in netdevsim in commit 82a3aef2e6af
("netdevsim: move devlink registration under the instance lock").
[1]
WARNING: CPU: 3 PID: 49 at net/devlink/leftover.c:7509 devlink_recover_notify.constprop.0+0xaf/0xc0
[...]
Call Trace:
<TASK>
devlink_health_report+0x45/0x1d0
mlxsw_core_health_event_work+0x24/0x30 [mlxsw_core]
process_one_work+0x1db/0x390
worker_thread+0x49/0x3b0
kthread+0xe5/0x110
ret_from_fork+0x1f/0x30
</TASK>
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Cited commit added 'DEVLINK_CMD_PARAM_DEL' notifications whenever the
network namespace of the devlink instance is changed. Specifically, the
notifications are generated after calling reload_down(), but before
calling reload_up(). At this stage, the data structures accessed while
reading the value of the "acl_region_rehash_interval" devlink parameter
are uninitialized, resulting in a use-after-free [1].
Fix by moving the registration and unregistration of the devlink
parameter to the TCAM code where it is actually used. This means that
the parameter is unregistered during reload_down() and then
re-registered during reload_up(), avoiding the use-after-free between
these two operations.
Reproducer:
# ip netns add test123
# devlink dev reload pci/0000:06:00.0 netns test123
[1]
BUG: KASAN: use-after-free in mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0
Read of size 4 at addr ffff888162fd37d8 by task devlink/1323
[...]
Call Trace:
<TASK>
dump_stack_lvl+0x95/0xbd
print_report+0x181/0x4a1
kasan_report+0xdb/0x200
mlxsw_sp_acl_tcam_vregion_rehash_intrvl_get+0xb2/0xd0
mlxsw_sp_params_acl_region_rehash_intrvl_get+0x32/0x80
devlink_nl_param_fill.constprop.0+0x29a/0x11e0
devlink_param_notify.constprop.0+0xb9/0x250
devlink_notify_unregister+0xbc/0x470
devlink_reload+0x1aa/0x440
devlink_nl_cmd_reload+0x559/0x11b0
genl_family_rcv_msg_doit.isra.0+0x1f8/0x2e0
genl_rcv_msg+0x558/0x7f0
netlink_rcv_skb+0x170/0x440
genl_rcv+0x2d/0x40
netlink_unicast+0x53f/0x810
netlink_sendmsg+0x961/0xe80
__sys_sendto+0x2a4/0x420
__x64_sys_sendto+0xe5/0x1c0
do_syscall_64+0x38/0x80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 7d7e9169a3ec ("devlink: move devlink reload notifications back in between _down() and _up() calls")
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move the initialization and de-initialization code further below in
order to avoid forward declarations in the next patch. No functional
changes.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move mutex_destroy() to the end to make the function symmetric with
mlxsw_sp_acl_tcam_init(). No functional changes.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Pair mutex_init() with a mutex_destroy() in the error path. Found during
code review. No functional changes.
Signed-off-by: Ido Schimmel <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The "acl_region_rehash_interval" devlink parameter is a "runtime"
parameter, making the call to devl_param_driverinit_value_set()
pointless. Before cited commit the function simply returned an error
(that was not checked), but now it emits a WARNING [1].
Fix by removing the function call.
[1]
WARNING: CPU: 0 PID: 7 at net/devlink/leftover.c:10974
devl_param_driverinit_value_set+0x8c/0x90
[...]
Call Trace:
<TASK>
mlxsw_sp2_params_register+0x83/0xb0 [mlxsw_spectrum]
__mlxsw_core_bus_device_register+0x5e5/0x990 [mlxsw_core]
mlxsw_core_bus_device_register+0x42/0x60 [mlxsw_core]
mlxsw_pci_probe+0x1f0/0x230 [mlxsw_pci]
local_pci_probe+0x1a/0x40
work_for_cpu_fn+0xf/0x20
process_one_work+0x1db/0x390
worker_thread+0x1d5/0x3b0
kthread+0xe5/0x110
ret_from_fork+0x1f/0x30
</TASK>
Fixes: 85fe0b324c83 ("devlink: make devlink_param_driverinit_value_set() return void")
Signed-off-by: Danielle Ratson <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Reviewed-by: Jacob Keller <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add PF driver support for the following:
- Viewing the standardized MAC Merge layer counters.
- Viewing the standardized Ethernet MAC and RMON counters associated
with the pMAC.
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|