Age | Commit message (Collapse) | Author | Files | Lines |
|
Add the logic to compare net_device returned by ip_dev_find()
with the net_device list in cdev->ports[] array and return
net_device if matched else NULL.
Fixes: 6abde0b24122 ("crypto/chtls: IPv6 support for inline TLS")
Signed-off-by: Venkatesh Ellapu <[email protected]>
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Netdev is filled in egress_dev when connection is established,
If connection is closed before establishment, then egress_dev
is NULL, Fix it using ip_dev_find() rather then extracting from
egress_dev.
Fixes: 6abde0b24122 ("crypto/chtls: IPv6 support for inline TLS")
Signed-off-by: Venkatesh Ellapu <[email protected]>
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Check if netdevice is a vlan interface and find real vlan netdevice.
Fixes: cc35c88ae4db ("crypto : chtls - CPL handler definition")
Signed-off-by: Venkatesh Ellapu <[email protected]>
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In chtls_sendpage() socket lock is released but not acquired,
fix it by taking lock.
Fixes: 36bedb3f2e5b ("crypto: chtls - Inline TLS record Tx")
Signed-off-by: Vinay Kumar Yadav <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Since commit bbc4d71d63549bc ("net: phy: realtek: fix rtl8211e rx/tx
delay config"), the Realtek PHY driver will override any TX/RX delay
set by hardware straps if the phy-mode device property does not match.
This is causing problems on SynQuacer based platforms (the only SoC
that incorporates the netsec hardware), since many were built with
this Realtek PHY, and shipped with firmware that defines the phy-mode
as 'rgmii', even though the PHY is configured for TX and RX delay using
pull-ups.
From the driver's perspective, we should not make any assumptions in
the general case that the PHY hardware does not require any initial
configuration. However, the situation is slightly different for ACPI
boot, since it implies rich firmware with AML abstractions to handle
hardware details that are not exposed to the OS. So in the ACPI case,
it is reasonable to assume that the PHY comes up in the right mode,
regardless of whether the mode is set by straps, by boot time firmware
or by AML executed by the ACPI interpreter.
So let's ignore the 'phy-mode' device property when probing the netsec
driver in ACPI mode, and hardcode the mode to PHY_INTERFACE_MODE_NA,
which should work with any PHY provided that it is configured by the
time the driver attaches to it. While at it, document that omitting
the mode is permitted for DT probing as well, by setting the phy-mode
DT property to the empty string.
Fixes: 533dd11a12f6 ("net: socionext: Add Synquacer NetSec driver")
Signed-off-by: Ard Biesheuvel <[email protected]>
Reviewed-by: Ilias Apalodimas <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
A break is not needed if it is preceded by a return or goto
Signed-off-by: Tom Rix <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fixes gcc warning:
passing argument 1 of 'kfree' makes pointer from integer without a cast
Fixes: 3af5f0f5c74e ("net: korina: fix kfree of rx/tx descriptor array")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Valentin Vidic <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
For several network drivers it was reported that using
__napi_schedule_irqoff() is unsafe with forced threading. One way to
fix this is switching back to __napi_schedule, but then we lose the
benefit of the irqoff version in general. As stated by Eric it doesn't
make sense to make the minimal hard irq handlers in drivers using NAPI
a thread. Therefore ensure that the hard irq handler is never
thread-ified.
Fixes: 9a899a35b0d6 ("r8169: switch to napi_schedule_irqoff")
Link: https://lkml.org/lkml/2020/10/18/19
Signed-off-by: Heiner Kallweit <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Ian reports that after upgrade from v5.8.14 to v5.9 only one
of his 4 ixgbe netdevs appear in the system.
Quoting the comment on ixgbe_x550em_a_has_mii():
* Returns true if hw points to lowest numbered PCI B:D.F x550_em_a device in
* the SoC. There are up to 4 MACs sharing a single MDIO bus on the x550em_a,
* but we only want to register one MDIO bus.
This matches the symptoms, since the return value from
ixgbe_mii_bus_init() is no longer ignored we need to handle
the higher ports of x550em without an error.
Fixes: 09ef193fef7e ("net: ethernet: ixgbe: check the return value of ixgbe_mii_bus_init()")
Reported-by: Ian Kumlien <[email protected]>
Tested-by: Ian Kumlien <[email protected]>
Acked-by: Jesse Brandeburg <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Pull rdma updates from Jason Gunthorpe:
"A usual cycle for RDMA with a typical mix of driver and core subsystem
updates:
- Driver minor changes and bug fixes for mlx5, efa, rxe, vmw_pvrdma,
hns, usnic, qib, qedr, cxgb4, hns, bnxt_re
- Various rtrs fixes and updates
- Bug fix for mlx4 CM emulation for virtualization scenarios where
MRA wasn't working right
- Use tracepoints instead of pr_debug in the CM code
- Scrub the locking in ucma and cma to close more syzkaller bugs
- Use tasklet_setup in the subsystem
- Revert the idea that 'destroy' operations are not allowed to fail
at the driver level. This proved unworkable from a HW perspective.
- Revise how the umem API works so drivers make fewer mistakes using
it
- XRC support for qedr
- Convert uverbs objects RWQ and MW to new the allocation scheme
- Large queue entry sizes for hns
- Use hmm_range_fault() for mlx5 On Demand Paging
- uverbs APIs to inspect the GID table instead of sysfs
- Move some of the RDMA code for building large page SGLs into
lib/scatterlist"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (191 commits)
RDMA/ucma: Fix use after free in destroy id flow
RDMA/rxe: Handle skb_clone() failure in rxe_recv.c
RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI
RDMA: Explicitly pass in the dma_device to ib_register_device
lib/scatterlist: Do not limit max_segment to PAGE_ALIGNED values
IB/mlx4: Convert rej_tmout radix-tree to XArray
RDMA/rxe: Fix bug rejecting all multicast packets
RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()
RDMA/rxe: Remove duplicate entries in struct rxe_mr
IB/hfi,rdmavt,qib,opa_vnic: Update MAINTAINERS
IB/rdmavt: Fix sizeof mismatch
MAINTAINERS: CISCO VIC LOW LATENCY NIC DRIVER
RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
RDMA/ipoib: Set rtnl_link_ops for ipoib interfaces
RDMA/uverbs: Expose the new GID query API to user space
...
|
|
The new HW arbitration feature on Aspeed ast2600 will cause MAC TX to
hang when handling scatter-gather DMA. Disable the problematic feature
by setting MAC register 0x58 bit28 and bit27.
Fixes: 39bfab8844a0 ("net: ftgmac100: Add support for DT phy-handle property")
Signed-off-by: Dylan Hung <[email protected]>
Reviewed-by: Joel Stanley <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
From Maor Gottlieb says:
====================
This series extends __sg_alloc_table_from_pages to allow chaining of new
pages to an already initialized SG table.
This allows for drivers to utilize the optimization of merging contiguous
pages without a need to pre allocate all the pages and hold them in a very
large temporary buffer prior to the call to SG table initialization.
The last patch changes the Infiniband core to use the new API. It removes
duplicate functionality from the code and benefits from the optimization
of allocating dynamic SG table from pages.
In huge pages system of 2MB page size, without this change, the SG table
would contain x512 SG entries.
====================
* branch 'dynamic_sg':
RDMA/umem: Move to allocate SG table from pages
lib/scatterlist: Add support in dynamic allocation of SG table from pages
tools/testing/scatterlist: Show errors in human readable form
tools/testing/scatterlist: Rejuvenate bit-rotten test
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
- Add redirect_neigh() BPF packet redirect helper, allowing to limit
stack traversal in common container configs and improving TCP
back-pressure.
Daniel reports ~10Gbps => ~15Gbps single stream TCP performance gain.
- Expand netlink policy support and improve policy export to user
space. (Ge)netlink core performs request validation according to
declared policies. Expand the expressiveness of those policies
(min/max length and bitmasks). Allow dumping policies for particular
commands. This is used for feature discovery by user space (instead
of kernel version parsing or trial and error).
- Support IGMPv3/MLDv2 multicast listener discovery protocols in
bridge.
- Allow more than 255 IPv4 multicast interfaces.
- Add support for Type of Service (ToS) reflection in SYN/SYN-ACK
packets of TCPv6.
- In Multi-patch TCP (MPTCP) support concurrent transmission of data on
multiple subflows in a load balancing scenario. Enhance advertising
addresses via the RM_ADDR/ADD_ADDR options.
- Support SMC-Dv2 version of SMC, which enables multi-subnet
deployments.
- Allow more calls to same peer in RxRPC.
- Support two new Controller Area Network (CAN) protocols - CAN-FD and
ISO 15765-2:2016.
- Add xfrm/IPsec compat layer, solving the 32bit user space on 64bit
kernel problem.
- Add TC actions for implementing MPLS L2 VPNs.
- Improve nexthop code - e.g. handle various corner cases when nexthop
objects are removed from groups better, skip unnecessary
notifications and make it easier to offload nexthops into HW by
converting to a blocking notifier.
- Support adding and consuming TCP header options by BPF programs,
opening the doors for easy experimental and deployment-specific TCP
option use.
- Reorganize TCP congestion control (CC) initialization to simplify
life of TCP CC implemented in BPF.
- Add support for shipping BPF programs with the kernel and loading
them early on boot via the User Mode Driver mechanism, hence reusing
all the user space infra we have.
- Support sleepable BPF programs, initially targeting LSM and tracing.
- Add bpf_d_path() helper for returning full path for given 'struct
path'.
- Make bpf_tail_call compatible with bpf-to-bpf calls.
- Allow BPF programs to call map_update_elem on sockmaps.
- Add BPF Type Format (BTF) support for type and enum discovery, as
well as support for using BTF within the kernel itself (current use
is for pretty printing structures).
- Support listing and getting information about bpf_links via the bpf
syscall.
- Enhance kernel interfaces around NIC firmware update. Allow
specifying overwrite mask to control if settings etc. are reset
during update; report expected max time operation may take to users;
support firmware activation without machine reboot incl. limits of
how much impact reset may have (e.g. dropping link or not).
- Extend ethtool configuration interface to report IEEE-standard
counters, to limit the need for per-vendor logic in user space.
- Adopt or extend devlink use for debug, monitoring, fw update in many
drivers (dsa loop, ice, ionic, sja1105, qed, mlxsw, mv88e6xxx,
dpaa2-eth).
- In mlxsw expose critical and emergency SFP module temperature alarms.
Refactor port buffer handling to make the defaults more suitable and
support setting these values explicitly via the DCBNL interface.
- Add XDP support for Intel's igb driver.
- Support offloading TC flower classification and filtering rules to
mscc_ocelot switches.
- Add PTP support for Marvell Octeontx2 and PP2.2 hardware, as well as
fixed interval period pulse generator and one-step timestamping in
dpaa-eth.
- Add support for various auth offloads in WiFi APs, e.g. SAE (WPA3)
offload.
- Add Lynx PHY/PCS MDIO module, and convert various drivers which have
this HW to use it. Convert mvpp2 to split PCS.
- Support Marvell Prestera 98DX3255 24-port switch ASICs, as well as
7-port Mediatek MT7531 IP.
- Add initial support for QCA6390 and IPQ6018 in ath11k WiFi driver,
and wcn3680 support in wcn36xx.
- Improve performance for packets which don't require much offloads on
recent Mellanox NICs by 20% by making multiple packets share a
descriptor entry.
- Move chelsio inline crypto drivers (for TLS and IPsec) from the
crypto subtree to drivers/net. Move MDIO drivers out of the phy
directory.
- Clean up a lot of W=1 warnings, reportedly the actively developed
subsections of networking drivers should now build W=1 warning free.
- Make sure drivers don't use in_interrupt() to dynamically adapt their
code. Convert tasklets to use new tasklet_setup API (sadly this
conversion is not yet complete).
* tag 'net-next-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2583 commits)
Revert "bpfilter: Fix build error with CONFIG_BPFILTER_UMH"
net, sockmap: Don't call bpf_prog_put() on NULL pointer
bpf, selftest: Fix flaky tcp_hdr_options test when adding addr to lo
bpf, sockmap: Add locking annotations to iterator
netfilter: nftables: allow re-computing sctp CRC-32C in 'payload' statements
net: fix pos incrementment in ipv6_route_seq_next
net/smc: fix invalid return code in smcd_new_buf_create()
net/smc: fix valid DMBE buffer sizes
net/smc: fix use-after-free of delayed events
bpfilter: Fix build error with CONFIG_BPFILTER_UMH
cxgb4/ch_ipsec: Replace the module name to ch_ipsec from chcr
net: sched: Fix suspicious RCU usage while accessing tcf_tunnel_info
bpf: Fix register equivalence tracking.
rxrpc: Fix loss of final ack on shutdown
rxrpc: Fix bundle counting for exclusive connections
netfilter: restore NF_INET_NUMHOOKS
ibmveth: Identify ingress large send packets.
ibmveth: Switch order of ibmveth_helper calls.
cxgb4: handle 4-tuple PEDIT to NAT mode translation
selftests: Add VRF route leaking tests
...
|
|
Pull dma-mapping updates from Christoph Hellwig:
- rework the non-coherent DMA allocator
- move private definitions out of <linux/dma-mapping.h>
- lower CMA_ALIGNMENT (Paul Cercueil)
- remove the omap1 dma address translation in favor of the common code
- make dma-direct aware of multiple dma offset ranges (Jim Quinlan)
- support per-node DMA CMA areas (Barry Song)
- increase the default seg boundary limit (Nicolin Chen)
- misc fixes (Robin Murphy, Thomas Tai, Xu Wang)
- various cleanups
* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
ARM/ixp4xx: add a missing include of dma-map-ops.h
dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
dma-direct: factor out a dma_direct_alloc_from_pool helper
dma-direct check for highmem pages in dma_direct_alloc_pages
dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
dma-mapping: move dma-debug.h to kernel/dma/
dma-mapping: remove <asm/dma-contiguous.h>
dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
dma-contiguous: remove dma_contiguous_set_default
dma-contiguous: remove dev_set_cma_area
dma-contiguous: remove dma_declare_contiguous
dma-mapping: split <linux/dma-mapping.h>
cma: decrease CMA_ALIGNMENT lower limit to 2
firewire-ohci: use dma_alloc_pages
dma-iommu: implement ->alloc_noncoherent
dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
dma-mapping: add a new dma_alloc_pages API
dma-mapping: remove dma_cache_sync
53c700: convert to dma_alloc_noncoherent
...
|
|
Minor conflicts in net/mptcp/protocol.h and
tools/testing/selftests/net/Makefile.
In both cases code was added on both sides in the same place
so just keep both.
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This patch changes the module name to "ch_ipsec" and prepends
"ch_ipsec" string instead of "chcr" in all debug messages and
function names.
V1->V2:
-Removed inline keyword from functions.
-Removed CH_IPSEC prefix from pr_debug.
-Used proper indentation for the continuation line of the function
arguments.
V2->V3:
Fix the checkpatch.pl warnings.
Fixes: 1b77be463929 ("crypto/chcr: Moving chelsio's inline ipsec functionality to /drivers/net")
Signed-off-by: Ayush Sawal <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Ingress large send packets are identified by either:
The IBMVETH_RXQ_LRG_PKT flag in the receive buffer
or with a -1 placed in the ip header checksum.
The method used depends on firmware version. Frame
geometry and sufficient header validation is performed by the
hypervisor eliminating the need for further header checks here.
Fixes: 7b5967389f5a ("ibmveth: set correct gso_size and gso_type")
Signed-off-by: David Wilder <[email protected]>
Reviewed-by: Thomas Falcon <[email protected]>
Reviewed-by: Cristobal Forno <[email protected]>
Reviewed-by: Pradeep Satyanarayana <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
ibmveth_rx_csum_helper() must be called after ibmveth_rx_mss_helper()
as ibmveth_rx_csum_helper() may alter ip and tcp checksum values.
Fixes: 66aa0678efc2 ("ibmveth: Support to enable LSO/CSO for Trunk VEA.")
Signed-off-by: David Wilder <[email protected]>
Reviewed-by: Thomas Falcon <[email protected]>
Reviewed-by: Cristobal Forno <[email protected]>
Reviewed-by: Pradeep Satyanarayana <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The 4-tuple NAT offload via PEDIT always overwrites all the 4-tuple
fields even if they had not been explicitly enabled. If any fields in
the 4-tuple are not enabled, then the hardware overwrites the
disabled fields with zeros, instead of ignoring them.
So, add a parser that can translate the enabled 4-tuple PEDIT fields
to one of the NAT mode combinations supported by the hardware and
hence avoid overwriting disabled fields to 0. Any rule with
unsupported NAT mode combination is rejected.
Signed-off-by: Herat Ramani <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx
Pull SPDX updates from Greg KH:
"Here are some SPDX-specific changes for 5.10-rc1.
They include:
- driver fixes to make spdxcheck.pl work properly
- add GFDL licenses as "deprecated" but required due to some of our
documentation using them
- add Zlib license as "deprecated" but required because we have code
with this license in the tree.
- convert some drivers to have SPDX identifiers that previously
didn't have them.
All have been in linux-next for a very long time with no reported
issues"
* tag 'spdx-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx:
scripts/spdxcheck.py: handle license identifiers in XML comments
net/mlx5: IPsec: make spdxcheck.py happy
LICENSES/deprecated: add Zlib license text
LICENSE: add GFDL deprecated licenses
net/qla3xxx: Convert to SPDX license identifiers
net/qlge: Convert to SPDX license identifiers
net/qlcnic: Convert to SPDX license identifiers
scsi/qla2xxx: Convert to SPDX license identifiers
scsi/qla4xxx: Convert to SPDX license identifiers
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2020-10-12
Updates to mlx5 driver:
- Cleanup fix of uininitialized pointer read
- xfrm IPSec TX offload
====================
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The e1000_clear_vfta function was triggering a warning in kbuild-bot
testing. It's actually a bug but has no functional impact.
drivers/net/ethernet/intel/e1000/e1000_hw.c:4415:58: warning: Same expression in both branches of ternary operator. [duplicateExpressionTernary]
Fix this warning by removing the offending code and simplifying
the routine to do exactly what it did before, no functional
change.
Signed-off-by: Jesse Brandeburg <[email protected]>
Tested-by: Aaron Brown <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Starting with API version 1.10 firmware for X722 devices has ability
to change FEC settings in PHY. Code added in this patch allows
changing FEC settings if the capability flag indicates the device
supports this feature.
Signed-off-by: Jaroslaw Gawin <[email protected]>
Signed-off-by: Aleksandr Loktionov <[email protected]>
Signed-off-by: Arkadiusz Kubalewski <[email protected]>
Tested-by: Andrew Bowers <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
A helper for checking whether a net_device belongs to mscc_ocelot
already existed and did not need to be rewritten. Use it.
Fixes: 319e4dd11a20 ("net: mscc: ocelot: introduce conversion helpers between port and netdev")
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The at91rm9200 variant used by a few chips including the MSC313 supports
two Tx descriptors (one frame being serialized and another one queued).
However the driver only implemented a single one, which adds a dead time
after each transfer to receive and process the interrupt and wake the
queue up, preventing from reaching line rate.
This patch implements a very basic 2-deep queue to address this limitation.
The tests run on a Breadbee board equipped with an MSC313E show that at
1 GHz, HTTP traffic on medium-sized objects (45kB) was limited to exactly
50 Mbps before this patch, and jumped to 76 Mbps with this patch. And tests
on a single TCP stream with an MTU of 576 jump from 10kpps to 15kpps. With
1500 byte packets it's now possible to reach line rate versus 75 Mbps
before.
Cc: Nicolas Ferre <[email protected]>
Cc: Claudiu Beznea <[email protected]>
Cc: Daniel Palmer <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The RM9200 supports one frame being sent while another one is waiting in
queue. This avoids the dead time that follows the emission of a frame
and which prevents one from reaching line speed.
Right now the driver supports only a single skb, so we'll first replace
the rm9200-specific skb info with an array of two macb_tx_skb (already
used by other drivers). This patch only moves the skb_length to
txq[0].size and skb_physaddr to skb[0].mapping but doesn't perform any
other change. It already uses [desc] in order to minimize future changes.
Cc: Nicolas Ferre <[email protected]>
Cc: Claudiu Beznea <[email protected]>
Cc: Daniel Palmer <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Transmit Buffer Register Empty replaces TXERR on RM9200 and signals the
sender may try to send again becase the last queued frame is no longer
in queue (being transmitted or already transmitted).
Cc: Nicolas Ferre <[email protected]>
Cc: Claudiu Beznea <[email protected]>
Cc: Daniel Palmer <[email protected]>
Signed-off-by: Willy Tarreau <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull copy_and_csum cleanups from Al Viro:
"Saner calling conventions for csum_and_copy_..._user() and friends"
[ Removing 800+ lines of code and cleaning stuff up is good - Linus ]
* 'work.csum_and_copy' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ppc: propagate the calling conventions change down to csum_partial_copy_generic()
amd64: switch csum_partial_copy_generic() to new calling conventions
sparc64: propagate the calling convention changes down to __csum_partial_copy_...()
xtensa: propagate the calling conventions change down into csum_partial_copy_generic()
mips: propagate the calling convention change down into __csum_partial_copy_..._user()
mips: __csum_partial_copy_kernel() has no users left
mips: csum_and_copy_{to,from}_user() are never called under KERNEL_DS
sparc32: propagate the calling conventions change down to __csum_partial_copy_sparc_generic()
i386: propagate the calling conventions change down to csum_partial_copy_generic()
sh: propage the calling conventions change down to csum_partial_copy_generic()
m68k: get rid of zeroing destination on error in csum_and_copy_from_user()
arm: propagate the calling convention changes down to csum_partial_copy_from_user()
alpha: propagate the calling convention changes down to csum_partial_copy.c helpers
saner calling conventions for csum_and_copy_..._user()
csum_and_copy_..._user(): pass 0xffffffff instead of 0 as initial sum
csum_partial_copy_nocheck(): drop the last argument
unify generic instances of csum_partial_copy_nocheck()
icmp_push_reply(): reorder adding the checksum up
skb_copy_and_csum_bits(): don't bother with the last argument
|
|
In the TX data path, spot packets with xfrm stack IPsec offload
indication.
Fill Software-Parser segment in TX descriptor so that the hardware
may parse the ESP protocol, and perform TX checksum offload on the
inner payload.
Support GSO, by providing the trailer data and ICV placeholder
so HW can fill it post encryption operation.
Padding alignment cannot be performed in HW (ConnectX-6Dx) due to
a bug. Software can overcome this limitation by adding NETIF_F_HW_ESP to
the gso_partial_features field in netdev so the packets being
aligned by the stack.
l4_inner_checksum cannot be offloaded by HW for IPsec tunnel type packet.
Note that for GSO SKBs, the stack does not include an ESP trailer,
unlike the non-GSO case.
Below is the iperf3 performance report on two server of 24 cores
Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz with ConnectX6-DX.
All the bandwidth test uses iperf3 TCP traffic with packet size 128KB.
Each tunnel uses one iperf3 stream with one thread (option -P1).
TX crypto offload shows improvements on both bandwidth
and CPU utilization.
----------------------------------------------------------------------
Mode | Num tunnel | BW | Send CPU util | Recv CPU util
| | (Gbps) | (Average %) | (Average %)
----------------------------------------------------------------------
Cryto offload | | | |
(RX only) | 1 | 4.7 | 4.2 | 3.5
----------------------------------------------------------------------
Cryto offload | | | |
(RX only) | 24 | 15.6 | 20 | 10
----------------------------------------------------------------------
Non-offload | 1 | 4.6 | 4 | 5
----------------------------------------------------------------------
Non-offload | 24 | 11.9 | 16 | 12
----------------------------------------------------------------------
Cryto offload | | | |
(TX & RX) | 1 | 11.9 | 2.1 | 5.9
----------------------------------------------------------------------
Cryto offload | | | |
(TX & RX) | 24 | 38 | 9.5 | 27.5
----------------------------------------------------------------------
Cryto offload | | | |
(TX only) | 1 | 4.7 | 0.7 | 5
----------------------------------------------------------------------
Cryto offload | | | |
(TX only) | 24 | 14.5 | 6 | 20
Regression tests show no degradation on non-ipsec and
non-offload-ipsec traffics. The packet rate test uses pktgen UDP to
transmit on single CPU, the instructions and cycles are measured on
the transmit CPU.
before:
----------------------------------------------------------------------
Non-offload | 1 | 4.7 | 4.2 | 5.1
----------------------------------------------------------------------
Non-offload | 24 | 11.2 | 14 | 15
----------------------------------------------------------------------
Non-ipsec | 1 | 28 | 4 | 5.7
----------------------------------------------------------------------
Non-ipsec | 24 | 68.3 | 17.8 | 39.7
----------------------------------------------------------------------
Non-ipsec packet rate(BURST=1000 BC=5 NCPUS=1 SIZE=60)
13.56Mpps, 456 instructions/pkt, 191 cycles/pkt
after:
----------------------------------------------------------------------
Non-offload | 1 | 4.69 | 4.2 | 5
----------------------------------------------------------------------
Non-offload | 24 | 11.9 | 13.5 | 15.1
----------------------------------------------------------------------
Non-ipsec | 1 | 29 | 3.2 | 5.5
----------------------------------------------------------------------
Non-ipsec | 24 | 68.2 | 18.5 | 39.8
----------------------------------------------------------------------
Non-ipsec packet rate: 13.56Mpps, 472 instructions/pkt, 191 cycles/pkt
Signed-off-by: Raed Salem <[email protected]>
Signed-off-by: Huy Nguyen <[email protected]>
Reviewed-by: Maxim Mikityanskiy <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Add new FTE in TX IPsec FT per IPsec state. It has the
same matching criteria as the RX steering rule.
The IPsec FT is created/destroyed when the first/last rule
is added/deleted respectively.
Signed-off-by: Huy Nguyen <[email protected]>
Reviewed-by: Boris Pismenny <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Add new namespace that represents the NIC TX domain.
Signed-off-by: Huy Nguyen <[email protected]>
Signed-off-by: Raed Salem <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
Currently the error exit path err_free kfree's attr. In the case where
flow and parse_attr failed to be allocated this return path will free
the uninitialized pointer attr, which is not correct. In the other
case where attr fails to allocate attr does not need to be freed. So
in both error exits via err_free attr should not be freed, so remove
it.
Addresses-Coverity: ("Uninitialized pointer read")
Fixes: ff7ea04ad579 ("net/mlx5e: Fix potential null pointer dereference")
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
|
|
This patch adds FW versions stored in the flash to devlink info_get
callback. Return the correct fw.psid running version using the
newly added bp->nvm_cfg_ver.
v2:
Ensure stored pkg_name string is NULL terminated when copied to
devlink.
Return directly from the last call to bnxt_dl_info_put().
If the FW call to get stored version fails for any reason, return
success immediately to devlink without the stored versions.
Reviewed-by: Andy Gospodarek <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a new function bnxt_dl_info_put() to simplify the code, as there
are more stored firmware version fields to be added in the next patch.
Also, rename fw_ver variable name to ncsi_ver for better naming while
copying to devlink info_get cb.
v2:
Ensure active_pkg_name string is NULL terminated when copied to
devlink.
Return directly from the last call to bnxt_dl_info_put().
Reviewed-by: Pavan Chebbi <[email protected]>
Reviewed-by: Andy Gospodarek <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a new bnxt_hwrm_nvm_get_dev_info() to query firmware version
information via NVM_GET_DEV_INFO firmware command. Use it to
get the running version of the NVM configuration information.
This new function will also be used in subsequent patches to get the
stored firmware versions.
Reviewed-by: Andy Gospodarek <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If the VF virtual link is set to always enabled, the speed may be
unknown when the physical link is down. The driver currently logs
the link speed as 4294967295 Mbps which is SPEED_UNKNOWN. Modify
the link up log message as "speed unknown" which makes more sense.
Reviewed-by: Vasundhara Volam <[email protected]>
Reviewed-by: Edwin Peer <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Log these values that contain useful firmware state information.
Reviewed-by: Edwin Peer <[email protected]>
Reviewed-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
event_data1 and event_data2 are used when processing most events.
Store these in local variables at the beginning of the function to
simplify many of the case statements.
Reviewed-by: Edwin Peer <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, bp->msg_enable has default value of 0. It is more useful
to have the commonly used NETIF_MSG_DRV and NETIF_MSG_HW enabled by
default.
v2: Change the fall back bnxt_reset_task() inside bnxt_rx_ring_reset()
to silent mode. With older fw, we would take the fall back path and
it would be very noisy.
Reviewed-by: Edwin Peer <[email protected]>
Reviewed-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Online self tests are not disruptive and can be run in NPAR mode
and in multi-host NIC as well.
Reviewed-by: Edwin Peer <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
If NVRAM resources are locked, NVM writes are not permitted. In such
scenarios, firmware returns HWRM_ERR_CODE_RESOURCE_LOCKED error to
firmware commands.
Reviewed-by: Edwin Peer <[email protected]>
Signed-off-by: Vasundhara Volam <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The phy_reset_after_clk_enable() is always called with ndev->phydev,
however that pointer may be NULL even though the PHY device instance
already exists and is sufficient to perform the PHY reset.
This condition happens in fec_open(), where the clock must be enabled
first, then the PHY must be reset, and then the PHY IDs can be read
out of the PHY.
If the PHY still is not bound to the MAC, but there is OF PHY node
and a matching PHY device instance already, use the OF PHY node to
obtain the PHY device instance, and then use that PHY device instance
when triggering the PHY reset.
Fixes: 1b0a83ac04e3 ("net: fec: add phy_reset_after_clk_enable() support")
Signed-off-by: Marek Vasut <[email protected]>
Cc: Christoph Niedermaier <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: NXP Linux Team <[email protected]>
Cc: Richard Leitner <[email protected]>
Cc: Shawn Guo <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
netcons calls napi_poll with a budget of 0 to transmit packets.
Handle this by:
- skipping RX processing
- do not try to recycle TX packets to the RX cache
Signed-off-by: Jonathan Lemon <[email protected]>
Reviewed-by: Tariq Toukan <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
kmalloc returns KSEG0 addresses so convert back from KSEG1
in kfree. Also make sure array is freed when the driver is
unloaded from the kernel.
Fixes: ef11291bcd5f ("Add support the Korina (IDT RC32434) Ethernet MAC")
Signed-off-by: Valentin Vidic <[email protected]>
Acked-by: Willem de Bruijn <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The VCAP_IS1_ACT_VID_REPLACE_ENA action, from the VCAP IS1 ingress TCAM,
changes the classified VLAN.
We are only exposing this ability for switch ports that are under VLAN
aware bridges. This is because in standalone ports mode and under a
bridge with vlan_filtering=0, the ocelot driver configures the switch to
operate as VLAN-unaware, so the classified VLAN is not derived from the
802.1Q header from the packet, but instead is always equal to the
port-based VLAN ID of the ingress port. We _can_ still change the
classified VLAN for packets when operating in this mode, but the end
result will most likely be a drop, since both the ingress and the egress
port need to be members of the modified VLAN. And even if we install the
new classified VLAN into the VLAN table of the switch, the result would
still not be as expected: we wouldn't see, on the output port, the
modified VLAN tag, but the original one, even though the classified VLAN
was indeed modified. This is because of how the hardware works: on
egress, what is pushed to the frame is a "port tag", which gives us the
following options:
- Tag all frames with port tag (derived from the classified VLAN)
- Tag all frames with port tag, except if the classified VLAN is 0 or
equal to the native VLAN of the egress port
- No port tag
Needless to say, in VLAN-unaware mode we are disabling the port tag.
Otherwise, the existing VLAN tag would be ignored, and a second VLAN
tag (the port tag), holding the classified VLAN, would be pushed
(instead of replacing the existing 802.1Q tag). This is definitely not
what the user wanted when installing a "vlan modify" action.
So it is simply not worth bothering with VLAN modify rules under other
configurations except when the ports are fully VLAN-aware.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This is a methodical transition of the driver from phylib
to phylink, following the guidelines from sfp-phylink.rst.
The MAC register configurations based on interface mode
were moved from the probing path to the mac_config() hook.
MAC enable and disable commands (enabling Rx and Tx paths
at MAC level) were also extracted and assigned to their
corresponding phylink hooks.
As part of the migration to phylink, the serdes configuration
from the driver was offloaded to the PCS_LYNX module,
introduced in commit 0da4c3d393e4 ("net: phy: add Lynx PCS module"),
the PCS_LYNX module being a mandatory component required to
make the enetc driver work with phylink.
Signed-off-by: Claudiu Manoil <[email protected]>
Reviewed-by: Ioana Ciornei <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Decouple internal mdio bus creation from serdes
configuration, as a prerequisite to offloading
serdes configuration to a different module.
Group together mdio bus creation routines, cleanup.
Signed-off-by: Claudiu Manoil <[email protected]>
Reviewed-by: Ioana Ciornei <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Decouple level MAC configuration based on phy interface type
from general port configuration.
Group together MAC and link configuration code.
Decouple external mdio bus creation from interface type
parsing. No longer return an (unhandled) error code when
phy_node not found, use phy_node to indicate whether the
port has a phy or not. No longer fall-through when serdes
configuration fails for the link modes that require
internal link configuration.
Signed-off-by: Claudiu Manoil <[email protected]>
Reviewed-by: Ioana Ciornei <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When packets are received on the error queue, this function under
net_ratelimit():
netif_err(priv, hw, net_dev, "Err FD status = 0x%08x\n");
does not get printed. Instead we only see:
[ 3658.845592] net_ratelimit: 244 callbacks suppressed
[ 3663.969535] net_ratelimit: 230 callbacks suppressed
[ 3669.085478] net_ratelimit: 228 callbacks suppressed
Enabling NETIF_MSG_HW fixes this issue, and we can see some information
about the frame descriptors of packets.
Signed-off-by: Maxim Kochetkov <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Madalin Bucur <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Factor out handling the private packet/byte counters to new
functions rtl_get_priv_stats() and rtl_inc_priv_stats().
Signed-off-by: Heiner Kallweit <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|