Age | Commit message (Collapse) | Author | Files | Lines |
|
GCC is upset that we check the return value of plfxlc_usb_dev()
even tho it can't be NULL:
drivers/net/wireless/purelifi/plfxlc/usb.c: In function ‘resume’:
drivers/net/wireless/purelifi/plfxlc/usb.c:840:20: warning: the comparison will always evaluate as ‘true’ for the address of ‘dev’ will never be NULL [-Waddress]
840 | if (!pl || !plfxlc_usb_dev(pl))
| ^
plfxlc_usb_dev() returns an address of one of the members of pl,
so it's safe to drop these checks.
Acked-by: Kalle Valo <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The clock for this driver switched to the common clock controller driver.
Therefore, update common clock properties for ethernet device in the binding
document.
Signed-off-by: Nobuhiro Iwamatsu <[email protected]>
Acked-by: Rob Herring <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since kconfig 'select' does not follow dependency chains, if symbol KSA
selects KSB, then KSA should also depend on the same symbols that KSB
depends on, in order to prevent Kconfig warnings and possible build
errors.
Change NET_DSA_SMSC_LAN9303_I2C and NET_DSA_SMSC_LAN9303_MDIO so that
they are limited to VLAN_8021Q if the latter is enabled. This prevents
the Kconfig warning:
WARNING: unmet direct dependencies detected for NET_DSA_SMSC_LAN9303
Depends on [m]: NETDEVICES [=y] && NET_DSA [=y] && (VLAN_8021Q [=m] || VLAN_8021Q [=m]=n)
Selected by [y]:
- NET_DSA_SMSC_LAN9303_I2C [=y] && NETDEVICES [=y] && NET_DSA [=y] && I2C [=y]
Fixes: 430065e26719 ("net: dsa: lan9303: add VLAN IDs to master device")
Signed-off-by: Randy Dunlap <[email protected]>
Cc: Andrew Lunn <[email protected]>
Cc: Vivien Didelot <[email protected]>
Cc: Florian Fainelli <[email protected]>
Cc: Vladimir Oltean <[email protected]>
Cc: Juergen Borleis <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Mans Rullgard <[email protected]>
Reviewed-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
cppcheck reports
[drivers/net/fddi/skfp/smt.c:750]: (warning) printf format string requires 0 parameters but 2 are given.
DB_SBAN is a vararg macro, like DB_ESSN. Remove the extra args and the nl.
Signed-off-by: Tom Rix <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jakub Kicinski says:
====================
introduce mt7986 ethernet support
Add support for mt7986-eth driver available on mt7986 soc.
Changes since v2:
- rely on GFP_KERNEL whenever possible
- define mtk_reg_map struct to introduce soc register map and avoid macros
- improve comments
Changes since v1:
- drop SRAM option
- convert ring->dma to void
- convert scratch_ring to void
- enable port4
- fix irq dts bindings
- drop gmac1 support from mt7986a-rfb dts for the moment
====================
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support for mt7986-eth driver available on mt7986 soc.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Simplify the code converting scratch_ring pointer to void
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Simplify the code converting {tx,rx} ring dma pointer to void
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Introduce MTK_NETSYS_V2 support. MTK_NETSYS_V2 defines 32B TX/RX DMA
descriptors.
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Introduce reg_map structure to add the capability to support different
register definitions. Move register definitions in mtk_regmap structure.
This is a preliminary patch to introduce mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Remove mtk_rx_dma structure layout dependency in mtk_rx_alloc/mtk_rx_clean.
Initialize to 0 rxd3 and rxd4 in mtk_rx_alloc.
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a preliminary to ad mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Similar to tx counterpart, introduce rxd_size in mtk_soc_data data
structure.
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In order to remove mtk_tx_dma size dependency, introduce txd_size in
mtk_soc_data data structure. Rely on txd_size in mtk_init_fq_dma() and
mtk_dma_free() routines.
This is a preliminary patch to add mt7986 ethernet support.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
mtk_tx_set_dma_desc
Move tx dma descriptor configuration in mtk_tx_set_dma_desc routine.
This is a preliminary patch to introduce mt7986 ethernet support since
it relies on a different tx dma descriptor layout.
Tested-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
whenever possible
Rely on GFP_KERNEL for dma descriptors mappings in mtk_tx_alloc(),
mtk_rx_alloc() and mtk_init_fq_dma() since they are run in non-irq
context.
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Introduce dts bindings for mt7986 soc in mediatek,net.yaml.
Reviewed-by: Rob Herring <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Introduce ethernet nodes in mt7986 bindings in order to
enable mt7986a/mt7986b ethernet support.
Co-developed-by: Sam Shih <[email protected]>
Signed-off-by: Sam Shih <[email protected]>
Signed-off-by: Lorenzo Bianconi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Jakub Kicinski says:
====================
eth: silence the GCC 12 array-bounds warnings
Silence the array-bounds warnings in Ethernet drivers.
v2 uses -Wno-array-bounds directly.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
GCC 12 currently generates a rather inconsistent warning:
drivers/net/ethernet/broadcom/tg3.c:17795:51: warning: array subscript 5 is above array bounds of ‘struct tg3_napi[5]’ [-Warray-bounds]
17795 | struct tg3_napi *tnapi = &tp->napi[i];
| ~~~~~~~~^~~
i is guaranteed < tp->irq_max which in turn is either 1 or 5.
There are more loops like this one in the driver, but strangely
GCC 12 dislikes only this single one.
Silence this silliness for now.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
GCC 12 gets upset because driver allocates partial
struct ice_aqc_sw_rules_elem buffers. The writes are
within bounds.
Silence these warnings for now, our build bot runs GCC 12
so we won't allow any new instances.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
GCC 12 gets upset because in mtk_foe_entry_commit_subflow()
this driver allocates a partial structure. The writes are
within bounds.
Silence these warnings for now, our build bot runs GCC 12
so we won't allow any new instances.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Ioana Ciornei says:
====================
dpaa2-eth: software TSO fixes
This patch fixes the software TSO feature in dpaa2-eth.
There are multiple errors that I made in the initial submission of the
code, which I didn't caught since I was always running with passthough
IOMMU.
The bug report came in bugzilla:
https://bugzilla.kernel.org/show_bug.cgi?id=215886
The bugs are in the Tx confirmation path, where I was trying to retrieve
a virtual address after DMA unmapping the area. Besides that, another
dma_unmap call was made with the wrong size.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
DMA unmap the Scatter/Gather table before going through the array to
unmap and free each of the header and data chunks. This is so we do not
touch the data between the dma_map and dma_unmap calls.
Fixes: 3dc709e0cd47 ("dpaa2-eth: add support for software TSO")
Signed-off-by: Ioana Ciornei <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The incorrect software annotation field was being used, swa->sg.sgt_size
instead of swa->tso.sgt_size, which meant that the SGT buffer was
unmapped with a wrong size.
This is also confirmed by the DMA API debug prints which showed the
following:
[ 38.962434] DMA-API: fsl_dpaa2_eth dpni.2: device driver frees DMA memory with different size [device address=0x0000fffffafba740] [map size=224 bytes] [unmap size=0 bytes]
[ 38.980496] WARNING: CPU: 11 PID: 1131 at kernel/dma/debug.c:973 check_unmap+0x58c/0x9b0
[ 38.988586] Modules linked in:
[ 38.991631] CPU: 11 PID: 1131 Comm: iperf3 Not tainted 5.18.0-rc7-00117-g59130eeb2b8f #1972
[ 38.999970] Hardware name: NXP Layerscape LX2160ARDB (DT)
Fixes: 3dc709e0cd47 ("dpaa2-eth: add support for software TSO")
Signed-off-by: Ioana Ciornei <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The TSO header was DMA unmapped before the virtual address was retrieved
and then used to free the buffer. This meant that we were actually
removing the DMA map and then trying to search for it to help in
retrieving the virtual address. This lead to a invalid virtual address
being used in the kfree call.
Fix this by calling dpaa2_iova_to_virt() prior to the dma_unmap call.
[ 487.231819] Unable to handle kernel paging request at virtual address fffffd9807000008
(...)
[ 487.354061] Hardware name: SolidRun LX2160A Honeycomb (DT)
[ 487.359535] pstate: a0400005 (NzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 487.366485] pc : kfree+0xac/0x304
[ 487.369799] lr : kfree+0x204/0x304
[ 487.373191] sp : ffff80000c4eb120
[ 487.376493] x29: ffff80000c4eb120 x28: ffff662240c46400 x27: 0000000000000001
[ 487.383621] x26: 0000000000000001 x25: ffff662246da0cc0 x24: ffff66224af78000
[ 487.390748] x23: ffffad184f4ce008 x22: ffffad1850185000 x21: ffffad1838d13cec
[ 487.397874] x20: ffff6601c0000000 x19: fffffd9807000000 x18: 0000000000000000
[ 487.405000] x17: ffffb910cdc49000 x16: ffffad184d7d9080 x15: 0000000000004000
[ 487.412126] x14: 0000000000000008 x13: 000000000000ffff x12: 0000000000000000
[ 487.419252] x11: 0000000000000004 x10: 0000000000000001 x9 : ffffad184d7d927c
[ 487.426379] x8 : 0000000000000000 x7 : 0000000ffffffd1d x6 : ffff662240a94900
[ 487.433505] x5 : 0000000000000003 x4 : 0000000000000009 x3 : ffffad184f4ce008
[ 487.440632] x2 : ffff662243eec000 x1 : 0000000100000100 x0 : fffffc0000000000
[ 487.447758] Call trace:
[ 487.450194] kfree+0xac/0x304
[ 487.453151] dpaa2_eth_free_tx_fd.isra.0+0x33c/0x3e0 [fsl_dpaa2_eth]
[ 487.459507] dpaa2_eth_tx_conf+0x100/0x2e0 [fsl_dpaa2_eth]
[ 487.464989] dpaa2_eth_poll+0xdc/0x380 [fsl_dpaa2_eth]
Fixes: 3dc709e0cd47 ("dpaa2-eth: add support for software TSO")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215886
Signed-off-by: Ioana Ciornei <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The "ok" tc action is useful when placed in front of a more generic
filter to exclude some more specific rules from matching it.
The ocelot switches can offload this tc action by creating an empty
action vector (no _ENA fields set to 1). This makes sense for all of
VCAP IS1, IS2 and ES0 (but not for PSFP).
Add support for this action. Note that this makes the
gact_drop_and_ok_test() selftest pass, where "action ok" is used in
front of an "action drop" rule, both offloaded to VCAP IS2.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Vladimir Oltean says:
====================
Streamline Ocelot tc-chains selftest
This series changes the output and the argument format of the Ocelot
switch selftest so that it is more similar to what can be found in
tools/testing/selftests/net/forwarding/.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Use the standard interface order h1, swp1, swp2, h2 that is used by the
forwarding selftest framework. The previous order was confusing even
with the ASCII drawing. That isn't needed anymore.
This also drops the fixed MAC addresses and uses STABLE_MAC_ADDRS, which
ensures the MAC addresses are unique.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This is a robotic rename as follows:
eth0 -> swp1
eth1 -> swp2
eth2 -> h2
eth3 -> h1
This brings the selftest more in line with the other forwarding
selftests, where h1 is connected to swp1, and h2 to swp2.
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Bring this driver-specific selftest output in line with the other
selftests.
Before:
Testing VLAN pop.. OK
Testing VLAN push.. OK
Testing ingress VLAN modification.. OK
Testing egress VLAN modification.. OK
Testing frame prioritization.. OK
After:
TEST: VLAN pop [ OK ]
TEST: VLAN push [ OK ]
TEST: Ingress VLAN modification [ OK ]
TEST: Egress VLAN modification [ OK ]
TEST: Frame prioritization [ OK ]
Signed-off-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Most protocol-specific pointers in struct net_device are under
a respective ifdef. Wireless is the notable exception. Since
there's a sizable number of custom-built kernels for datacenter
workloads which don't build wireless it seems reasonable to
ifdefy those pointers as well.
While at it move IPv4 and IPv6 pointers up, those are special
for obvious reasons.
Acked-by: Johannes Berg <[email protected]>
Acked-by: Stefan Schmidt <[email protected]> # ieee802154
Acked-by: Sven Eckelmann <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
An error code returned by devm_clk_get() might have other meanings than
"This clock doesn't exist". So use devm_clk_get_optional() and handle
all remaining errors as fatal.
Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
'prod_idx' (atomic_t) is larger than 'shadow_idx' (u16), so some memory is
over-allocated.
Fixes: b15a9f37be2b ("net-next/hinic: Add wq")
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
An error code returned by devm_clk_get() might have other meanings than
"This clock doesn't exist". So use devm_clk_get_optional() and handle
all remaining errors as fatal.
Signed-off-by: Uwe Kleine-König <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
RGMII mode can be enable from dp83822 straps, and also writing bit 9
of register 0x17 - RMII and Status Register (RCSR).
When phy_interface_is_rgmii rgmii mode must be enabled, same for
contrary, this prevents malconfigurations of hw straps
References:
- https://www.ti.com/lit/gpn/dp83822i p66
Signed-off-by: Tommaso Merciai <[email protected]>
Co-developed-by: Michael Trimarchi <[email protected]>
Suggested-by: Alberto Bianchi <[email protected]>
Tested-by: Tommaso Merciai <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add newly added stress_reuseport_listen object to .gitignore file.
Fixes: ec8cb4f617a2 ("net: selftests: Stress reuseport listen")
Signed-off-by: Muhammad Usama Anjum <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
David Howells says:
====================
rxrpc: Miscellaneous fixes
Here are some fixes for AF_RXRPC:
(1) Fix listen() allowing preallocation to overrun the prealloc buffer.
(2) Prevent resending the request if we've seen the reply starting to
arrive.
(3) Fix accidental sharing of ACK state between transmission and
reception.
(4) Ignore ACKs in which ack.previousPacket regresses. This indicates the
highest DATA number so far seen, so should not be seen to go
backwards.
(5) Fix the determination of when to generate an IDLE-type ACK,
simplifying it so that we generate one if we have more than two DATA
packets that aren't hard-acked (consumed) or soft-acked (in the rx
buffer, but could be discarded and re-requested).
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix the decision on when to generate an IDLE ACK by keeping a count of the
number of packets we've received, but not yet soft-ACK'd, and the number of
packets we've processed, but not yet hard-ACK'd, rather than trying to keep
track of which DATA sequence numbers correspond to those points.
We then generate an ACK when either counter exceeds 2. The counters are
both cleared when we transcribe the information into any sort of ACK packet
for transmission. IDLE and DELAY ACKs are skipped if both counters are 0
(ie. no change).
Fixes: 805b21b929e2 ("rxrpc: Send an ACK after every few DATA packets we receive")
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
The previousPacket field in the rx ACK packet should never go backwards -
it's now the highest DATA sequence number received, not the last on
received (it used to be used for out of sequence detection).
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix accidental overlapping of Rx-phase ACK accounting with Tx-phase ACK
accounting through variables shared between the two. call->acks_* members
refer to ACKs received in the Tx phase and call->ackr_* members to ACKs
sent/to be sent during the Rx phase.
Fixes: 1a2391c30c0b ("rxrpc: Fix detection of out of order acks")
Signed-off-by: David Howells <[email protected]>
cc: Jeffrey Altman <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
rxrpc has a timer to trigger resending of unacked data packets in a call.
This is not cancelled when a client call switches to the receive phase on
the basis that most calls don't last long enough for it to ever expire.
However, if it *does* expire after we've started to receive the reply, we
shouldn't then go into trying to retransmit or pinging the server to find
out if an ack got lost.
Fix this by skipping the resend code if we're into receiving the reply to a
client call.
Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
AF_RXRPC's listen() handler lets you set the backlog up to 32 (if you bump
up the sysctl), but whilst the preallocation circular buffers have 32 slots
in them, one of them has to be a dead slot because we're using CIRC_CNT().
This means that listen(rxrpc_sock, 32) will cause an oops when the socket
is closed because rxrpc_service_prealloc_one() allocated one too many calls
and rxrpc_discard_prealloc() won't then be able to get rid of them because
it'll think the ring is empty. rxrpc_release_calls_on_socket() then tries
to abort them, but oopses because call->peer isn't yet set.
Fix this by setting the maximum backlog to RXRPC_BACKLOG_MAX - 1 to match
the ring capacity.
BUG: kernel NULL pointer dereference, address: 0000000000000086
...
RIP: 0010:rxrpc_send_abort_packet+0x73/0x240 [rxrpc]
Call Trace:
<TASK>
? __wake_up_common_lock+0x7a/0x90
? rxrpc_notify_socket+0x8e/0x140 [rxrpc]
? rxrpc_abort_call+0x4c/0x60 [rxrpc]
rxrpc_release_calls_on_socket+0x107/0x1a0 [rxrpc]
rxrpc_release+0xc9/0x1c0 [rxrpc]
__sock_release+0x37/0xa0
sock_close+0x11/0x20
__fput+0x89/0x240
task_work_run+0x59/0x90
do_exit+0x319/0xaa0
Fixes: 00e907127e6f ("rxrpc: Preallocate peers, conns and calls for incoming service requests")
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
Link: https://lists.infradead.org/pipermail/linux-afs/2022-March/005079.html
Signed-off-by: David S. Miller <[email protected]>
|
|
David Howells says:
====================
rxrpc: Miscellaneous changes
Here are some miscellaneous changes for AF_RXRPC:
(1) Allow the list of local endpoints to be viewed through /proc.
(2) Switch to using refcount_t for refcounting.
(3) Fix a locking issue found by lockdep.
(4) Autogenerate tracing symbol enums from symbol->string maps to make it
easier to keep them in sync.
(5) Return an error to sendmsg() if a call it tried to set up failed.
Because it failed at this point, no notification will be generated for
recvmsg to pick up - but userspace still needs to know about the
failure.
(6) Fix the selection of abort codes generated by internal events. In
particular, rxrpc and kafs shouldn't be generating RX_USER_ABORT
unless it's because userspace did something to cancel a call.
(7) Adjust the interpretation and handling of certain ACK types to try and
detect NAT changes causing a call to seem to start mid-flow from a
different peer.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
If a client's address changes, say if it is NAT'd, this can disrupt an in
progress operation. For most operations, this is not much of a problem,
but StoreData can be different as some servers modify the target file as
the data comes in, so if a store request is disrupted, the file can get
corrupted on the server.
The problem is that the server doesn't recognise packets that come after
the change of address as belonging to the original client and will bounce
them, either by sending an OUT_OF_SEQUENCE ACK to the apparent new call if
the packet number falls within the initial sequence number window of a call
or by sending an EXCEEDS_WINDOW ACK if it falls outside and then aborting
it. In both cases, firstPacket will be 1 and previousPacket will be 0 in
the ACK information.
Fix this by the following means:
(1) If a client call receives an EXCEEDS_WINDOW ACK with firstPacket as 1
and previousPacket as 0, assume this indicates that the server saw the
incoming packets from a different peer and thus as a different call.
Fail the call with error -ENETRESET.
(2) Also fail the call if a similar OUT_OF_SEQUENCE ACK occurs if the
first packet has been hard-ACK'd. If it hasn't been hard-ACK'd, the
ACK packet will cause it to get retransmitted, so the call will just
be repeated.
(3) Make afs_select_fileserver() treat -ENETRESET as a straight fail of
the operation.
(4) Prioritise the error code over things like -ECONNRESET as the server
did actually respond.
(5) Make writeback treat -ENETRESET as a retryable error and make it
redirty all the pages involved in a write so that the VM will retry.
Note that there is still a circumstance that I can't easily deal with: if
the operation is fully received and processed by the server, but the reply
is lost due to address change. There's no way to know if the op happened.
We can examine the server, but a conflicting change could have been made by
a third party - and we can't tell the difference. In such a case, a
message like:
kAFS: vnode modified {100058:146266} b7->b8 YFS.StoreData64 (op=2646a)
will be logged to dmesg on the next op to touch the file and the client
will reset the inode state, including invalidating clean parts of the
pagecache.
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004811.html # v1
Signed-off-by: David S. Miller <[email protected]>
|
|
The RX_USER_ABORT code should really only be used to indicate that the user
of the rxrpc service (ie. userspace) implicitly caused a call to be aborted
- for instance if the AF_RXRPC socket is closed whilst the call was in
progress. (The user may also explicitly abort a call and specify the abort
code to use).
Change some of the points of generation to use other abort codes instead:
(1) Abort the call with RXGEN_SS_UNMARSHAL or RXGEN_CC_UNMARSHAL if we see
ENOMEM and EFAULT during received data delivery and abort with
RX_CALL_DEAD in the default case.
(2) Abort with RXGEN_SS_MARSHAL if we get ENOMEM whilst trying to send a
reply.
(3) Abort with RX_CALL_DEAD if we stop hearing from the peer if we had
heard from the peer and abort with RX_CALL_TIMEOUT if we hadn't.
(4) Abort with RX_CALL_DEAD if we try to disconnect a call that's not
completed successfully or been aborted.
Reported-by: Jeffrey Altman <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Marc Dionne <[email protected]>
cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
If at the end of rxrpc sendmsg() or rxrpc_kernel_send_data() the call that
was being given data was aborted remotely or otherwise failed, return an
error rather than returning the amount of data buffered for transmission.
The call (presumably) did not complete, so there's not much point
continuing with it. AF_RXRPC considers it "complete" and so will be
unwilling to do anything else with it - and won't send a notification for
it, deeming the return from sendmsg sufficient.
Not returning an error causes afs to incorrectly handle a StoreData
operation that gets interrupted by a change of address due to NAT
reconfiguration.
This doesn't normally affect most operations since their request parameters
tend to fit into a single UDP packet and afs_make_call() returns before the
server responds; StoreData is different as it involves transmission of a
lot of data.
This can be triggered on a client by doing something like:
dd if=/dev/zero of=/afs/example.com/foo bs=1M count=512
at one prompt, and then changing the network address at another prompt,
e.g.:
ifconfig enp6s0 inet 192.168.6.2 && route add 192.168.6.1 dev enp6s0
Tracing packets on an Auristor fileserver looks something like:
192.168.6.1 -> 192.168.6.3 RX 107 ACK Idle Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001
192.168.6.3 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(64538) (64538)
192.168.6.3 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(64538) (64538)
192.168.6.1 -> 192.168.6.3 RX 107 ACK Idle Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001
<ARP exchange for 192.168.6.2>
192.168.6.2 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(0) (0)
192.168.6.2 -> 192.168.6.1 AFS (RX) 1482 FS Request: Unknown(0) (0)
192.168.6.1 -> 192.168.6.2 RX 107 ACK Exceeds Window Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001
192.168.6.1 -> 192.168.6.2 RX 74 ABORT Seq: 0 Call: 4 Source Port: 7000 Destination Port: 7001
192.168.6.1 -> 192.168.6.2 RX 74 ABORT Seq: 29321 Call: 4 Source Port: 7000 Destination Port: 7001
The Auristor fileserver logs code -453 (RXGEN_SS_UNMARSHAL), but the abort
code received by kafs is -5 (RX_PROTOCOL_ERROR) as the rx layer sees the
condition and generates an abort first and the unmarshal error is a
consequence of that at the application layer.
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: [email protected]
Link: http://lists.infradead.org/pipermail/linux-afs/2021-December/004810.html # v1
Signed-off-by: David S. Miller <[email protected]>
|