|
The 58xx devices (Northstar Plus) do actually have their CPU port wired
at port 8, it was unfortunately set to port 5 (B53_CPU_PORT_25) which is
incorrect, since that is the second possible management port.
Fixes: 991a36bb4645 ("net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch")
Reported-by: Eric Anholt <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Implement the correct software reset sequence for 58xx devices by
setting all 3 reset bits and polling for the SW_RST bit to clear itself
without a given timeout. We cannot use is58xx() here because that would
also include the 7445/7278 Starfighter 2 which have their own driver
doing the reset earlier on due to the HW specific integration.
Fixes: 991a36bb4645 ("net: dsa: b53: Add support for BCM585xx/586xx/88312 integrated switch")
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Since Broadcom tags are not enabled in b53 (DSA_PROTO_TAG_NONE), we need
to make sure that the IMP/CPU port is included in the forwarding
decision.
Without this change, switching between non-management ports would work,
but not between management ports and non-management ports thus breaking
the default state in which DSA switches are brought up.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Reported-by: Eric Anholt <[email protected]>
Signed-off-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
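As an illustration of the forwarding change described in this commit, the following C sketch builds a per-port forwarding mask that always contains the CPU port bit. The helper name forwarding_mask and the port numbers used with it are hypothetical and are not taken from the b53 driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the b53 driver's code: return the set of ports
 * that frames received on a user port may be forwarded to, always
 * including the IMP/CPU port.
 */
static uint16_t forwarding_mask(uint16_t user_ports, unsigned int cpu_port)
{
	return (uint16_t)(user_ports | (1u << cpu_port));
}
```

With user ports 0 and 1 and the CPU port at 8, the mask has bits 0, 1 and 8 set, so traffic can flow between the management port and the user ports.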
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fix from James Bottomley:
"Our final fix before the 4.12 release (hopefully).
It's an error leg again: the fix to not bug on empty DMA transfers is
returning the wrong code and confusing the block layer"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
scsi: return correct blkprep status code in case scsi_init_io() fails.
|
|
Pull MIPS fixes from Ralf Baechle:
"Another round of 4.11 for the MIPS architecture. This time around it's
mostly arch but little platforms-specific code.
- PCI: Register controllers in the right order to avoid a PCI error
- KGDB: Use kernel context for sleeping threads
- smp-cps: Fix potentially uninitialised value of core
- KASLR: Fix build
- ELF: Fix BUG() warning in arch_check_elf
- Fix modversioning of _mcount symbol
- fix out-of-tree defconfig target builds
- cevt-r4k: Fix out-of-bounds array access
- perf: fix deadlock
- Malta: Fix i8259 irqchip setup"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: PCI: add controllers before the specified head
MIPS: KGDB: Use kernel context for sleeping threads
MIPS: smp-cps: Fix potentially uninitialised value of core
MIPS: KASLR: Add missing header files
MIPS: Avoid BUG warning in arch_check_elf
MIPS: Fix modversioning of _mcount symbol
MIPS: generic: fix out-of-tree defconfig target builds
MIPS: cevt-r4k: Fix out-of-bounds array access
MIPS: perf: fix deadlock
MIPS: Malta: Fix i8259 irqchip setup
|
|
Alexander Alemayhu says:
====================
Misc BPF cleanup
while looking into making the Makefile in samples/bpf better handle O= I saw
several warnings when running `make clean && make samples/bpf/`. This series
reduces those warnings.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Fixes the following warning
samples/bpf/test_lru_dist.c:28:0: warning: "offsetof" redefined
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
In file included from ./tools/lib/bpf/bpf.h:25:0,
from samples/bpf/libbpf.h:5,
from samples/bpf/test_lru_dist.c:24:
/usr/lib/gcc/x86_64-redhat-linux/6.3.1/include/stddef.h:417:0: note: this is the location of the previous definition
#define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER)
Signed-off-by: Alexander Alemayhu <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
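One common way to avoid this kind of redefinition warning is to define the fallback macro only when the toolchain headers have not already provided it. The sketch below assumes that approach; struct sample_node and value_offset() are hypothetical names used only for illustration.

```c
#include <stddef.h>

/* Define a fallback only if <stddef.h> (or another header) has not. */
#ifndef offsetof
#define offsetof(TYPE, MEMBER) ((size_t)&((TYPE *)0)->MEMBER)
#endif

/* Hypothetical structure used only to exercise the macro. */
struct sample_node {
	int key;
	long value;
};

static size_t value_offset(void)
{
	return offsetof(struct sample_node, value);
}
```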
|
|
Fixes the following warning
samples/bpf/cookie_uid_helper_example.c: At top level:
samples/bpf/cookie_uid_helper_example.c:276:6: warning: no previous prototype for ‘finish’ [-Wmissing-prototypes]
void finish(int ret)
^~~~~~
HOSTLD samples/bpf/per_socket_stats_example
Signed-off-by: Alexander Alemayhu <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
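A -Wmissing-prototypes warning of this kind usually goes away once the function either gets a prototype or is given internal linkage. The sketch below assumes the second option; the body of finish() here is a simplified stand-in and not the sample's actual code.

```c
/* Records the status passed to finish(); -1 means "not called yet". */
static int exit_status = -1;

/*
 * Marking the function static limits it to this translation unit, so the
 * compiler no longer expects a previous prototype for it.
 */
static void finish(int ret)
{
	exit_status = ret;
}
```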
|
|
I was initially going to remove '-Wno-address-of-packed-member' because I
thought it was not supposed to be there but Daniel suggested using
'-Wno-unknown-warning-option'.
This silences several warnings similar to the one below
warning: unknown warning option '-Wno-address-of-packed-member' [-Wunknown-warning-option]
1 warning generated.
clang -nostdinc -isystem /usr/lib/gcc/x86_64-redhat-linux/6.3.1/include -I./arch/x86/include -I./arch/x86/include/generated/uapi -I./arch/x86/include/generated -I./include
-I./arch/x86/include/uapi -I./include/uapi -I./include/generated/uapi -include ./include/linux/kconfig.h \
-D__KERNEL__ -D__ASM_SYSREG_H -Wno-unused-value -Wno-pointer-sign \
-Wno-compare-distinct-pointer-types \
-Wno-gnu-variable-sized-type-not-at-end \
-Wno-address-of-packed-member -Wno-tautological-compare \
-O2 -emit-llvm -c samples/bpf/xdp_tx_iptunnel_kern.c -o -| llc -march=bpf -filetype=obj -o samples/bpf/xdp_tx_iptunnel_kern.o
$ clang --version
clang version 3.9.1 (tags/RELEASE_391/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Signed-off-by: Alexander Alemayhu <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Now that also the last in-tree user of the xdp_adjust_head bit has
been removed, we can remove the flag from struct bpf_prog altogether.
This, at the same time, also makes sure that any future driver for
XDP comes with bpf_xdp_adjust_head() support right away.
A rejection based on this flag would also mean that tail calls
couldn't be used with such driver as per c2002f983767 ("bpf: fix
checking xdp_adjust_head on tail calls") fix, thus let's not allow
for it in the first place.
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Function pci_find_ext_capability() may return 0, which is an invalid
address. In function qlcnic_sriov_virtid_fn(), its return value is used
without validation. This may result in invalid memory access bugs. This
patch fixes the bug.
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
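The validation described in this commit can be sketched in user-space C as follows. struct fake_dev and fake_find_sriov_cap() are hypothetical stand-ins for the PCI device and for pci_find_ext_capability(), and returning 0 when the capability is missing is an assumption made for the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for a PCI device with an optional SR-IOV capability. */
struct fake_dev {
	int sriov_pos;        /* config-space offset, 0 if capability absent */
	uint16_t vf_offset;   /* value that would be read at the capability */
	uint16_t vf_stride;
};

/* Mimics the "0 means not found" convention of pci_find_ext_capability(). */
static int fake_find_sriov_cap(const struct fake_dev *dev)
{
	return dev->sriov_pos;
}

static int virtid_fn(const struct fake_dev *dev, int devfn, int vf_id)
{
	int pos = fake_find_sriov_cap(dev);

	/* Do not read through an offset of 0: the capability is missing. */
	if (!pos)
		return 0;

	return (devfn + dev->vf_offset + dev->vf_stride * vf_id) & 0xff;
}
```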
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
Mellanox, mlx5 fixes 2017-04-22
This series contains some mlx5 fixes for net.
For your convenience, the series doesn't introduce any conflict with
the ongoing net-next pull request.
Please pull and let me know if there's any problem.
For -stable:
("net/mlx5: E-Switch, Correctly deal with inline mode on ConnectX-5") kernels >= 4.10
("net/mlx5e: Fix ETHTOOL_GRXCLSRLALL handling") kernels >= 4.8
("net/mlx5e: Fix small packet threshold") kernels >= 4.7
("net/mlx5: Fix driver load bad flow when having fw initializing timeout") kernels >= 4.4
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
In function pc300_pci_init_one(), on the ioremap error path, function
pc300_pci_remove_one() is called to free the allocated memory. However,
the path is not terminated, and the freed memory will be used later,
resulting in use-after-free bugs. This patch fixes the bug.
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
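The error-path pattern described in this commit can be sketched as below. This is a user-space illustration with hypothetical names (struct card, remove_one(), init_one()), and the error codes are assumptions; the point is only that the function returns immediately after the cleanup routine has freed the object.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical card structure; regs stands in for an ioremap()ed region. */
struct card {
	void *regs;
};

static void remove_one(struct card *c)
{
	free(c);
}

static int init_one(int map_ok, struct card **out)
{
	struct card *c = calloc(1, sizeof(*c));

	if (!c)
		return -ENOMEM;

	c->regs = map_ok ? (void *)c : NULL;	/* simulated mapping result */
	if (!c->regs) {
		remove_one(c);
		/* Terminate the path here: c has been freed and must not be used. */
		return -EFAULT;
	}

	*out = c;
	return 0;
}
```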
|
|
Function nlmsg_new() will return a NULL pointer if there is not enough
memory, and its return value should be checked before it is used.
However, in function tipc_nl_node_get_monitor(), the validation of the
return value of function nlmsg_new() is missed. This patch fixes the
bug.
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Function nla_nest_start() may return a NULL pointer on error. However,
in function lwtunnel_fill_encap(), the return value of nla_nest_start()
is not validated before it is used. This patch checks the return value
of nla_nest_start() against NULL.
Signed-off-by: Pan Bian <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
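A minimal sketch of the NULL check described in this commit, using a fixed-size buffer in place of a real netlink message. struct fake_msg and fake_nest_start() are hypothetical stand-ins for the skb and nla_nest_start(), and the -EMSGSIZE return value is an assumption.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical message buffer standing in for an skb. */
struct fake_msg {
	unsigned char buf[16];
	size_t len;
};

/* Reserve a 4-byte nest header, or return NULL when it does not fit. */
static unsigned char *fake_nest_start(struct fake_msg *m)
{
	unsigned char *hdr;

	if (sizeof(m->buf) - m->len < 4)
		return NULL;

	hdr = m->buf + m->len;
	m->len += 4;
	return hdr;
}

static int fill_encap(struct fake_msg *m)
{
	unsigned char *nest = fake_nest_start(m);

	/* Validate the return value before writing through it. */
	if (!nest)
		return -EMSGSIZE;

	nest[0] = 1;	/* placeholder for filling in the nested attribute */
	return 0;
}
```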
|
|
Jakub Kicinski says:
====================
nfp: DMA flags, adjust head and fixes
This series takes advantage of Alex's DMA_ATTR_SKIP_CPU_SYNC to make
XDP packet modifications "correct" from DMA API point of view. It
also allows us to parse the metadata before we run XDP at no additional
DMA sync cost. That way we can get rid of the metadata memcpy, and
remove the last upstream user of bpf_prog->xdp_adjust_head.
David's patch adds a way to read capabilities from the management
firmware.
There are also two net-next fixes. Patch 4 which fixes what seems to
be a result of a botched rebase on my part. Patch 5 corrects locking
when state of ethernet ports is being refreshed.
v3: move the sync from alloc func to the actual give to hw func
v2: sync rx buffers before giving them to the card (Alex)
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The code refreshing the eth port state was trying to update state
of all ports of the card. Unfortunately to safely walk the port
list we would have to hold the port lock, which we can't due to
lock ordering constraints against rtnl.
Make the per-port sync refresh and async refresh of all ports
completely separate routines.
Fixes: 172f638c93dd ("nfp: add port state refresh")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
XDP headroom should not be included in free list buffer size.
Fixes: 6fe0c3b43804 ("nfp: add support for xdp_adjust_head()")
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Retrieve identifying information from the NSP. For now it only
contains versions of firmware subcomponents.
Signed-off-by: David Brunecz <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Calling memcpy to shift metadata out of the way for XDP to run
seems like an overkill. The most common metadata contents are
8 bytes containing type and flow hash. Simply parse the metadata
before we run XDP.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
DMA unmap may destroy changes CPU made to the buffer. To make XDP
run correctly on non-x86 platforms we should use the
DMA_ATTR_SKIP_CPU_SYNC attribute.
Thanks to using the attribute we can now push the sync operation to the
common code path from XDP handler.
A little bit of variable name reshuffling is required to bring the
code back to readable state.
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Benjamin LaHaise says:
====================
flower: add MPLS matching support
This patch series adds support for parsing MPLS flows in the flow dissector
and the flower classifier. Each of the MPLS TTL, BOS, TC and Label fields
can be used for matching.
v2: incorporate style feedback, move #defines to linux/include/mpls.h
Note: this omits Jiri's request to remove tabs between the type and
field names in struct declarations. This would be inconsistent with
numerous other struct definitions.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support to the tc flower classifier to match based on fields in MPLS
labels (TTL, Bottom of Stack, TC field, Label).
Signed-off-by: Benjamin LaHaise <[email protected]>
Signed-off-by: Benjamin LaHaise <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Jiri Pirko <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Hadar Hen Zion <[email protected]>
Cc: Gao Feng <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add support for parsing MPLS flows to the flow dissector in preparation for
adding MPLS match support to cls_flower.
Signed-off-by: Benjamin LaHaise <[email protected]>
Signed-off-by: Benjamin LaHaise <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Jamal Hadi Salim <[email protected]>
Cc: Cong Wang <[email protected]>
Cc: Jiri Pirko <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Hadar Hen Zion <[email protected]>
Cc: Gao Feng <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Wei Wang says:
====================
net/tcp_fastopen: Fix for various TFO firewall issues
Currently there are still some firewall issues in the middlebox
which make the middlebox drop packets silently for TFO sockets.
This kind of issue is hard for the end client to detect.
This patch series tries to detect such issues in the kernel and disable
TFO temporarily.
More details about the issues and the fixes are included in the following
patches.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Christoph Paasch from Apple found another firewall issue for TFO:
After successful 3WHS using TFO, server and client start to exchange
data. Afterwards, a 10s idle time occurs on this connection. After that,
firewall starts to drop every packet on this connection.
The fix for this issue is to extend existing firewall blackhole detection
logic in tcp_write_timeout() by removing the mss check.
Signed-off-by: Wei Wang <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This counter records the number of times the firewall blackhole issue is
detected and active TFO is disabled.
Signed-off-by: Wei Wang <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Middlebox firewall issues can potentially cause server's data being
blackholed after a successful 3WHS using TFO. Following are the related
reports from Apple:
https://www.nanog.org/sites/default/files/Paasch_Network_Support.pdf
Slide 31 identifies an issue where the client ACK to the server's data
sent during a TFO'd handshake is dropped.
C ---> syn-data ---> S
C <--- syn/ack ----- S
C (accept & write)
C <---- data ------- S
C ----- ACK -> X S
[retry and timeout]
https://www.ietf.org/proceedings/94/slides/slides-94-tcpm-13.pdf
Slide 5 shows a similar situation that the server's data gets dropped
after 3WHS.
C ---- syn-data ---> S
C <--- syn/ack ----- S
C ---- ack --------> S
S (accept & write)
C? X <- data ------ S
[retry and timeout]
This is the worst failure b/c the client can not detect such behavior to
mitigate the situation (such as disabling TFO). Failing to proceed, the
application (e.g., SSL library) may simply timeout and retry with TFO
again, and the process repeats indefinitely.
The proposed solution is to disable active TFO globally under the
following circumstances:
1. client side TFO socket detects out of order FIN
2. client side TFO socket receives out of order RST
We disable active side TFO globally for 1hr at first. Then if it
happens again, we disable it for 2h, then 4h, 8h, ...
And we reset the timeout to 1hr if a client side TFO socket not opened
on loopback has successfully received data segments from the server.
And we examine this condition during close().
The rationale behind it is that when such firewall issue happens,
application running on the client should eventually close the socket as
it is not able to get the data it is expecting. Or application running
on the server should close the socket as it is not able to receive any
response from client.
In both cases, out of order FIN or RST will get received on the client
given that the firewall will not block them as no data are in those
frames.
And we want to disable active TFO globally as it helps if the middle box
is very close to the client and most of the connections are likely to
fail.
Also, add a debug sysctl:
tcp_fastopen_blackhole_detect_timeout_sec:
the initial timeout to use when firewall blackhole issue happens.
This can be set and read.
When setting it to 0, it means to disable the active disable logic.
Signed-off-by: Wei Wang <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
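The back-off schedule described in this commit (1h, then 2h, 4h, 8h, ..., reset after a successful data exchange, and disabled when the initial timeout is 0) can be sketched as below. The names struct tfo_state, tfo_disable_period() and the two event helpers are hypothetical, and the cap on the doubling is an assumption added to keep the shift bounded.

```c
#include <stdint.h>

/* Hypothetical global state: how many times in a row the blackhole was detected. */
struct tfo_state {
	unsigned int disable_times;
};

/* Seconds for which active TFO stays disabled: initial, 2x, 4x, ... */
static uint64_t tfo_disable_period(const struct tfo_state *s, uint32_t initial_sec)
{
	unsigned int shift;

	/* An initial timeout of 0 turns the disable logic off entirely. */
	if (s->disable_times == 0 || initial_sec == 0)
		return 0;

	shift = s->disable_times - 1;
	if (shift > 6)		/* assumed cap, not stated in the commit text */
		shift = 6;

	return (uint64_t)initial_sec << shift;
}

/* Out-of-order FIN or RST seen on a client-side TFO socket. */
static void tfo_blackhole_detected(struct tfo_state *s)
{
	s->disable_times++;
}

/* A non-loopback client-side TFO socket received data from the server. */
static void tfo_data_received(struct tfo_state *s)
{
	s->disable_times = 0;
}
```

With an initial timeout of 3600 seconds, three consecutive detections give periods of 3600, 7200 and 14400 seconds, and a successful data exchange returns the state to the initial timeout for the next detection.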
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2017-04-22
Sparse and compiler warnings fixes from Stephen Hemminger.
From Roi Dayan and Or Gerlitz, Add devlink and mlx5 support for controlling
E-Switch encapsulation mode, this knob will enable HW support for applying
encapsulation/decapsulation to VF traffic as part of SRIOV e-switch offloading.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
systemd-sysctl is triggering a suspicious RCU usage message when
net.ipv4.tcp_early_demux or net.ipv4.udp_early_demux is changed via
a sysctl config file:
[ 33.896184] ===============================
[ 33.899558] [ ERR: suspicious RCU usage. ]
[ 33.900624] 4.11.0-rc7+ #104 Not tainted
[ 33.901698] -------------------------------
[ 33.903059] /home/dsa/kernel-2.git/net/ipv4/sysctl_net_ipv4.c:305 suspicious rcu_dereference_check() usage!
[ 33.905724]
other info that might help us debug this:
[ 33.907656]
rcu_scheduler_active = 2, debug_locks = 0
[ 33.909288] 1 lock held by systemd-sysctl/143:
[ 33.910373] #0: (sb_writers#5){.+.+.+}, at: [<ffffffff8123a370>] file_start_write+0x45/0x48
[ 33.912407]
stack backtrace:
[ 33.914018] CPU: 0 PID: 143 Comm: systemd-sysctl Not tainted 4.11.0-rc7+ #104
[ 33.915631] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 33.917870] Call Trace:
[ 33.918431] dump_stack+0x81/0xb6
[ 33.919241] lockdep_rcu_suspicious+0x10f/0x118
[ 33.920263] proc_configure_early_demux+0x65/0x10a
[ 33.921391] proc_udp_early_demux+0x3a/0x41
Add RCU locking to proc_configure_early_demux().
Fixes: dddb64bcb3461 ("net: Add sysctl to toggle early demux for tcp and udp")
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When arp_notify is set to 1 for either a specific interface or for 'all'
interfaces, gratuitous arp requests are sent. Since ndisc_notify is the
ipv6 equivalent to arp_notify, it should follow the same semantics.
Commit 4a6e3c5def13 ("net: ipv6: send unsolicited NA on admin up") sends
the NA on admin up. The final piece is checking devconf_all->ndisc_notify
in addition to the per device setting. Add it.
Fixes: 5cb04436eef6 ("ipv6: add knob to send unsolicited ND on link-layer address change")
Signed-off-by: David Ahern <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
Johan Hedberg says:
====================
pull request: bluetooth-next 2017-04-22
Here are some more Bluetooth patches (and one 802.15.4 patch) in the
bluetooth-next tree targeting the 4.12 kernel. Most of them are pure
fixes.
Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Trivial fix to spelling mistake in dev_err message and rejoin
line.
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If skb_put_padto() fails then it frees the skb. I shifted that code
up a bit to make my error handling a little simpler.
Fixes: a0d2f20650e8 ("Renesas Ethernet AVB PTP clock driver")
Signed-off-by: Dan Carpenter <[email protected]>
Acked-by: Sergei Shtylyov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Use offset_in_page() macro instead of open-coding.
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
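For readers unfamiliar with the macro, the computation it replaces is a mask of the address with the page size minus one. The user-space sketch below assumes a 4096-byte page; SKETCH_PAGE_SIZE and offset_in_page_sketch() are hypothetical names, not the kernel's definitions.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the illustration */

/* Byte offset of an address within its page. */
static unsigned long offset_in_page_sketch(const void *p)
{
	return (unsigned long)((uintptr_t)p & (SKETCH_PAGE_SIZE - 1));
}
```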
|
|
Michael Chan says:
====================
bnxt_en: Updates for net-next.
Miscellaneous updates include passing DCBX RoCE VLAN priority to firmware,
checking one more new firmware flag before allowing DCBX to run on the host,
adding 100Gbps speed support, adding check to disallow speed settings on
Multi-host NICs, and a minor fix for reporting VF attributes.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
This change restricts the PF in multi-host mode from setting any port
level PHY configuration. The settings are controlled by firmware in
Multi-Host mode.
Signed-off-by: Deepak Khungar <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Check the additional flag in bnxt_hwrm_func_qcfg() before allowing
DCBX to be done in host mode.
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Added support for 100G link speed reporting for Broadcom BCM57454
ASIC in ethtool command.
Signed-off-by: Deepak Khungar <[email protected]>
Signed-off-by: Ray Jui <[email protected]>
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The .ndo_get_vf_config() is returning the wrong qos attribute. Fix
the code that checks and reports the qos and spoofchk attributes. The
BNXT_VF_QOS and BNXT_VF_LINK_UP flags should not be set by default
during init time.
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When the driver gets the RoCE app priority set/delete call through DCBNL,
the driver will send the information to the firmware to set up the
priority VLAN tag for RDMA traffic.
[ New version using the common ETH_P_IBOE constant in if_ether.h ]
Signed-off-by: Michael Chan <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add a new optional conntrack action attribute OVS_CT_ATTR_EVENTMASK,
which can be used in conjunction with the commit flag
(OVS_CT_ATTR_COMMIT) to set the mask of bits specifying which
conntrack events (IPCT_*) should be delivered via the Netfilter
netlink multicast groups. Default behavior depends on the system
configuration, but typically a lot of events are delivered. This can be
very chatty for the NFNLGRP_CONNTRACK_UPDATE group, even if only some
types of events are of interest.
Netfilter core init_conntrack() adds the event cache extension, so we
only need to set the ctmask value. However, if the system is
configured without support for events, the setting will be skipped due
to extension not being found.
Signed-off-by: Jarno Rajahalme <[email protected]>
Reviewed-by: Greg Rose <[email protected]>
Acked-by: Joe Stringer <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fix typo in a comment.
Signed-off-by: Jarno Rajahalme <[email protected]>
Acked-by: Greg Rose <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Otherwise, UDP checksum offloads could corrupt ESP packets by attempting
to calculate UDP checksum when this inner UDP packet is already protected
by IPsec.
One way to reproduce this bug is to have a VM with virtio_net driver (UFO
set to ON in the guest VM); and then encapsulate all guest's Ethernet
frames in Geneve; and then further encrypt Geneve with IPsec. In this
case following symptoms are observed:
1. If using ixgbe NIC, then it will complain with following error message:
ixgbe 0000:01:00.1: partial checksum but l4 proto=32!
2. Receiving IPsec stack will drop all the corrupted ESP packets and
increase XfrmInStateProtoError counter in /proc/net/xfrm_stat.
3. iperf UDP test from the VM with packet sizes above MTU will not work at
all.
4. iperf TCP test from the VM will get ridiculously low performance because.
Signed-off-by: Ansis Atteka <[email protected]>
Co-authored-by: Steffen Klassert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
While this may appear as a humdrum one line change, it's actually quite
important. An sk_buff stores data in three places:
1. A linear chunk of allocated memory in skb->data. This is the easiest
one to work with, but it precludes using scatterdata since the memory
must be linear.
2. The array skb_shinfo(skb)->frags, which is of maximum length
MAX_SKB_FRAGS. This is nice for scattergather, since these fragments
can point to different pages.
3. skb_shinfo(skb)->frag_list, which is a pointer to another sk_buff,
which in turn can have data in either (1) or (2).
The first two are rather easy to deal with, since they're of a fixed
maximum length, while the third one is not, since there can be
potentially limitless chains of fragments. Fortunately dealing with
frag_list is opt-in for drivers, so drivers don't actually have to deal
with this mess. For whatever reason, macsec decided it wanted pain, and
so it explicitly specified NETIF_F_FRAGLIST.
Because dealing with (1), (2), and (3) is insane, most users of sk_buff
doing any sort of crypto or paging operation call a convenient function
called skb_to_sgvec (which happens to be recursive if (3) is in use!).
This takes a sk_buff as input, and writes into its output pointer an
array of scattergather list items. Sometimes people like to declare a
fixed size scattergather list on the stack; other times people like to
allocate a fixed size scattergather list on the heap. However, if you're
doing it in a fixed-size fashion, you really shouldn't be using
NETIF_F_FRAGLIST too (unless you're also ensuring the sk_buff and its
frag_list children aren't shared and then you check the number of
fragments in total required.)
Macsec specifically does this:
size += sizeof(struct scatterlist) * (MAX_SKB_FRAGS + 1);
tmp = kmalloc(size, GFP_ATOMIC);
*sg = (struct scatterlist *)(tmp + sg_offset);
...
sg_init_table(sg, MAX_SKB_FRAGS + 1);
skb_to_sgvec(skb, sg, 0, skb->len);
Specifying MAX_SKB_FRAGS + 1 is the right answer usually, but not if you're
using NETIF_F_FRAGLIST, in which case the call to skb_to_sgvec will
overflow the heap, and disaster ensues.
Signed-off-by: Jason A. Donenfeld <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
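To make the overflow concrete, the sketch below counts how many scatterlist entries a buffer with a frag_list would need and compares that with a fixed table of MAX_SKB_FRAGS + 1 entries. struct fake_skb, the helper names and the value 17 for SKETCH_MAX_FRAGS are assumptions for illustration; this is not skb_to_sgvec() itself.

```c
#include <stddef.h>

#define SKETCH_MAX_FRAGS 17	/* illustrative value, stands in for MAX_SKB_FRAGS */

/* Hypothetical simplified buffer with the three storage places from the text. */
struct fake_skb {
	size_t head_len;		/* (1) bytes in the linear area */
	unsigned int nr_frags;		/* (2) page fragments, at most SKETCH_MAX_FRAGS */
	struct fake_skb *frag_list;	/* (3) first child buffer */
	struct fake_skb *next;		/* next sibling on the parent's frag_list */
};

/* Entries needed: one for the linear area, one per fragment, plus all children. */
static unsigned int sg_entries_needed(const struct fake_skb *skb)
{
	unsigned int n = (skb->head_len ? 1 : 0) + skb->nr_frags;
	const struct fake_skb *child;

	for (child = skb->frag_list; child; child = child->next)
		n += sg_entries_needed(child);

	return n;
}

/* Non-zero if a table of SKETCH_MAX_FRAGS + 1 entries is large enough. */
static int fits_fixed_table(const struct fake_skb *skb)
{
	return sg_entries_needed(skb) <= SKETCH_MAX_FRAGS + 1;
}
```

A single buffer with a linear area and the maximum number of fragments fits exactly, but adding one child of the same shape on frag_list doubles the requirement and no longer fits the fixed-size table.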
|
|
Nathan Fontenot says:
====================
ibmvnic: Additional updates and bug fixes
This set of patches is an additional set of updates and bug fixes to
the ibmvnic driver which applies on top of the previous set of updates
sent out on 4/19.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
When an error is encountered during transmit we need to free the
skb instead of returning TX_BUSY.
Signed-off-by: Thomas Falcon <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Validate that the napi structs exist before trying to disable them
at driver close.
Signed-off-by: Nathan Fontenot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Create a common routine for setting the link state for the vnic adapter.
This update moves the sending of the crq and waiting for the link state
response to a common place. The new routine also adds handling of
resending the crq in cases of getting a partial success response.
Signed-off-by: Nathan Fontenot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We should be initializing the stats token in the same place we
initialize the other resources for the driver.
Signed-off-by: Nathan Fontenot <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|