Age | Commit message (Collapse) | Author | Files | Lines |
|
Add the new stats API, kernel config parameter, and stats structure
information to the page_pool documentation.
Signed-off-by: Joe Damato <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Adds a function page_pool_get_stats which can be used by drivers to obtain
stats for a specified page_pool.
Signed-off-by: Joe Damato <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Ilias Apalodimas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add per-cpu stats tracking page pool recycling events:
- cached: recycling placed page in the page pool cache
- cache_full: page pool cache was full
- ring: page placed into the ptr ring
- ring_full: page released from page pool because the ptr ring was full
- released_refcnt: page released (and not recycled) because refcnt > 1
Signed-off-by: Joe Damato <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Ilias Apalodimas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Add per-pool statistics counters for the allocation path of a page pool.
These stats are incremented in softirq context, so no locking or per-cpu
variables are needed.
This code is disabled by default and a kernel config option is provided for
users who wish to enable them.
The statistics added are:
- fast: successful fast path allocations
- slow: slow path order-0 allocations
- slow_high_order: slow path high order allocations
- empty: ptr ring is empty, so a slow path allocation was forced.
- refill: an allocation which triggered a refill of the cache
- waive: pages obtained from the ptr ring that cannot be added to
the cache due to a NUMA mismatch.
Signed-off-by: Joe Damato <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Reviewed-by: Ilias Apalodimas <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
If recv_actor() returns an incorrect value, tcp_read_sock()
might loop forever.
Instead, issue a one time warning and make sure to make progress.
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: John Fastabend <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Currently, sk_psock_verdict_recv() returns skb->len
This is problematic because tcp_read_sock() might have
passed orig_len < skb->len, due to the presence of TCP urgent data.
This causes an infinite loop from tcp_read_sock()
Followup patch will make tcp_read_sock() more robust vs bad actors.
Fixes: ef5659280eb1 ("bpf, sockmap: Allow skipping sk_skb parser program")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Acked-by: John Fastabend <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
Tested-by: Jakub Sitnicki <[email protected]>
Acked-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Last tcp_write_queue_head() use was removed in commit
114f39feab36 ("tcp: restore autocorking"), so remove it.
Signed-off-by: Tao Chen <[email protected]>
Link: https://lore.kernel.org/r/SYZP282MB33317DEE1253B37C0F57231E86029@SYZP282MB3331.AUSP282.PROD.OUTLOOK.COM
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Network drivers such as igb or igc call eth_get_headlen() to determine the
header length for their to be constructed skbs in receive path.
When running HSR on top of these drivers, it results in triggering BUG_ON() in
skb_pull(). The reason is the skb headlen is not sufficient for HSR to work
correctly. skb_pull() notices that.
For instance, eth_get_headlen() returns 14 bytes for TCP traffic over HSR which
is not correct. The problem is, the flow dissection code does not take HSR into
account. Therefore, add support for it.
Reported-by: Anthony Harivel <[email protected]>
Signed-off-by: Kurt Kanzenbach <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add support for direct RMII MAC mode. This allows hardware with CPU port
connected in direct 100M fixed link to work properly.
Signed-off-by: Baruch Siach <[email protected]>
Link: https://lore.kernel.org/r/a962d1ccbeec42daa10dd8aff0e66e31f0faf1eb.1646050203.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When the given cmode has no serdes, mv88e6xxx_serdes_get_lane() returns
-NODEV. Earlier in the same function the code skips serdes handing in
this case. Do the same after cmode set.
Signed-off-by: Baruch Siach <[email protected]>
Link: https://lore.kernel.org/r/cd95cf3422ae8daf297a01fa9ec3931b203cdf45.1646050203.git.baruch@tkos.co.il
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Eliminate the following coccicheck warning:
./net/openvswitch/flow.c:379:2-3: Unneeded semicolon
Reported-by: Abaci Robot <[email protected]>
Signed-off-by: Yang Li <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add extack message to return exact message to user when adding invalid
filter with conflict flags for TC action.
In previous implement we just return EINVAL which is confusing for user.
Signed-off-by: Baowen Zheng <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In order to function, the IPA driver very clearly requires the
interconnect framework to be enabled in the kernel configuration.
State that dependency in the Kconfig file.
This became a problem when CONFIG_COMPILE_TEST support was added.
Non-Qualcomm platforms won't necessarily enable CONFIG_INTERCONNECT.
Reported-by: kernel test robot <[email protected]>
Fixes: 38a4066f593c5 ("net: ipa: support COMPILE_TEST")
Signed-off-by: Alex Elder <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
40GbE Intel Wired LAN Driver Updates 2022-03-01
This series contains updates to iavf driver only.
Mateusz adds support for interrupt moderation for 50G and 100G speeds
as well as support for the driver to specify a request as its primary
MAC address. He also refactors VLAN V2 capability exchange into more
generic extended capabilities to ease the addition of future
capabilities. Finally, he corrects the incorrect return of iavf_status
values and removes non-inclusive language.
Minghao Chi removes unneeded variables, instead returning values
directly.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: Remove non-inclusive language
iavf: Fix incorrect use of assigning iavf_status to int
iavf: stop leaking iavf_status as "errno" values
iavf: remove redundant ret variable
iavf: Add usage of new virtchnl format to set default MAC
iavf: refactor processing of VLAN V2 capability message
iavf: Add support for 50G/100G in AIM algorithm
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use ida_alloc_xxx()/ida_free() instead to
ida_simple_get()/ida_simple_remove().
The latter is deprecated and more verbose.
Signed-off-by: Christophe JAILLET <[email protected]>
Signed-off-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The truesize for a UDP GRO packet is added by main skb and skbs in main
skb's frag_list:
skb_gro_receive_list
p->truesize += skb->truesize;
The commit 53475c5dd856 ("net: fix use-after-free when UDP GRO with
shared fraglist") introduced a truesize increase for frag_list skbs.
When uncloning skb, it will call pskb_expand_head and trusesize for
frag_list skbs may increase. This can occur when allocators uses
__netdev_alloc_skb and not jump into __alloc_skb. This flow does not
use ksize(len) to calculate truesize while pskb_expand_head uses.
skb_segment_list
err = skb_unclone(nskb, GFP_ATOMIC);
pskb_expand_head
if (!skb->sk || skb->destructor == sock_edemux)
skb->truesize += size - osize;
If we uses increased truesize adding as delta_truesize, it will be
larger than before and even larger than previous total truesize value
if skbs in frag_list are abundant. The main skb truesize will become
smaller and even a minus value or a huge value for an unsigned int
parameter. Then the following memory check will drop this abnormal skb.
To avoid this error we should use the original truesize to segment the
main skb.
Fixes: 53475c5dd856 ("net: fix use-after-free when UDP GRO with shared fraglist")
Signed-off-by: lena wang <[email protected]>
Acked-by: Paolo Abeni <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Convert sfp to use %pe for printing error codes, which can print them
as errno symbols rather than numbers.
Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Convert phylink to use %pe for printing error codes, which can print
them as errno symbols rather than numbers.
Signed-off-by: Russell King (Oracle) <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In patch [1], tun_msg_ctl was added to allow pass batched xdp buffers to
tun_sendmsg. Although we donot use msg_controllen in this path, we should
check msg_controllen to make sure the caller pass a valid msg_ctl.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=fe8dd45bb7556246c6b76277b1ba4296c91c2505
Reported-by: Eric Dumazet <[email protected]>
Suggested-by: Jason Wang <[email protected]>
Signed-off-by: Harold Huang <[email protected]>
Acked-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- Remove redundant 'flush_workqueue()' calls, by Christophe JAILLET
- Migrate to linux/container_of.h, by Sven Eckelmann
- Demote batadv-on-batadv skip error message, by Sven Eckelmann
* tag 'batadv-next-pullrequest-20220302' of git://git.open-mesh.org/linux-merge:
batman-adv: Demote batadv-on-batadv skip error message
batman-adv: Migrate to linux/container_of.h
batman-adv: Remove redundant 'flush_workqueue()' calls
batman-adv: Start new development cycle
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- Remove redundant iflink requests, by Sven Eckelmann (2 patches)
- Don't expect inter-netns unique iflink indices, by Sven Eckelmann
* tag 'batadv-net-pullrequest-20220302' of git://git.open-mesh.org/linux-merge:
batman-adv: Don't expect inter-netns unique iflink indices
batman-adv: Request iflink once in batadv_get_real_netdevice
batman-adv: Request iflink once in batadv-on-batadv check
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
Three more fixes:
- fix build issue in iwlwifi, now that I understood
what's going on there
- propagate error in iwlwifi/mvm to userspace so it
can figure out what's happening
- fix channel switch related updates in P2P-client
in cfg80211
* tag 'wireless-for-net-2022-03-02' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
iwlwifi: mvm: return value for request_ownership
nl80211: Update bss channel on channel switch for P2P_CLIENT
iwlwifi: fix build error for IWLMEI
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull ucounts fix from Eric Biederman:
"Etienne Dechamps recently found a regression caused by enforcing
RLIMIT_NPROC for root where the rlimit was not previously enforced.
Michal Koutný had previously pointed out the inconsistency in
enforcing the RLIMIT_NPROC that had been on the root owned process
after the root user creates a user namespace.
Which makes the fix for the regression simply removing the
inconsistency"
* 'ucount-rlimit-fixes-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
ucounts: Fix systemd LimitNPROC with private users regression
|
|
Pull ARM fixes from Russell King:
- Fix kgdb breakpoint for Thumb2
- Fix dependency for BITREVERSE kconfig
- Fix nommu early_params and __setup returns
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9182/1: mmu: fix returns from early_param() and __setup() functions
ARM: 9178/1: fix unmet dependency on BITREVERSE for HAVE_ARCH_BITREVERSE
ARM: Fix kgdb breakpoint for Thumb2
|
|
While it might work, the current approach is fragile in a few ways:
- whenever members in the structure are shuffled, the pointer will be wrong
- the resource freeing may include more than covered by kfree()
Fix this by using charlcd_free() call instead of kfree().
Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>
|
|
Once allocated the struct lcd2s_data is never freed.
Fix the memory leak by switching to devm_kzalloc().
Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Signed-off-by: Miguel Ojeda <[email protected]>
|
|
It seems that the lcd2s_redefine_char() has never been properly
tested. The buffer is filled by DEF_CUSTOM_CHAR command followed
by the character number (from 0 to 7), but immediately after that
these bytes are rewritten by the decoded hex stream.
Fix the index to fill the buffer after the command and number.
Fixes: 8c9108d014c5 ("auxdisplay: add a driver for lcd2s character display")
Cc: Lars Poeschel <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
[fixed typo in commit message]
Signed-off-by: Miguel Ojeda <[email protected]>
|
|
Propagate the value to the user space so it can understand
if the operation failed or not.
Fixes: bfcfdb59b669 ("iwlwifi: mvm: add vendor commands needed for iwlmei")
Signed-off-by: Emmanuel Grumbach <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
The wdev channel information is updated post channel switch only for
the station mode and not for the other modes. Due to this, the P2P client
still points to the old value though it moved to the new channel
when the channel change is induced from the P2P GO.
Update the bss channel after CSA channel switch completion for P2P client
interface as well.
Signed-off-by: Sreeramya Soratkal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
When CONFIG_IWLWIFI=m and CONFIG_IWLMEI=y, the kernel build system
must be told to build the iwlwifi/ subdirectory for both IWLWIFI and
IWLMEI so that builds for both =y and =m are done.
This resolves an undefined reference build error:
ERROR: modpost: "iwl_mei_is_connected" [drivers/net/wireless/intel/iwlwifi/iwlwifi.ko] undefined!
Fixes: 977df8bd5844 ("wlwifi: work around reverse dependency on MEI")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Luca Coelho <[email protected]>
Cc: [email protected]
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Song Liu says:
====================
Changes v1 => v2:
1. Rephrase comments in 2/2. (Yonghong)
Two fixes for bpf_prog_pack.
====================
Signed-off-by: Alexei Starovoitov <[email protected]>
|
|
On do_jit failure path, the header is freed by bpf_jit_binary_pack_free.
While bpf_jit_binary_pack_free doesn't require proper ro_header->size,
bpf_prog_pack_free still uses it. Set header->size in bpf_int_jit_compile
before calling bpf_jit_binary_pack_free.
Fixes: 1022a5498f6f ("bpf, x86_64: Use bpf_jit_binary_pack_alloc")
Fixes: 33c9805860e5 ("bpf: Introduce bpf_jit_binary_pack_[alloc|finalize|free]")
Reported-by: Kui-Feng Lee <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
kernel test robot reported kernel BUG like:
[ 44.587744][ T1] kernel BUG at arch/x86/mm/physaddr.c:76!
[ 44.590151][ T1] __vmalloc_area_node (mm/vmalloc.c:622 mm/vmalloc.c:2995)
[ 44.590151][ T1] __vmalloc_node_range (mm/vmalloc.c:3108)
[ 44.590151][ T1] __vmalloc_node (mm/vmalloc.c:3157)
which is triggered with HAVE_ARCH_HUGE_VMALLOC on 32-bit x86. Since BPF
only uses HAVE_ARCH_HUGE_VMALLOC for x86_64, turn it off for 32-bit x86.
Fixes: fac54e2bfb5b ("x86/Kconfig: Select HAVE_ARCH_HUGE_VMALLOC with HAVE_ARCH_HUGE_VMAP")
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Song Liu <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Yonghong Song <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs
Pull erofs fix from Gao Xiang:
"A one-line patch to fix the new ztailpacking feature on > 4GiB
filesystems because z_idataoff can get trimmed improperly.
ztailpacking is still a brand new EXPERIMENTAL feature, but it'd be
better to fix the issue as soon as possible to avoid unnecessary
backporting.
Summary:
- Fix ztailpacking z_idataoff getting trimmed on > 4GiB filesystems"
* tag 'erofs-for-5.17-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
erofs: fix ztailpacking on > 4GiB filesystems
|
|
Pull NTB fixes from Jon Mason:
"Bug fixes for sparse warning, intel port config offset, and a new
mailing list"
* tag 'ntb-5.17-bugfixes' of git://github.com/jonmason/ntb:
MAINTAINERS: update mailing list address for NTB subsystem
ntb: intel: fix port config status offset for SPR
NTB/msi: Use struct_size() helper in devm_kzalloc()
|
|
In ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime."), the
ns adjustment was written to the FPGA register, so the clock could
accurately perform adjustments.
However, the adjtime() call passes in a s64, while the clock adjustment
registers use a s32. When trying to perform adjustments with a large
value (37 sec), things fail.
Examine the incoming delta, and if larger than 1 sec, use the original
(coarse) adjustment method. If smaller than 1 sec, then allow the
FPGA to fold in the changes over a 1 second window.
Fixes: 6d59d4fa1789 ("ptp: ocp: Have FPGA fold in ns adjustment for adjtime.")
Signed-off-by: Jonathan Lemon <[email protected]>
Acked-by: Richard Cochran <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
add missing ")" which caused by previous commit.
Fixes: 61c4fb9c4d09 ("net: hamradio: use time_is_after_jiffies() instead of open coding it")
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Wang Qing <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
z_idataoff here is an absolute physical offset, so it should use
erofs_off_t (64 bits at least). Otherwise, it'll get trimmed and
cause the decompresion failure.
Link: https://lore.kernel.org/r/[email protected]
Fixes: ab92184ff8f1 ("erofs: add on-disk compressed tail-packing inline support")
Reviewed-by: Yue Hu <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Gao Xiang <[email protected]>
|
|
The ifindex doesn't have to be unique for multiple network namespaces on
the same machine.
$ ip netns add test1
$ ip -net test1 link add dummy1 type dummy
$ ip netns add test2
$ ip -net test2 link add dummy2 type dummy
$ ip -net test1 link show dev dummy1
6: dummy1: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 96:81:55:1e:dd:85 brd ff:ff:ff:ff:ff:ff
$ ip -net test2 link show dev dummy2
6: dummy2: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
link/ether 5a:3c:af:35:07:c3 brd ff:ff:ff:ff:ff:ff
But the batman-adv code to walk through the various layers of virtual
interfaces uses this assumption because dev_get_iflink handles it
internally and doesn't return the actual netns of the iflink. And
dev_get_iflink only documents the situation where ifindex == iflink for
physical devices.
But only checking for dev->netdev_ops->ndo_get_iflink is also not an option
because ipoib_get_iflink implements it even when it sometimes returns an
iflink != ifindex and sometimes iflink == ifindex. The caller must
therefore make sure itself to check both netns and iflink + ifindex for
equality. Only when they are equal, a "physical" interface was detected
which should stop the traversal. On the other hand, vxcan_get_iflink can
also return 0 in case there was currently no valid peer. In this case, it
is still necessary to stop.
Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Reported-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>
|
|
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_get_real_netdevice. And since some of the
ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.
Fixes: 5ed4a460a1d3 ("batman-adv: additional checks for virtual interfaces on top of WiFi")
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>
|
|
There is no need to call dev_get_iflink multiple times for the same
net_device in batadv_is_on_batman_iface. And since some of the
.ndo_get_iflink callbacks are dynamic (for example via RCUs like in
vxcan_get_iflink), it could easily happen that the returned values are not
stable. The pre-checks before __dev_get_by_index are then of course bogus.
Fixes: b7eddd0b3950 ("batman-adv: prevent using any virtual device created on batman-adv as hard-interface")
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>
|
|
The error message "Cannot find parent device" was shown for users of
macvtap (on batadv devices) whenever the macvtap was moved to a different
netns. This happens because macvtap doesn't provide an implementation for
rtnl_link_ops->get_link_net.
The situation for which this message is printed is actually not an error
but just a warning that the optional sanity check was skipped. So demote
the message from error to warning and adjust the text to better explain
what happened.
Reported-by: Leonardo Mörlein <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>
|
|
The commit d2a8ebbf8192 ("kernel.h: split out container_of() and
typeof_member() macros") introduced a new header for the container_of
related macros from (previously) linux/kernel.h.
Signed-off-by: Sven Eckelmann <[email protected]>
Signed-off-by: Simon Wunderlich <[email protected]>
|
|
Daniel Braunwarth says:
====================
if_ether.h: add industrial fieldbus Ethertypes
This set of patches adds the Ethertypes for PROFINET and EtherCAT.
The defines should be used by iproute2 to extend the list of available link
layer protocols.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add the Ethertype for EtherCAT protocol.
Signed-off-by: Daniel Braunwarth <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add the Ethertype for PROFINET protocol.
Signed-off-by: Daniel Braunwarth <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process
which initiated the protocol change exits the kernel processing while
still holding the rtnl_mutex. So any other process attempting to lock
the rtnl_mutex would deadlock after such event.
The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed
by the blamed commit, introducing this regression. We must still call
rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old
protocol. The latter is due to the limiting design of notifier chains
for cross-chip operations, which don't have a built-in error recovery
mechanism - we should look into using notifier_call_chain_robust for that.
Fixes: dc452a471dba ("net: dsa: introduce tagger-owned storage for private and shared data")
Signed-off-by: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Assign rtnl_link_ops->get_link_net() callback so that IFLA_LINK_NETNSID is
added to rtnetlink messages. This fixes iproute2 which otherwise resolved
the link interface to an interface in the wrong namespace.
Test commands:
ip netns add nst
ip link add dummy0 type dummy
ip link add link macvtap0 link dummy0 type macvtap
ip link set macvtap0 netns nst
ip -netns nst link show macvtap0
Before:
10: macvtap0@gre0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
link/ether 5e:8f:ae:1d:60:50 brd ff:ff:ff:ff:ff:ff
After:
10: macvtap0@if2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 500
link/ether 5e:8f:ae:1d:60:50 brd ff:ff:ff:ff:ff:ff link-netnsid 0
Reported-by: Leonardo Mörlein <[email protected]>
Signed-off-by: Sven Eckelmann <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Fix the following coccicheck warning:
./drivers/net/ethernet/netronome/nfp/flower/qos_conf.c:750:7-55: WARNING
avoid newline at end of message in NL_SET_ERR_MSG_MOD
Signed-off-by: Wan Jiabing <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In tun, NAPI is supported and we can also use NAPI in the path of
batched XDP buffs to accelerate packet processing. What is more, after
we use NAPI, GRO is also supported. The iperf shows that the throughput of
single stream could be improved from 4.5Gbps to 9.2Gbps. Additionally, 9.2
Gbps nearly reachs the line speed of the phy nic and there is still about
15% idle cpu core remaining on the vhost thread.
Test topology:
[iperf server]<--->tap<--->dpdk testpmd<--->phy nic<--->[iperf client]
Iperf stream:
iperf3 -c 10.0.0.2 -i 1 -t 10
Before:
...
[ 5] 5.00-6.00 sec 558 MBytes 4.68 Gbits/sec 0 1.50 MBytes
[ 5] 6.00-7.00 sec 556 MBytes 4.67 Gbits/sec 1 1.35 MBytes
[ 5] 7.00-8.00 sec 556 MBytes 4.67 Gbits/sec 2 1.18 MBytes
[ 5] 8.00-9.00 sec 559 MBytes 4.69 Gbits/sec 0 1.48 MBytes
[ 5] 9.00-10.00 sec 556 MBytes 4.67 Gbits/sec 1 1.33 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 5.39 GBytes 4.63 Gbits/sec 72 sender
[ 5] 0.00-10.04 sec 5.39 GBytes 4.61 Gbits/sec receiver
After:
...
[ 5] 5.00-6.00 sec 1.07 GBytes 9.19 Gbits/sec 0 1.55 MBytes
[ 5] 6.00-7.00 sec 1.08 GBytes 9.30 Gbits/sec 0 1.63 MBytes
[ 5] 7.00-8.00 sec 1.08 GBytes 9.25 Gbits/sec 0 1.72 MBytes
[ 5] 8.00-9.00 sec 1.08 GBytes 9.25 Gbits/sec 77 1.31 MBytes
[ 5] 9.00-10.00 sec 1.08 GBytes 9.24 Gbits/sec 0 1.48 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 10.8 GBytes 9.28 Gbits/sec 166 sender
[ 5] 0.00-10.04 sec 10.8 GBytes 9.24 Gbits/sec receiver
Reported-at: https://lore.kernel.org/all/CACGkMEvTLG0Ayg+TtbN4q4pPW-ycgCCs3sC3-TF8cuRTf7Pp1A@mail.gmail.com
Signed-off-by: Harold Huang <[email protected]>
Acked-by: Jason Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|