aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2024-05-10net: dsa: microchip: Fix spellig mistake "configur" -> "configure"Colin Ian King1-1/+1
There is a spelling mistake in a dev_err message. Fix it. Signed-off-by: Colin Ian King <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10af_unix: Add dead flag to struct scm_fp_list.Kuniyuki Iwashima3-4/+12
Commit 1af2dface5d2 ("af_unix: Don't access successor in unix_del_edges() during GC.") fixed use-after-free by avoid accessing edge->successor while GC is in progress. However, there could be a small race window where another process could call unix_del_edges() while gc_in_progress is true and __skb_queue_purge() is on the way. So, we need another marker for struct scm_fp_list which indicates if the skb is garbage-collected. This patch adds dead flag in struct scm_fp_list and set it true before calling __skb_queue_purge(). Fixes: 1af2dface5d2 ("af_unix: Don't access successor in unix_del_edges() during GC.") Signed-off-by: Kuniyuki Iwashima <[email protected]> Acked-by: Paolo Abeni <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10net: ethernet: adi: adin1110: Replace linux/gpio.h by proper oneAndy Shevchenko1-1/+1
linux/gpio.h is deprecated and subject to remove. The driver doesn't use it directly, replace it with what is really being used. Signed-off-by: Andy Shevchenko <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10octeontx2-pf: Reuse Transmit queue/Send queue index of HTB classHariprasad Kelam1-1/+79
Real number of Transmit queues are incremented when user enables HTB class and vice versa. Depending on SKB priority driver returns transmit queue (Txq). Transmit queues and Send queues are one-to-one mapped. In few scenarios, Driver is returning transmit queue value which is greater than real number of transmit queue and Stack detects this as error and overwrites transmit queue value. For example user has added two classes and real number of queues are incremented accordingly - tc class add dev eth1 parent 1: classid 1:1 htb rate 100Mbit ceil 100Mbit prio 1 quantum 1024 - tc class add dev eth1 parent 1: classid 1:2 htb rate 100Mbit ceil 200Mbit prio 7 quantum 1024 now if user deletes the class with id 1:1, driver decrements the real number of queues - tc class del dev eth1 classid 1:1 But for the class with id 1:2, driver is returning transmit queue value which is higher than real number of transmit queue leading to below error eth1 selects TX queue x, but real number of TX queues is x This patch solves the problem by assigning deleted class transmit queue/send queue to active class. Signed-off-by: Hariprasad Kelam <[email protected]> Reviewed-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10Merge branch 'gve-minor-cleanups'Jakub Kicinski2-27/+19
Simon Horman says: ==================== gve: Minor cleanups This short patchset provides two minor cleanups for the gve driver. These were found by tooling as mentioned in each patch, and otherwise by inspection. No change in run time behaviour is intended. Each patch is compile tested only. v1: https://lore.kernel.org/r/[email protected] ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10gve: Use ethtool_sprintf/puts() to fill stats stringsSimon Horman1-25/+17
Make use of standard helpers to simplify filling in stats strings. The first two ethtool_puts() changes address the following fortification warnings flagged by W=1 builds with clang-18. (The last ethtool_puts change does not because the warning relates to writing beyond the first element of an array, and gve_gstrings_priv_flags only has one element.) .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 562 | __read_overflow2_field(q_size_field, size); | ^ .../fortify-string.h:562:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] Likewise, the same changes resolve the same problems flagged by Smatch. .../gve_ethtool.c:100 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_main_stats' too small (32 vs 576) .../gve_ethtool.c:120 gve_get_strings() error: __builtin_memcpy() '*gve_gstrings_adminq_stats' too small (32 vs 512) Compile tested only. Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Larysa Zaremba <[email protected]> Signed-off-by: Simon Horman <[email protected]> Acked-by: Justin Stitt <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10gve: Avoid unnecessary use of comma operatorSimon Horman1-2/+2
Although it does not seem to have any untoward side-effects, the use of ';' to separate to assignments seems more appropriate than ','. Flagged by clang-18 -Wcomma No functional change intended. Compile tested only. Reviewed-by: Shailend Chand <[email protected]> Reviewed-by: Larysa Zaremba <[email protected]> Signed-off-by: Simon Horman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10selftests: net: increase the delay for relative cmsg_time.sh testJakub Kicinski2-16/+23
Slow machines can delay scheduling of the packets for milliseconds. Increase the delay to 8ms if KSFT_MACHINE_SLOW. Try to limit the variability by moving setsockopts earlier (before we read time). This fixes the "TXTIME rel" failures on debug kernels, like: Case ICMPv4 - TXTIME rel returned '', expected 'OK' Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10selftests: net: fix timestamp not arriving in cmsg_time.shJakub Kicinski1-5/+15
On slow machines the SND timestamp sometimes doesn't arrive before we quit. The test only waits as long as the packet delay, so it's easy for a race condition to happen. Double the wait but do a bit of polling, once the SND timestamp arrives there's no point to wait any longer. This fixes the "TXTIME abs" failures on debug kernels, like: Case ICMPv4 - TXTIME abs returned '', expected 'OK' Reviewed-by: Willem de Bruijn <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10virtio_net: Fix memory leak in virtnet_rx_mod_workDaniel Jurgens1-2/+1
The pointer delcaration was missing the __free(kfree). Fixes: ff7c7d9f5261 ("virtio_net: Remove command data from control_buf") Reported-by: Jens Axboe <[email protected]> Closes: https://lore.kernel.org/netdev/[email protected]/ Signed-off-by: Daniel Jurgens <[email protected]> Tested-by: Jens Axboe <[email protected]> Reviewed-by: Xuan Zhuo <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10bnxt_en: silence clang build warningVadim Fedorenko1-1/+1
Clang build brings a warning: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:133:12: warning: comparison of distinct pointer types ('typeof (tmo_us) *' (aka 'unsigned int *') and 'typeof (65535) *' (aka 'int *')) [-Wcompare-distinct-pointer-types] 133 | tmo_us = min(tmo_us, BNXT_PTP_QTS_MAX_TMO_US); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix it by specifying proper type for BNXT_PTP_QTS_MAX_TMO_US. Fixes: 7de3c2218eed ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()") Signed-off-by: Vadim Fedorenko <[email protected]> Reviewed-by: Michael Chan <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-10Merge tag 'gtp-24-05-07' of ↵David S. Miller4-130/+745
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/gtp Pablo neira Ayuso says: ==================== gtp pull request 24-05-07 This v3 includes: - fix for clang uninitialized variable per Jakub. - address Smatch and Coccinelle reports per Simon - remove inline in new IPv6 support per Simon - fix memleaks in netlink control plane per Simon -o- The following patchset contains IPv6 GTP driver support for net-next, this also includes IPv6 over IPv4 and vice-versa: Patch #1 removes a unnecessary stack variable initialization in the socket routine. Patch #2 deals with GTP extension headers. This variable length extension header to decapsulate packets accordingly. Otherwise, packets are dropped when these extension headers are present which breaks interoperation with other non-Linux based GTP implementations. Patch #3 prepares for IPv6 support by moving IPv4 specific fields in PDP context objects to a union. Patch #4 adds IPv6 support while retaining backward compatibility. Three new attributes allows to declare an IPv6 GTP tunnel GTPA_FAMILY, GTPA_PEER_ADDR6 and GTPA_MS_ADDR6 as well as IFLA_GTP_LOCAL6 to declare the IPv6 GTP UDP socket. Up to this patch, only IPv6 outer in IPv6 inner is supported. Patch #5 uses IPv6 address /64 prefix for UE/MS in the inner headers. Unlike IPv4, which provides a 1:1 mapping between UE/MS, IPv6 tunnel encapsulates traffic for /64 address as specified by 3GPP TS. Patch has been split from Patch #4 to highlight this behaviour. Patch #6 passes up IPv6 link-local traffic, such as IPv6 SLAAC, for handling to userspace so they are handled as control packets. Patch #7 prepares to allow for GTP IPv4 over IPv6 and vice-versa by moving IP specific debugging out of the function to build IPv4 and IPv6 GTP packets. Patch #8 generalizes TOS/DSCP handling following similar approach as in the existing iptunnel infrastructure. Patch #9 adds a helper function to build an IPv4 GTP packet in the outer header. Patch #10 adds a helper function to build an IPv6 GTP packet in the outer header. Patch #11 adds support for GTP IPv4-over-IPv6 and vice-versa. Patch #12 allows to use the same TID/TEID (tunnel identifier) for inner IPv4 and IPv6 packets for better UE/MS dual stack integration. This series integrates with the osmocom.org project CI and TTCN-3 test infrastructure (Oliver Smith) as well as the userspace libgtpnl library. Thanks to Harald Welte, Oliver Smith and Pau Espin for reviewing and providing feedback through the osmocom.org redmine platform to make this happen. ==================== Signed-off-by: David S. Miller <[email protected]>
2024-05-10netfilter: nf_tables: allow clone callbacks to sleepFlorian Westphal8-21/+23
Sven Auhagen reports transaction failures with following error: ./main.nft:13:1-26: Error: Could not process rule: Cannot allocate memory percpu: allocation failed, size=16 align=8 atomic=1, atomic alloc failed, no space left This points to failing pcpu allocation with GFP_ATOMIC flag. However, transactions happen from user context and are allowed to sleep. One case where we can call into percpu allocator with GFP_ATOMIC is nft_counter expression. Normally this happens from control plane, so this could use GFP_KERNEL instead. But one use case, element insertion from packet path, needs to use GFP_ATOMIC allocations (nft_dynset expression). At this time, .clone callbacks always use GFP_ATOMIC for this reason. Add gfp_t argument to the .clone function and pass GFP_KERNEL or GFP_ATOMIC flag depending on context, this allows all clone memory allocations to sleep for the normal (transaction) case. Cc: Sven Auhagen <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-10selftests: netfilter: add packetdrill based conntrack testsFlorian Westphal10-0/+475
Add a new test script that uses packetdrill tool to exercise conntrack state machine. Needs ip/ip6tables and conntrack tool (to check if we have an entry in the expected state). Test cases added here cover following scenarios: 1. already-acked (retransmitted) packets are not tagged as INVALID 2. RST packet coming when conntrack is already closing (FIN/CLOSE_WAIT) transitions conntrack to CLOSE even if the RST is not an exact match 3. RST packets with out-of-window sequence numbers are marked as INVALID 4. SYN+Challenge ACK: check that challenge ack is allowed to pass 5. Old SYN/ACK: check conntrack handles the case where SYN is answered with SYN/ACK for an old, previous connection attempt 6. Check SYN reception while in ESTABLISHED state generates a challenge ack, RST response clears 'outdated' state + next SYN retransmit gets us into 'SYN_RECV' conntrack state. Tests get run twice, once with ipv4 and once with ipv6. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-10netfilter: nft_set_pipapo: remove dirty flagFlorian Westphal2-27/+0
After previous change: ->clone exists: ->dirty is always true ->clone == NULL ->dirty is always false So remove this flag. Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-10netfilter: nft_set_pipapo: move cloning of match info to insert/removal pathFlorian Westphal1-21/+49
This set type keeps two copies of the sets' content, priv->match (live version, used to match from packet path) priv->clone (work-in-progress version of the 'future' priv->match). All additions and removals are done on priv->clone. When transaction completes, priv->clone becomes priv->match and a new clone is allocated for use by next transaction. Problem is that the cloning requires GFP_KERNEL allocations but we cannot fail at either commit or abort time. This patch defers the clone until we get an insertion or removal request. This allows us to handle OOM situations correctly. This also allows to remove ->dirty in a followup change: If ->clone exists, ->dirty is always true If ->clone is NULL, ->dirty is always false, no elements were added or removed (except catchall elements which are external to the specific set backend). Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-10netfilter: nft_set_pipapo: prepare pipapo_get helper for on-demand cloneFlorian Westphal1-7/+10
The helper uses priv->clone unconditionally which will fail once we do the clone conditionally on first insert or removal. 'nft get element' from userspace needs to use priv->match since this runs from rcu read side lock section. Prepare for this by passing the match backend data as argument. Signed-off-by: Florian Westphal <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2024-05-10net: ipv6: fix wrong start position when receive hop-by-hop fragmentgaoxingwang1-1/+1
In IPv6, ipv6_rcv_core will parse the hop-by-hop type extension header and increase skb->transport_header by one extension header length. But if there are more other extension headers like fragment header at this time, the skb->transport_header points to the second extension header, not the transport layer header or the first extension header. This will result in the start and nexthdrp variable not pointing to the same position in ipv6frag_thdr_trunced, and ipv6_skip_exthdr returning incorrect offset and frag_off.Sometimes,the length of the last sharded packet is smaller than the calculated incorrect offset, resulting in packet loss. We can use network header to offset and calculate the correct position to solve this problem. Fixes: 9d9e937b1c8b (ipv6/netfilter: Discard first fragment not including all headers) Signed-off-by: Gao Xingwang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2024-05-09tcp: get rid of twsk_unique()Eric Dumazet5-13/+5
DCCP is going away soon, and had no twsk_unique() method. We can directly call tcp_twsk_unique() for TCP sockets. Signed-off-by: Eric Dumazet <[email protected]> Reviewed-by: Kuniyuki Iwashima <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-09net/sched: adjust device watchdog timer to detect stopped queue at right timePraveen Kumar Kannoju1-4/+7
Applications are sensitive to long network latency, particularly heartbeat monitoring ones. Longer the tx timeout recovery higher the risk with such applications on a production machines. This patch remedies, yet honoring device set tx timeout. Modify watchdog next timeout to be shorter than the device specified. Compute the next timeout be equal to device watchdog timeout less the how long ago queue stop had been done. At next watchdog timeout tx timeout handler is called into if still in stopped state. Either called or not called, restore the watchdog timeout back to device specified. Signed-off-by: Praveen Kumar Kannoju <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-09kbuild,bpf: Switch to using --btf_features for pahole v1.26 and laterAlan Maguire1-2/+13
The btf_features list can be used for pahole v1.26 and later - it is useful because if a feature is not yet implemented it will not exit with a failure message. This will allow us to add feature requests to the pahole options without having to check pahole versions in future; if the version of pahole supports the feature it will be added. Signed-off-by: Alan Maguire <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Tested-by: Eduard Zingerman <[email protected]> Acked-by: Andrii Nakryiko <[email protected]> Link: https://lore.kernel.org/bpf/[email protected]
2024-05-09Merge branch 'use network helpers, part 4'Martin KaFai Lau5-150/+59
Geliang Tang says: ==================== From: Geliang Tang <[email protected]> This patchset adds post_socket_cb pointer into struct network_helper_opts to make start_server_addr() helper more flexible. With these modifications, many duplicate codes can be dropped. Patches 1-3 address Martin's comments in the previous series. ==================== Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Drop get_port in test_tcp_check_syncookieGeliang Tang1-18/+3
The arguments "addr" and "len" of run_test() have dropped. This makes function get_port() useless. Drop it from test_tcp_check_syncookie_user.c. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/a9b5c8064ab4cbf0f68886fe0e4706428b8d0d47.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Use connect_to_fd in test_tcp_check_syncookieGeliang Tang1-33/+5
This patch uses public helper connect_to_fd() exported in network_helpers.h instead of the local defined function connect_to_server() in test_tcp_check_syncookie_user.c. This can avoid duplicate code. Then the arguments "addr" and "len" of run_test() become useless, drop them too. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/e0ae6b790ac0abc7193aadfb2660c8c9eb0fe1f0.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Use connect_to_fd in sockopt_inheritGeliang Tang1-30/+1
This patch uses public helper connect_to_fd() exported in network_helpers.h instead of the local defined function connect_to_server() in prog_tests/sockopt_inherit.c. This can avoid duplicate code. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/71db79127cc160b0643fd9a12c70ae019ae076a1.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Use start_server_addr in test_tcp_check_syncookieGeliang Tang2-44/+25
Include network_helpers.h in test_tcp_check_syncookie_user.c, use public helper start_server_addr() in it instead of the local defined function start_server(). This can avoid duplicate code. Add two helpers v6only_true() and v6only_false() to set IPV6_V6ONLY sockopt to true or false, set them to post_socket_cb pointer of struct network_helper_opts, and pass it to start_server_setsockopt(). In order to use functions defined in network_helpers.c, Makefile needs to be updated too. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/e0c5324f5da84f453f47543536e70f126eaa8678.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Use start_server_addr in sockopt_inheritGeliang Tang1-21/+12
Include network_helpers.h in prog_tests/sockopt_inherit.c, use public helper start_server_addr() instead of the local defined function start_server(). This can avoid duplicate code. Add a helper custom_cb() to set SOL_CUSTOM sockopt looply, set it to post_socket_cb pointer of struct network_helper_opts, and pass it to start_server_addr(). Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/687af66f743a0bf15cdba372c5f71fe64863219e.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftests/bpf: Add post_socket_cb for network_helper_optsGeliang Tang2-9/+18
__start_server() sets SO_REUSPORT through setsockopt() when the parameter 'reuseport' is set. This patch makes it more flexible by adding a function pointer post_socket_cb into struct network_helper_opts. The 'const struct post_socket_opts *cb_opts' args in the post_socket_cb is for the future extension. The 'reuseport' parameter can be dropped. Now the original start_reuseport_server() can be implemented by setting a newly defined reuseport_cb() function pointer to post_socket_cb filed of struct network_helper_opts. Signed-off-by: Geliang Tang <[email protected]> Link: https://lore.kernel.org/r/470cb82f209f055fc7fb39c66c6b090b5b7ed2b2.1714907662.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <[email protected]>
2024-05-09selftest: epoll_busy_poll: epoll busy poll testsJoe Damato3-1/+323
Add a simple test for the epoll busy poll ioctls, using the kernel selftest harness. This test ensures that the ioctls have the expected return codes and that the kernel properly gets and sets epoll busy poll parameters. The test can be expanded in the future to do real busy polling (provided another machine to act as the client is available). Signed-off-by: Joe Damato <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-09Merge branch 'selftests-bpf-retire-bpf_tcp_helpers-h'Alexei Starovoitov20-426/+191
Martin KaFai Lau says: ==================== selftests/bpf: Retire bpf_tcp_helpers.h From: Martin KaFai Lau <[email protected]> The earlier commit 8e6d9ae2e09f ("selftests/bpf: Use bpf_tracing.h instead of bpf_tcp_helpers.h") removed the bpf_tcp_helpers.h usages from the non networking tests. This patch set is a continuation of this effort to retire the bpf_tcp_helpers.h from the networking tests (mostly tcp-cc related). The main usage of the bpf_tcp_helpers.h is the partial kernel socket definitions (e.g. sock, tcp_sock). New fields are kept adding back to those partial socket definitions while everything is available in the vmlinux.h. The recent bpf_cc_cubic.c test tried to extend bpf_tcp_helpers.c but eventually used the vmlinux.h instead. To avoid this unnecessary detour for new tests and have one consistent way of using the kernel sockets, this patch set retires the bpf_tcp_helpers.h usages and consolidates the tests to use vmlinux.h instead. ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Retire bpf_tcp_helpers.hMartin KaFai Lau1-241/+0
The previous patches have consolidated the tests to use bpf_tracing_net.h (i.e. vmlinux.h) instead of bpf_tcp_helpers.h. This patch can finally retire the bpf_tcp_helpers.h from the repository. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Remove the bpf_tcp_helpers.h usages from other non tcp-cc testsMartin KaFai Lau7-38/+23
The patch removes the remaining bpf_tcp_helpers.h usages in the non tcp-cc networking tests. It either replaces it with bpf_tracing_net.h or just removed it because the test is not actually using any kernel sockets. For the later, the missing macro (mainly SOL_TCP) is defined locally. An exception is the test_sock_fields which is testing the "struct bpf_sock" type instead of the kernel sock type. Whenever "vmlinux.h" is used instead, it hits a verifier error on doing arithmetic on the sock_common pointer: ; return !a6[0] && !a6[1] && !a6[2] && a6[3] == bpf_htonl(1); @ test_sock_fields.c:54 21: (61) r2 = *(u32 *)(r1 +28) ; R1_w=sock_common() R2_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 22: (56) if w2 != 0x0 goto pc-6 ; R2_w=0 23: (b7) r3 = 28 ; R3_w=28 24: (bf) r2 = r1 ; R1_w=sock_common() R2_w=sock_common() 25: (0f) r2 += r3 R2 pointer arithmetic on sock_common prohibited Hence, instead of including bpf_tracing_net.h, the test_sock_fields test defines a tcp_sock with one lsndtime field in it. Another highlight is, in sockopt_qos_to_cc.c, the tcp_cc_eq() is replaced by bpf_strncmp(). tcp_cc_eq() was a workaround in bpf_tcp_helpers.h before bpf_strncmp had been added. The SOL_IPV6 addition to bpf_tracing_net.h is needed by the test_tcpbpf_kern test. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Remove bpf_tcp_helpers.h usages from other misc bpf tcp-cc testsMartin KaFai Lau2-10/+2
This patch removed the final few bpf_tcp_helpers.h usages in some misc bpf tcp-cc tests and replace it with bpf_tracing_net.h (i.e. vmlinux.h) Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Use bpf_tracing_net.h in bpf_dctcpMartin KaFai Lau1-7/+15
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_dctcp. This will allow to retire the bpf_tcp_helpers.h and consolidate tcp-cc tests to vmlinux.h. It will have a dup on min/max macros with the bpf_cubic. It could be further refactored in the future. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Use bpf_tracing_net.h in bpf_cubicMartin KaFai Lau1-4/+12
This patch uses bpf_tracing_net.h (i.e. vmlinux.h) in bpf_cubic. This will allow to retire the bpf_tcp_helpers.h and consolidate tcp-cc tests to vmlinux.h. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Rename tcp-cc private struct in bpf_cubic and bpf_dctcpMartin KaFai Lau2-18/+18
The "struct bictcp" and "struct dctcp" are private to the bpf prog and they are stored in the private buffer in inet_csk(sk)->icsk_ca_priv. Hence, there is no bpf CO-RE required. The same struct name exists in the vmlinux.h. To reuse vmlinux.h, they need to be renamed such that the bpf prog logic will be immuned from the kernel tcp-cc changes. This patch adds a "bpf_" prefix to them. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Sanitize the SEC and inline usages in the bpf-tcp-cc testsMartin KaFai Lau10-75/+77
It is needed to remove the BPF_STRUCT_OPS usages from the tcp-cc tests because it is defined in bpf_tcp_helpers.h which is going to be retired. While at it, this patch consolidates all tcp-cc struct_ops programs to use the SEC("struct_ops") + BPF_PROG(). It also removes the unnecessary __always_inline usages from the tcp-cc tests. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Reuse the tcp_sk() from the bpf_tracing_net.hMartin KaFai Lau3-21/+3
This patch removes the individual tcp_sk implementations from the tcp-cc tests. The tcp_sk() implementation from the bpf_tracing_net.h is reused instead. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Add a few tcp helper functions and macros to bpf_tracing_net.hMartin KaFai Lau2-13/+42
This patch adds a few tcp related helper functions to bpf_tracing_net.h. They will be useful for both tcp-cc and network tracing related bpf progs. They have already been in the bpf_tcp_helpers.h. This change is needed to retire the bpf_tcp_helpers.h and consolidate all tests to vmlinux.h (i.e. bpf_tracing_net.h). Some of the helpers (tcp_sk and inet_csk) are also defined in bpf_cc_cubic.c and they are removed. While at it, remove the vmlinux.h from bpf_cc_cubic.c. bpf_tracing_net.h (which has vmlinux.h after this patch) is enough and will be consistent with the other tcp-cc tests in the later patches. The other TCP_* macro additions will be needed for the bpf_dctcp changes in the later patch. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09selftests/bpf: Remove bpf_tracing_net.h usages from two networking testsMartin KaFai Lau2-2/+2
This patch removes the bpf_tracing_net.h usage from the networking tests, fib_lookup and test_lwt_redirect. Instead of using the (copied) macro TC_ACT_SHOT and ETH_HLEN from bpf_tracing_net.h, they can directly use the ones defined in the network header files under linux/. Signed-off-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Alexei Starovoitov <[email protected]>
2024-05-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski265-1347/+2906
Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c 35d92abfbad8 ("net: hns3: fix kernel crash when devlink reload during initialization") 2a1a1a7b5fd7 ("net: hns3: add command queue trace for hns3") Signed-off-by: Jakub Kicinski <[email protected]>
2024-05-09Merge tag 'net-6.9-rc8' of ↵Linus Torvalds47-205/+519
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and IPsec. The bridge patch is actually a follow-up to a recent fix in the same area. We have a pending v6.8 AF_UNIX regression; it should be solved soon, but not in time for this PR. Current release - regressions: - eth: ks8851: Queue RX packets in IRQ handler instead of disabling BHs - net: bridge: fix corrupted ethernet header on multicast-to-unicast Current release - new code bugs: - xfrm: fix possible bad pointer derferencing in error path Previous releases - regressionis: - core: fix out-of-bounds access in ops_init - ipv6: - fix potential uninit-value access in __ip6_make_skb() - fib6_rules: avoid possible NULL dereference in fib6_rule_action() - tcp: use refcount_inc_not_zero() in tcp_twsk_unique(). - rtnetlink: correct nested IFLA_VF_VLAN_LIST attribute validation - rxrpc: fix congestion control algorithm - bluetooth: - l2cap: fix slab-use-after-free in l2cap_connect() - msft: fix slab-use-after-free in msft_do_close() - eth: hns3: fix kernel crash when devlink reload during initialization - eth: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family Previous releases - always broken: - xfrm: preserve vlan tags for transport mode software GRO - tcp: defer shutdown(SEND_SHUTDOWN) for TCP_SYN_RECV sockets - eth: hns3: keep using user config after hardware reset" * tag 'net-6.9-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only ports net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 family net: hns3: fix kernel crash when devlink reload during initialization net: hns3: fix port vlan filter not disabled issue net: hns3: use appropriate barrier function after setting a bit value net: hns3: release PTP resources if pf initialization failed net: hns3: change type of numa_node_mask as nodemask_t net: hns3: direct return when receive a unknown mailbox message net: hns3: using user configure after hardware reset net/smc: fix neighbour and rtable leak in smc_ib_find_route() ipv6: prevent NULL dereference in ip6_output() hsr: Simplify code for announcing HSR nodes timer setup ipv6: fib6_rules: avoid possible NULL dereference in fib6_rule_action() dt-bindings: net: mediatek: remove wrongly added clocks and SerDes rxrpc: Only transmit one ACK per jumbo packet received rxrpc: Fix congestion control algorithm selftests: test_bridge_neigh_suppress.sh: Fix failures due to duplicate MAC ipv6: Fix potential uninit-value access in __ip6_make_skb() net: phy: marvell-88q2xxx: add support for Rev B1 and B2 appletalk: Improve handling of broadcast packets ...
2024-05-09Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linuxLinus Torvalds1-0/+4
Pull ARM fix from Russell King: - clear stale KASan stack poison when a CPU resumes * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rmk/linux: ARM: 9381/1: kasan: clear stale stack poison
2024-05-09Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds1-0/+1
Pull dentry leak fix from Al Viro: "Dentry leak fix in the qibfs driver that I forgot to send a pull request for ;-/ My apologies - it actually sat in vfs.git#fixes for more than two months..." * tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: qibfs: fix dentry leak
2024-05-09l2tp: Support several sockets with same IP/port quadrupleSamuel Thibault1-0/+21
Some l2tp providers will use 1701 as origin port and open several tunnels for the same origin and target. On the Linux side, this may mean opening several sockets, but then trafic will go to only one of them, losing the trafic for the tunnel of the other socket (or leaving it up to userland, consuming a lot of cpu%). This can also happen when the l2tp provider uses a cluster, and load-balancing happens to migrate from one origin IP to another one, for which a socket was already established. Managing reassigning tunnels from one socket to another would be very hairy for userland. Lastly, as documented in l2tpconfig(1), as client it may be necessary to use 1701 as origin port for odd firewalls reasons, which could prevent from establishing several tunnels to a l2tp server, for the same reason: trafic would get only on one of the two sockets. With the V2 protocol it is however easy to route trafic to the proper tunnel, by looking up the tunnel number in the network namespace. This fixes the three cases altogether. Signed-off-by: Samuel Thibault <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-09net: dsa: mv88e6xxx: read cmode on mv88e6320/21 serdes only portsSteffen Bätz1-6/+17
On the mv88e6320 and 6321 switch family, port 0/1 are serdes only ports. Modified the mv88e6352_get_port4_serdes_cmode function to pass a port number since the register set of the 6352 is equal on the 6320/21. Signed-off-by: Steffen Bätz <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Fabio Estevam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-09net: dsa: mv88e6xxx: add phylink_get_caps for the mv88e6320/21 familySteffen Bätz1-2/+14
As of commit de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled") Marvell 88e6320/21 switches fail to be probed: ... mv88e6085 30be0000.ethernet-1:00: phylink: error: empty supported_interfaces error creating PHYLINK: -22 ... The problem stems from the use of mv88e6185_phylink_get_caps() to get the device capabilities. Since there are serdes only ports 0/1 included, create a new dedicated phylink_get_caps for the 6320 and 6321 to properly support their set of capabilities. Fixes: de5c9bf40c45 ("net: phylink: require supported_interfaces to be filled") Signed-off-by: Steffen Bätz <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Fabio Estevam <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-09Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'Paolo Abeni6-41/+47
Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Paolo Abeni <[email protected]>
2024-05-09net: hns3: fix kernel crash when devlink reload during initializationYonglong Liu2-18/+9
The devlink reload process will access the hardware resources, but the register operation is done before the hardware is initialized. So, processing the devlink reload during initialization may lead to kernel crash. This patch fixes this by registering the devlink after hardware initialization. Fixes: cd6242991d2e ("net: hns3: add support for registering devlink for VF") Fixes: 93305b77ffcb ("net: hns3: fix kernel crash when devlink reload during pf initialization") Signed-off-by: Yonglong Liu <[email protected]> Signed-off-by: Jijie Shao <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>
2024-05-09net: hns3: fix port vlan filter not disabled issueYonglong Liu1-1/+6
According to hardware limitation, for device support modify VLAN filter state but not support bypass port VLAN filter, it should always disable the port VLAN filter. but the driver enables port VLAN filter when initializing, if there is no VLAN(except VLAN 0) id added, the driver will disable it in service task. In most time, it works fine. But there is a time window before the service task shceduled and net device being registered. So if user adds VLAN at this time, the driver will not update the VLAN filter state, and the port VLAN filter remains enabled. To fix the problem, if support modify VLAN filter state but not support bypass port VLAN filter, set the port vlan filter to "off". Fixes: 184cd221a863 ("net: hns3: disable port VLAN filter when support function level VLAN filter control") Fixes: 2ba306627f59 ("net: hns3: add support for modify VLAN filter state") Signed-off-by: Yonglong Liu <[email protected]> Signed-off-by: Jijie Shao <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Paolo Abeni <[email protected]>