aboutsummaryrefslogtreecommitdiff
path: root/net/ipv4
AgeCommit message (Collapse)AuthorFilesLines
2022-09-02bpf: Change bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt()Martin KaFai Lau1-2/+2
This patch changes bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). It removes the duplicated code from bpf_getsockopt(SOL_TCP). Before this patch, there were some optnames available to bpf_setsockopt(SOL_TCP) but missing in bpf_getsockopt(SOL_TCP). For example, TCP_NODELAY, TCP_MAXSEG, TCP_KEEPIDLE, TCP_KEEPINTVL, and a few more. It surprises users from time to time. This patch automatically closes this gap without duplicating more code. bpf_getsockopt(TCP_SAVED_SYN) does not free the saved_syn, so it stays in sol_tcp_sockopt(). For string name value like TCP_CONGESTION, bpf expects it is always null terminated, so sol_tcp_sockopt() decrements optlen by one before calling do_tcp_getsockopt() and the 'if (optlen < saved_optlen) memset(..,0,..);' in __bpf_getsockopt() will always do a null termination. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002918.2894511-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02bpf: net: Avoid do_ip_getsockopt() taking sk lock when called from bpfMartin KaFai Lau1-8/+8
Similar to the earlier commit that changed sk_setsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking lock when called from bpf. This patch also changes do_ip_getsockopt() to use sockopt_{lock,release}_sock() such that a latter patch can make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002834.2891514-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02bpf: net: Change do_ip_getsockopt() to take the sockptr_t argumentMartin KaFai Lau3-48/+63
Similar to the earlier patch that changes sk_getsockopt() to take the sockptr_t argument. This patch also changes do_ip_getsockopt() to take the sockptr_t argument such that a latter patch can make bpf_getsockopt(SOL_IP) to reuse do_ip_getsockopt(). Note on the change in ip_mc_gsfget(). This function is to return an array of sockaddr_storage in optval. This function is shared between ip_get_mcast_msfilter() and compat_ip_get_mcast_msfilter(). However, the sockaddr_storage is stored at different offset of the optval because of the difference between group_filter and compat_group_filter. Thus, a new 'ss_offset' argument is added to ip_mc_gsfget(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002828.2890585-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02bpf: net: Avoid do_tcp_getsockopt() taking sk lock when called from bpfMartin KaFai Lau1-9/+9
Similar to the earlier commit that changed sk_setsockopt() to use sockopt_{lock,release}_sock() such that it can avoid taking lock when called from bpf. This patch also changes do_tcp_getsockopt() to use sockopt_{lock,release}_sock() such that a latter patch can make bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002821.2889765-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-09-02bpf: net: Change do_tcp_getsockopt() to take the sockptr_t argumentMartin KaFai Lau1-35/+37
Similar to the earlier patch that changes sk_getsockopt() to take the sockptr_t argument . This patch also changes do_tcp_getsockopt() to take the sockptr_t argument such that a latter patch can make bpf_getsockopt(SOL_TCP) to reuse do_tcp_getsockopt(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20220902002815.2889332-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-31bpf, net: Avoid loading module when calling bpf_setsockopt(TCP_CONGESTION)Martin KaFai Lau1-1/+1
When bpf prog changes tcp-cc by calling bpf_setsockopt(TCP_CONGESTION), it should not try to load module which may be a blocking operation. This details was correct in the v1 [0] but missed by mistake in the later revision in commit cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()"). This patch fixes it by checking the has_current_bpf_ctx(). [0] https://lore.kernel.org/bpf/20220727060921.2373314-1-kafai@fb.com/ Fixes: cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()") Signed-off-by: Martin KaFai Lau <martin.lau@linux.dev> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220830231946.791504-1-martin.lau@linux.dev
2022-08-18bpf: Change bpf_setsockopt(SOL_IP) to reuse do_ip_setsockopt()Martin KaFai Lau1-2/+2
After the prep work in the previous patches, this patch removes the dup code from bpf_setsockopt(SOL_IP) and reuses the implementation in do_ip_setsockopt(). The existing optname white-list is refactored into a new function sol_ip_setsockopt(). NOTE, the current bpf_setsockopt(IP_TOS) is quite different from the the do_ip_setsockopt(IP_TOS). For example, it does not take the INET_ECN_MASK into the account for tcp and also does not adjust sk->sk_priority. It looks like the current bpf_setsockopt(IP_TOS) was referencing the IPV6_TCLASS implementation instead of IP_TOS. This patch tries to rectify that by using the do_ip_setsockopt(IP_TOS). While this is a behavior change, the do_ip_setsockopt(IP_TOS) behavior is arguably what the user is expecting. At least, the INET_ECN_MASK bits should be masked out for tcp. Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061826.4180990-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-18bpf: Change bpf_setsockopt(SOL_TCP) to reuse do_tcp_setsockopt()Martin KaFai Lau1-2/+2
After the prep work in the previous patches, this patch removes all the dup code from bpf_setsockopt(SOL_TCP) and reuses the do_tcp_setsockopt(). The existing optname white-list is refactored into a new function sol_tcp_setsockopt(). The sol_tcp_setsockopt() also calls the bpf_sol_tcp_setsockopt() to handle the TCP_BPF_XXX specific optnames. bpf_setsockopt(TCP_SAVE_SYN) now also allows a value 2 to save the eth header also and it comes for free from do_tcp_setsockopt(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061819.4180146-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-18bpf: net: Change do_ip_setsockopt() to use the sockopt's lock_sock() and ↵Martin KaFai Lau1-6/+6
capable() Similar to the earlier patch that avoids sk_setsockopt() from taking sk lock and doing capable test when called by bpf. This patch changes do_ip_setsockopt() to use the sockopt_{lock,release}_sock() and sockopt_[ns_]capable(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061737.4176402-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-18bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and ↵Martin KaFai Lau1-9/+9
capable() Similar to the earlier patch that avoids sk_setsockopt() from taking sk lock and doing capable test when called by bpf. This patch changes do_tcp_setsockopt() to use the sockopt_{lock,release}_sock() and sockopt_[ns_]capable(). Reviewed-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Link: https://lore.kernel.org/r/20220817061730.4176021-1-kafai@fb.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-08-17tcp: Make SYN ACK RTO tunable by BPF programs with TFOJie Meng2-2/+3
Instead of the hardcoded TCP_TIMEOUT_INIT, this diff calls tcp_timeout_init to initiate req->timeout like the non TFO SYN ACK case. Tested using the following packetdrill script, on a host with a BPF program that sets the initial connect timeout to 10ms. `../../common/defaults.sh` // Initialize connection 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_TCP, TCP_FASTOPEN, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,FO TFO_COOKIE> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.01 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.02 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.04 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK> +.01 < . 1:1(0) ack 1 win 32792 +0 accept(3, ..., ...) = 4 Signed-off-by: Jie Meng <jmeng@fb.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-04Merge tag 'driver-core-6.0-rc1' of ↵Linus Torvalds1-0/+6
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / kernfs updates from Greg KH: "Here is the set of driver core and kernfs changes for 6.0-rc1. The "biggest" thing in here is some scalability improvements for kernfs for large systems. Other than that, included in here are: - arch topology and cache info changes that have been reviewed and discussed a lot. - potential error path cleanup fixes - deferred driver probe cleanups - firmware loader cleanups and tweaks - documentation updates - other small things All of these have been in the linux-next tree for a while with no reported problems" * tag 'driver-core-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (63 commits) docs: embargoed-hardware-issues: fix invalid AMD contact email firmware_loader: Replace kmap() with kmap_local_page() sysfs docs: ABI: Fix typo in comment kobject: fix Kconfig.debug "its" grammar kernfs: Fix typo 'the the' in comment docs: driver-api: firmware: add driver firmware guidelines. (v3) arch_topology: Fix cache attributes detection in the CPU hotplug path ACPI: PPTT: Leave the table mapped for the runtime usage cacheinfo: Use atomic allocation for percpu cache attributes drivers/base: fix userspace break from using bin_attributes for cpumap and cpulist MAINTAINERS: Change mentions of mpm to olivia docs: ABI: sysfs-devices-soc: Update Lee Jones' email address docs: ABI: sysfs-class-pwm: Update Lee Jones' email address Documentation/process: Add embargoed HW contact for LLVM Revert "kernfs: Change kernfs_notify_list to llist." ACPI: Remove the unused find_acpi_cpu_cache_topology() arch_topology: Warn that topology for nested clusters is not supported arch_topology: Add support for parsing sockets in /cpu-map arch_topology: Set cluster identifier in each core/thread from /cpu-map arch_topology: Limit span of cpu_clustergroup_mask() ...
2022-08-01udp: Remove redundant __udp_sysctl_init() call from udp_init().Kuniyuki Iwashima1-7/+1
__udp_sysctl_init() is called for init_net via udp_sysctl_ops. While at it, we can rename __udp_sysctl_init() to udp_sysctl_init(). Fixes: 1e8029515816 ("udp: Move the udp sysctl to namespace.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski6-50/+62
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27tcp: md5: fix IPv4-mapped supportEric Dumazet1-3/+12
After the blamed commit, IPv4 SYN packets handled by a dual stack IPv6 socket are dropped, even if perfectly valid. $ nstat | grep MD5 TcpExtTCPMD5Failure 5 0.0 For a dual stack listener, an incoming IPv4 SYN packet would call tcp_inbound_md5_hash() with @family == AF_INET, while tp->af_specific is pointing to tcp_sock_ipv6_specific. Only later when an IPv4-mapped child is created, tp->af_specific is changed to tcp_sock_ipv6_mapped_specific. Fixes: 7bbb765b7349 ("net/tcp: Merge TCP-MD5 inbound callbacks") Reported-by: Brian Vazquez <brianvv@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Dmitry Safonov <dima@arista.com> Tested-by: Leonard Crestez <cdleonard@gmail.com> Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-26tcp: allow tls to decrypt directly from the tcp rcv queueJakub Kicinski1-1/+41
Expose TCP rx queue accessor and cleanup, so that TLS can decrypt directly from the TCP queue. The expectation is that the caller can access the skb returned from tcp_recv_skb() and up to inq bytes worth of data (some of which may be in ->next skbs) and then call tcp_read_done() when data has been consumed. The socket lock must be held continuously across those two operations. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25Merge branch 'master' of ↵David S. Miller1-16/+8
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2022-07-20 1) Don't set DST_NOPOLICY in IPv4, a recent patch made this superfluous. From Eyal Birger. 2) Convert alg_key to flexible array member to avoid an iproute2 compile warning when built with gcc-12. From Stephen Hemminger. 3) xfrm_register_km and xfrm_unregister_km do always return 0 so change the type to void. From Zhengchao Shao. 4) Fix spelling mistake in esp6.c From Zhang Jiaming. 5) Improve the wording of comment above XFRM_OFFLOAD flags. From Petr Vaněk. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25ipv4: Fix data-races around sysctl_fib_notify_on_flag_change.Kuniyuki Iwashima1-2/+5
While reading sysctl_fib_notify_on_flag_change, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 680aea08e78c ("net: ipv4: Emit notification when fib hardware flags are changed") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25tcp: Fix data-races around sysctl_tcp_reflect_tos.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_reflect_tos, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Wei Wang <weiwan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25tcp: Fix a data-race around sysctl_tcp_comp_sack_nr.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_comp_sack_nr, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 9c21d2fc41c0 ("tcp: add tcp_comp_sack_nr sysctl") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25tcp: Fix a data-race around sysctl_tcp_comp_sack_slack_ns.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_comp_sack_slack_ns, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: a70437cc09a1 ("tcp: add hrtimer slack to sack compression") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25tcp: Fix a data-race around sysctl_tcp_comp_sack_delay_ns.Kuniyuki Iwashima1-1/+2
While reading sysctl_tcp_comp_sack_delay_ns, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 6d82aa242092 ("tcp: add tcp_comp_sack_delay_ns sysctl") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25net: Fix data-races around sysctl_[rw]mem(_offset)?.Kuniyuki Iwashima3-10/+11
While reading these sysctl variables, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. - .sysctl_rmem - .sysctl_rwmem - .sysctl_rmem_offset - .sysctl_wmem_offset - sysctl_tcp_rmem[1, 2] - sysctl_tcp_wmem[1, 2] - sysctl_decnet_rmem[1] - sysctl_decnet_wmem[1] - sysctl_tipc_rmem[1] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-25tcp: Fix data-races around sk_pacing_rate.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_pacing_(ss|ca)_ratio, they can be changed concurrently. Thus, we need to add READ_ONCE() to their readers. Fixes: 43e122b014c9 ("tcp: refine pacing rate determination") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski4-41/+41
Daniel Borkmann says: ==================== bpf-next 2022-07-22 We've added 73 non-merge commits during the last 12 day(s) which contain a total of 88 files changed, 3458 insertions(+), 860 deletions(-). The main changes are: 1) Implement BPF trampoline for arm64 JIT, from Xu Kuohai. 2) Add ksyscall/kretsyscall section support to libbpf to simplify tracing kernel syscalls through kprobe mechanism, from Andrii Nakryiko. 3) Allow for livepatch (KLP) and BPF trampolines to attach to the same kernel function, from Song Liu & Jiri Olsa. 4) Add new kfunc infrastructure for netfilter's CT e.g. to insert and change entries, from Kumar Kartikeya Dwivedi & Lorenzo Bianconi. 5) Add a ksym BPF iterator to allow for more flexible and efficient interactions with kernel symbols, from Alan Maguire. 6) Bug fixes in libbpf e.g. for uprobe binary path resolution, from Dan Carpenter. 7) Fix BPF subprog function names in stack traces, from Alexei Starovoitov. 8) libbpf support for writing custom perf event readers, from Jon Doron. 9) Switch to use SPDX tag for BPF helper man page, from Alejandro Colomar. 10) Fix xsk send-only sockets when in busy poll mode, from Maciej Fijalkowski. 11) Reparent BPF maps and their charging on memcg offlining, from Roman Gushchin. 12) Multiple follow-up fixes around BPF lsm cgroup infra, from Stanislav Fomichev. 13) Use bootstrap version of bpftool where possible to speed up builds, from Pu Lehui. 14) Cleanup BPF verifier's check_func_arg() handling, from Joanne Koong. 15) Make non-prealloced BPF map allocations low priority to play better with memcg limits, from Yafang Shao. 16) Fix BPF test runner to reject zero-length data for skbs, from Zhengchao Shao. 17) Various smaller cleanups and improvements all over the place. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (73 commits) bpf: Simplify bpf_prog_pack_[size|mask] bpf: Support bpf_trampoline on functions with IPMODIFY (e.g. livepatch) bpf, x64: Allow to use caller address from stack ftrace: Allow IPMODIFY and DIRECT ops on the same function ftrace: Add modify_ftrace_direct_multi_nolock bpf/selftests: Fix couldn't retrieve pinned program in xdp veth test bpf: Fix build error in case of !CONFIG_DEBUG_INFO_BTF selftests/bpf: Fix test_verifier failed test in unprivileged mode selftests/bpf: Add negative tests for new nf_conntrack kfuncs selftests/bpf: Add tests for new nf_conntrack kfuncs selftests/bpf: Add verifier tests for trusted kfunc args net: netfilter: Add kfuncs to set and change CT status net: netfilter: Add kfuncs to set and change CT timeout net: netfilter: Add kfuncs to allocate and insert CT net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup bpf: Add documentation for kfuncs bpf: Add support for forcing kfunc args to be trusted bpf: Switch to new kfunc flags infrastructure tools/resolve_btfids: Add support for 8-byte BTF sets bpf: Introduce 8-byte BTF set ... ==================== Link: https://lore.kernel.org/r/20220722221218.29943-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22Revert "tcp: change pingpong threshold to 3"Wei Wang1-9/+6
This reverts commit 4a41f453bedfd5e9cd040bad509d9da49feb3e2c. This to-be-reverted commit was meant to apply a stricter rule for the stack to enter pingpong mode. However, the condition used to check for interactive session "before(tp->lsndtime, icsk->icsk_ack.lrcvtime)" is jiffy based and might be too coarse, which delays the stack entering pingpong mode. We revert this patch so that we no longer use the above condition to determine interactive session, and also reduce pingpong threshold to 1. Fixes: 4a41f453bedf ("tcp: change pingpong threshold to 3") Reported-by: LemmyHuang <hlm3280@163.com> Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20220721204404.388396-1-weiwan@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-22tcp: Fix a data-race around sysctl_tcp_invalid_ratelimit.Kuniyuki Iwashima1-1/+2
While reading sysctl_tcp_invalid_ratelimit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 032ee4236954 ("tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacks") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_autocorking.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_autocorking, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: f54b311142a9 ("tcp: auto corking") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_min_rtt_wlen.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_min_rtt_wlen, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: f672258391b4 ("tcp: track min RTT using windowed min-filter") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_tso_rtt_log.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_tso_rtt_log, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 65466904b015 ("tcp: adjust TSO packet sizes based on min_rtt") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_min_tso_segs.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_min_tso_segs, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_challenge_ack_limit.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_challenge_ack_limit, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 282f23c6ee34 ("tcp: implement RFC 5961 3.2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_limit_output_bytes.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_limit_output_bytes, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 46d3ceabd8d9 ("tcp: TCP Small Queues") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix data-races around sysctl_tcp_workaround_signed_windows.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_workaround_signed_windows, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 15d99e02baba ("[TCP]: sysctl to allow TCP window > 32767 sans wscale") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix data-races around sysctl_tcp_moderate_rcvbuf.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_moderate_rcvbuf, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix data-races around sysctl_tcp_no_ssthresh_metrics_save.Kuniyuki Iwashima1-4/+4
While reading sysctl_tcp_no_ssthresh_metrics_save, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 65e6d90168f3 ("net-tcp: Disable TCP ssthresh metrics cache by default") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_nometrics_save.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_nometrics_save, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_frto.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_frto, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix a data-race around sysctl_tcp_app_win.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_app_win, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22tcp: Fix data-races around sysctl_tcp_dsack.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_dsack, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-21bpf: Switch to new kfunc flags infrastructureKumar Kartikeya Dwivedi4-41/+41
Instead of populating multiple sets to indicate some attribute and then researching the same BTF ID in them, prepare a single unified BTF set which indicates whether a kfunc is allowed to be called, and also its attributes if any at the same time. Now, only one call is needed to perform the lookup for both kfunc availability and its attributes. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20220721134245.2450-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski24-195/+169
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-nextJakub Kicinski1-28/+14
Pablo Neira Ayuso says: ==================== Netfilter/IPVS updates for net-next The following patchset contains Netfilter/IPVS updates for net-next: 1) Simplify nf_ct_get_tuple(), from Jackie Liu. 2) Add format to request_module() call, from Bill Wendling. 3) Add /proc/net/stats/nf_flowtable to monitor in-flight pending hardware offload objects to be processed, from Vlad Buslov. 4) Missing rcu annotation and accessors in the netfilter tree, from Florian Westphal. 5) Merge h323 conntrack helper nat hooks into single object, also from Florian. 6) A batch of update to fix sparse warnings treewide, from Florian Westphal. 7) Move nft_cmp_fast_mask() where it used, from Florian. 8) Missing const in nf_nat_initialized(), from James Yonan. 9) Use bitmap API for Maglev IPVS scheduler, from Christophe Jaillet. 10) Use refcount_inc instead of _inc_not_zero in flowtable, from Florian Westphal. 11) Remove pr_debug in xt_TPROXY, from Nathan Cancellor. * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: xt_TPROXY: remove pr_debug invocations netfilter: flowtable: prefer refcount_inc netfilter: ipvs: Use the bitmap API to allocate bitmaps netfilter: nf_nat: in nf_nat_initialized(), use const struct nf_conn * netfilter: nf_tables: move nft_cmp_fast_mask to where its used netfilter: nf_tables: use correct integer types netfilter: nf_tables: add and use BE register load-store helpers netfilter: nf_tables: use the correct get/put helpers netfilter: x_tables: use correct integer types netfilter: nfnetlink: add missing __be16 cast netfilter: nft_set_bitmap: Fix spelling mistake netfilter: h323: merge nat hook pointers into one netfilter: nf_conntrack: use rcu accessors where needed netfilter: nf_conntrack: add missing __rcu annotations netfilter: nf_flow_table: count pending offload workqueue tasks net/sched: act_ct: set 'net' pointer when creating new nf_flow_table netfilter: conntrack: use correct format characters netfilter: conntrack: use fallthrough to cleanup ==================== Link: https://lore.kernel.org/r/20220720230754.209053-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-20tcp: Fix data-races around sysctl_tcp_max_reordering.Kuniyuki Iwashima1-2/+2
While reading sysctl_tcp_max_reordering, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: dca145ffaa8d ("tcp: allow for bigger reordering level") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_abort_on_overflow.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_abort_on_overflow, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_rfc1337.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_rfc1337, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_stdurg.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_stdurg, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_retrans_collapse.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_retrans_collapse, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix data-races around sysctl_tcp_slow_start_after_idle.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_slow_start_after_idle, it can be changed concurrently. Thus, we need to add READ_ONCE() to its readers. Fixes: 35089bb203f4 ("[TCP]: Add tcp_slow_start_after_idle sysctl.") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-20tcp: Fix a data-race around sysctl_tcp_thin_linear_timeouts.Kuniyuki Iwashima1-1/+1
While reading sysctl_tcp_thin_linear_timeouts, it can be changed concurrently. Thus, we need to add READ_ONCE() to its reader. Fixes: 36e31b0af587 ("net: TCP thin linear timeouts") Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>