aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2022-04-11netfilter: nft_socket: make cgroup match work in input tooFlorian Westphal1-4/+3
cgroupv2 helper function ignores the already-looked up sk and uses skb->sk instead. Just pass sk from the calling function instead; this will make cgroup matching work for udp and tcp in input even when edemux did not set skb->sk already. Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2") Signed-off-by: Florian Westphal <[email protected]> Tested-by: Topi Miettinen <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-11netfilter: nft_fib: reverse path filter for policy-based routing on iifPablo Neira Ayuso3-0/+12
If policy-based routing using the iif selector is used, then the fib expression fails to look up for the reverse path from the prerouting hook because the input interface cannot be inferred. In order to support this scenario, extend the fib expression to allow to use after the route lookup, from the forward hook. This patch also adds support for the input hook for usability reasons. Since the prerouting hook cannot be used for the scenario described above, users need two rules: one for the forward chain and another rule for the input chain to check for the reverse path check for locally targeted traffic. Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-11mac80211: fix ht_capa printout in debugfsBen Greear1-1/+1
Don't use sizeof(pointer) when calculating scnprintf offset. Fixes: 01f84f0ed3b4 ("mac80211: reduce stack usage in debugfs") Signed-off-by: Ben Greear <[email protected]> Link: https://lore.kernel.org/r/[email protected] [correct the Fixes tag] Signed-off-by: Johannes Berg <[email protected]>
2022-04-11cfg80211: hold bss_lock while updating nontrans_listRameshkumar Sundaram1-0/+2
Synchronize additions to nontrans_list of transmitting BSS with bss_lock to avoid races. Also when cfg80211_add_nontrans_list() fails __cfg80211_unlink_bss() needs bss_lock to be held (has lockdep assert on bss_lock). So protect the whole block with bss_lock to avoid races and warnings. Found during code review. Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Rameshkumar Sundaram <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Johannes Berg <[email protected]>
2022-04-11nl80211: correctly check NL80211_ATTR_REG_ALPHA2 sizeJohannes Berg1-1/+2
We need this to be at least two bytes, so we can access alpha2[0] and alpha2[1]. It may be three in case some userspace used NUL-termination since it was NLA_STRING (and we also push it out with NUL-termination). Cc: [email protected] Reported-by: Lee Jones <[email protected]> Link: https://lore.kernel.org/r/20220411114201.fd4a31f06541.Ie7ff4be2cf348d8cc28ed0d626fc54becf7ea799@changeid Signed-off-by: Johannes Berg <[email protected]>
2022-04-11net/sched: taprio: Check if socket flags are validBenedikt Spranger1-1/+2
A user may set the SO_TXTIME socket option to ensure a packet is send at a given time. The taprio scheduler has to confirm, that it is allowed to send a packet at that given time, by a check against the packet time schedule. The scheduler drop the packet, if the gates are closed at the given send time. The check, if SO_TXTIME is set, may fail since sk_flags are part of an union and the union is used otherwise. This happen, if a socket is not a full socket, like a request socket for example. Add a check to verify, if the union is used for sk_flags. Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Benedikt Spranger <[email protected]> Reviewed-by: Kurt Kanzenbach <[email protected]> Acked-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11net: icmp: add skb drop reasons to icmp protocolMenglong Dong3-46/+67
Replace kfree_skb() used in icmp_rcv() and icmpv6_rcv() with kfree_skb_reason(). In order to get the reasons of the skb drops after icmp message handle, we change the return type of 'handler()' in 'struct icmp_control' from 'bool' to 'enum skb_drop_reason'. This may change its original intention, as 'false' means failure, but 'SKB_NOT_DROPPED_YET' means success now. Therefore, all 'handler' and the call of them need to be handled. Following 'handler' functions are involved: icmp_unreach() icmp_redirect() icmp_echo() icmp_timestamp() icmp_discard() And following new drop reasons are added: SKB_DROP_REASON_ICMP_CSUM SKB_DROP_REASON_INVALID_PROTO The reason 'INVALID_PROTO' is introduced for the case that the packet doesn't follow rfc 1122 and is dropped. This is not a common case, and I believe we can locate the problem from the data in the packet. For now, this 'INVALID_PROTO' is used for the icmp broadcasts with wrong types. Maybe there should be a document file for these reasons. For example, list all the case that causes the 'UNHANDLED_PROTO' and 'INVALID_PROTO' drop reason. Therefore, users can locate their problems according to the document. Reviewed-by: Hao Peng <[email protected]> Reviewed-by: Jiang Biao <[email protected]> Signed-off-by: Menglong Dong <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11net: icmp: introduce __ping_queue_rcv_skb() to report drop reasonsMenglong Dong1-5/+13
In order to avoid to change the return value of ping_queue_rcv_skb(), introduce the function __ping_queue_rcv_skb(), which is able to report the reasons of skb drop as its return value, as Paolo suggested. Meanwhile, make ping_queue_rcv_skb() a simple call to __ping_queue_rcv_skb(). The kfree_skb() and sock_queue_rcv_skb() used in ping_queue_rcv_skb() are replaced with kfree_skb_reason() and sock_queue_rcv_skb_reason() now. Reviewed-by: Hao Peng <[email protected]> Reviewed-by: Jiang Biao <[email protected]> Signed-off-by: Menglong Dong <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11net: skb: rename SKB_DROP_REASON_PTYPE_ABSENTMenglong Dong1-5/+3
As David Ahern suggested, the reasons for skb drops should be more general and not be code based. Therefore, rename SKB_DROP_REASON_PTYPE_ABSENT to SKB_DROP_REASON_UNHANDLED_PROTO, which is used for the cases of no L3 protocol handler, no L4 protocol handler, version extensions, etc. From previous discussion, now we have the aim to make these reasons more abstract and users based, avoiding code based. Signed-off-by: Menglong Dong <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-11net: sock: introduce sock_queue_rcv_skb_reason()Menglong Dong1-6/+24
In order to report the reasons of skb drops in 'sock_queue_rcv_skb()', introduce the function 'sock_queue_rcv_skb_reason()'. As the return value of 'sock_queue_rcv_skb()' is used as the error code, we can't make it as drop reason and have to pass extra output argument. 'sock_queue_rcv_skb()' is used in many places, so we can't change it directly. Introduce the new function 'sock_queue_rcv_skb_reason()' and make 'sock_queue_rcv_skb()' an inline call to it. Reviewed-by: Hao Peng <[email protected]> Reviewed-by: Jiang Biao <[email protected]> Signed-off-by: Menglong Dong <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: jump out for cases which need to leave skb on listJakub Kicinski1-21/+22
The current invese logic is harder to follow (and adds extra tests to the fast path). We have to enumerate all cases which need to keep the skb before consuming it. It's simpler to jump out of the full record flow as we detect those cases. This makes it clear that partial consumption and peek can only reach end of the function thru the !zc case so move the code up there. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: clear ctx->recv_pkt earlierJakub Kicinski1-9/+7
Whatever we do in the loop the skb should not remain on as ctx->recv_pkt afterwards. We can clear that pointer and restart strparser earlier. This adds overhead of extra linking and unlinking to rx_list but that's not large (upcoming change will switch to unlocked skb list operations). Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: inline consuming the skb at the end of the loopJakub Kicinski1-24/+5
tls_sw_advance_skb() always consumes the skb at the end of the loop. To fall here the following must be true: !async && !is_peek && !retain_skb retain_skb => !zc && rxm->full_len > len # but non-full record implies !zc, so above can be simplified as retain_skb => rxm->full_len > len !async && !is_peek && !(rxm->full_len > len) !async && !is_peek && rxm->full_len <= len tls_sw_advance_skb() returns false if len < rxm->full_len which can't be true given conditions above. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: pull most of zc check out of the loopJakub Kicinski1-4/+5
Most of the conditions deciding if zero-copy can be used do not change throughout the iterations, so pre-calculate them. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: don't track the async countJakub Kicinski1-7/+5
We track both if the last record was handled by async crypto and how many records were async. This is not necessary. We implicitly assume once crypto goes async it will stay that way, otherwise we'd reorder records. So just track if we're in async mode, the exact number of records is not necessary. This change also forces us into "async" mode more consistently in case crypto ever decided to interleave async and sync. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: don't handle async in tls_sw_advance_skb()Jakub Kicinski1-13/+9
tls_sw_advance_skb() caters to the async case when skb argument is NULL. In that case it simply unpauses the strparser. These are surprising semantics to a person reading the code, and result in higher LoC, so inline the __strp_unpause and only call tls_sw_advance_skb() when we actually move past an skb. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: factor out writing ContentType to cmsgJakub Kicinski1-55/+36
cmsg can be filled in during rx_list processing or normal receive. Consolidate the code. We don't need to keep the boolean to track if the cmsg was created. 0 is an invalid content type. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: simplify async waitJakub Kicinski1-12/+2
Since we are protected from async completions by decrypt_compl_lock we can drop the async_notify and reinit the completion before we start waiting. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: wrap decryption arguments in a structureJakub Kicinski1-22/+27
We pass zc as a pointer to bool a few functions down as an in/out argument. This is error prone since C will happily evalue a pointer as a boolean (IOW forgetting *zc and writing zc leads to loss of developer time..). Wrap the arguments into a structure. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: don't report text length from the bowels of decryptJakub Kicinski1-19/+14
We plumb pointer to chunk all the way to the decryption method. It's set to the length of the text when decrypt_skb_update() returns. I think the code is written this way because original TLS implementation passed &chunk to zerocopy_from_iter() and this was carried forward as the code gotten more complex, without any refactoring. The fix for peek() introduced a new variable - to_decrypt which for all practical purposes is what chunk is going to get set to. Spare ourselves the pointer passing, use to_decrypt. Use this opportunity to clean things up a little further. Note that chunk / to_decrypt was mostly needed for the async path, since the sync path would access rxm->full_len (decryption transforms full_len from record size to text size). Use the right source of truth more explicitly. We have three cases: - async - it's TLS 1.2 only, so chunk == to_decrypt, but we need the min() because to_decrypt is a whole record and we don't want to underflow len. Note that we can't handle partial record by falling back to sync as it would introduce reordering against records in flight. - zc - again, TLS 1.2 only for now, so chunk == to_decrypt, we don't do zc if len < to_decrypt, no need to check again. - normal - it already handles chunk > len, we can factor out the assignment to rxm->full_len and share it with zc. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-10tls: rx: drop unnecessary arguments from tls_setup_from_iter()Jakub Kicinski1-8/+6
sk is unused, remove it to make it clear the function doesn't poke at the socket. size_used is always 0 on input and @length on success. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-09netfilter: bitwise: improve error goto labelsJeremy Sowden1-5/+6
Replace two labels (`err1` and `err2`) with more informative ones. Signed-off-by: Jeremy Sowden <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2022-04-09netfilter: bitwise: replace hard-coded size with `sizeof` expressionJeremy Sowden1-1/+1
When calculating the length of an array, use the appropriate `sizeof` expression for its type, rather than an integer literal. Signed-off-by: Jeremy Sowden <[email protected]> Signed-off-by: Florian Westphal <[email protected]>
2022-04-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski1-7/+15
Daniel Borkmann says: ==================== pull-request: bpf-next 2022-04-09 We've added 63 non-merge commits during the last 9 day(s) which contain a total of 68 files changed, 4852 insertions(+), 619 deletions(-). The main changes are: 1) Add libbpf support for USDT (User Statically-Defined Tracing) probes. USDTs are an abstraction built on top of uprobes, critical for tracing and BPF, and widely used in production applications, from Andrii Nakryiko. 2) While Andrii was adding support for x86{-64}-specific logic of parsing USDT argument specification, Ilya followed-up with USDT support for s390 architecture, from Ilya Leoshkevich. 3) Support name-based attaching for uprobe BPF programs in libbpf. The format supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g. attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc") now, from Alan Maguire. 4) Various load/store optimizations for the arm64 JIT to shrink the image size by using arm64 str/ldr immediate instructions. Also enable pointer authentication to verify return address for JITed code, from Xu Kuohai. 5) BPF verifier fixes for write access checks to helper functions, e.g. rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that write into passed buffers, from Kumar Kartikeya Dwivedi. 6) Fix overly excessive stack map allocation for its base map structure and buckets which slipped-in from cleanups during the rlimit accounting removal back then, from Yuntao Wang. 7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter connection tracking tuple direction, from Lorenzo Bianconi. 8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde. 9) Minor cleanups all over the place from various others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits) bpf: Fix excessive memory allocation in stack_map_alloc() selftests/bpf: Fix return value checks in perf_event_stackmap test selftests/bpf: Add CO-RE relos into linked_funcs selftests libbpf: Use weak hidden modifier for USDT BPF-side API functions libbpf: Don't error out on CO-RE relos for overriden weak subprogs samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread libbpf: Allow WEAK and GLOBAL bindings during BTF fixup libbpf: Use strlcpy() in path resolution fallback logic libbpf: Add s390-specific USDT arg spec parsing logic libbpf: Make BPF-side of USDT support work on big-endian machines libbpf: Minor style improvements in USDT code libbpf: Fix use #ifdef instead of #if to avoid compiler warning libbpf: Potential NULL dereference in usdt_manager_attach_usdt() selftests/bpf: Uprobe tests should verify param/return values libbpf: Improve string parsing for uprobe auto-attach libbpf: Improve library identification for uprobe binary path resolution selftests/bpf: Test for writes to map key from BPF helpers selftests/bpf: Test passing rdonly mem to global func bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access ... ==================== Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-08net/sched: fix initialization order when updating chain 0 headMarcelo Ricardo Leitner1-1/+1
Currently, when inserting a new filter that needs to sit at the head of chain 0, it will first update the heads pointer on all devices using the (shared) block, and only then complete the initialization of the new element so that it has a "next" element. This can lead to a situation that the chain 0 head is propagated to another CPU before the "next" initialization is done. When this race condition is triggered, packets being matched on that CPU will simply miss all other filters, and will flow through the stack as if there were no other filters installed. If the system is using OVS + TC, such packets will get handled by vswitchd via upcall, which results in much higher latency and reordering. For other applications it may result in packet drops. This is reproducible with a tc only setup, but it varies from system to system. It could be reproduced with a shared block amongst 10 veth tunnels, and an ingress filter mirroring packets to another veth. That's because using the last added veth tunnel to the shared block to do the actual traffic, it makes the race window bigger and easier to trigger. The fix is rather simple, to just initialize the next pointer of the new filter instance (tp) before propagating the head change. The fixes tag is pointing to the original code though this issue should only be observed when using it unlocked. Fixes: 2190d1d0944f ("net: sched: introduce helpers to work with filter chains") Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Davide Caratti <[email protected]> Link: https://lore.kernel.org/r/b97d5f4eaffeeb9d058155bcab63347527261abf.1649341369.git.marcelo.leitner@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-08sctp: use the correct skb for security_sctp_assoc_requestXin Long1-3/+3
Yi Chen reported an unexpected sctp connection abort, and it occurred when COOKIE_ECHO is bundled with DATA Fragment by SCTP HW GSO. As the IP header is included in chunk->head_skb instead of chunk->skb, it failed to check IP header version in security_sctp_assoc_request(). According to Ondrej, SELinux only looks at IP header (address and IPsec options) and XFRM state data, and these are all included in head_skb for SCTP HW GSO packets. So fix it by using head_skb when calling security_sctp_assoc_request() in processing COOKIE_ECHO. v1->v2: - As Ondrej noticed, chunk->head_skb should also be used for security_sctp_assoc_established() in sctp_sf_do_5_1E_ca(). Fixes: e215dab1c490 ("security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce") Reported-by: Yi Chen <[email protected]> Signed-off-by: Xin Long <[email protected]> Reviewed-by: Ondrej Mosnacek <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Link: https://lore.kernel.org/r/71becb489e51284edf0c11fc15246f4ed4cef5b6.1649337862.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <[email protected]>
2022-04-08Merge tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfsLinus Torvalds6-33/+74
Pull NFS client fixes from Trond Myklebust: "Stable fixes: - SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() Bugfixes: - Fix an Oopsable condition due to SLAB_ACCOUNT setting in the NFSv4.2 xattr code. - Fix for open() using an file open mode of '3' in NFSv4 - Replace readdir's use of xxhash() with hash_64() - Several patches to handle malloc() failure in SUNRPC" * tag 'nfs-for-5.18-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: SUNRPC: Move the call to xprt_send_pagedata() out of xprt_sock_sendmsg() SUNRPC: svc_tcp_sendmsg() should handle errors from xdr_alloc_bvec() SUNRPC: Handle allocation failure in rpc_new_task() NFS: Ensure rpc_run_task() cannot fail in nfs_async_rename() NFSv4/pnfs: Handle RPC allocation errors in nfs4_proc_layoutget SUNRPC: Handle low memory situations in call_status() SUNRPC: Handle ENOMEM in call_transmit_status() NFSv4.2: Fix missing removal of SLAB_ACCOUNT on kmem_cache allocation SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() NFS: Replace readdir's use of xxhash() with hash_64() SUNRPC: handle malloc failure in ->request_prepare NFSv4: fix open failure with O_ACCMODE flag Revert "NFSv4: Handle the special Linux file open access mode"
2022-04-08net/sched: flower: Avoid overwriting error messagesIdo Schimmel1-4/+0
The various error paths of tc_setup_offload_action() now report specific error messages. Remove the generic messages to avoid overwriting the more specific ones. Before: # tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 Error: cls_flower: Failed to setup flow action. We have an error talking to the kernel After: # tc filter add dev dummy0 ingress pref 1 proto ip flower skip_sw dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 Error: act_police: Offload not supported when conform/exceed action is "reclassify". We have an error talking to the kernel Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: matchall: Avoid overwriting error messagesIdo Schimmel1-4/+0
The various error paths of tc_setup_offload_action() now report specific error messages. Remove the generic messages to avoid overwriting the more specific ones. Before: # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel After: # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 Error: act_police: Offload not supported when conform/exceed action is "reclassify". We have an error talking to the kernel Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: cls_api: Add extack message for unsupported action offloadIdo Schimmel1-2/+4
For better error reporting to user space, add an extack message when the requested action does not support offload. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action nat ingress 192.0.2.1 198.51.100.1 Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-181 [000] b..1. 88.406093: netlink_extack: msg=Action does not support offload tc-181 [000] ..... 88.406108: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_vlan: Add extack message for offload failureIdo Schimmel1-0/+1
For better error reporting to user space, add an extack message when vlan action offload fails. Currently, the failure cannot be triggered, but add a message in case the action is extended in the future to support more than the current set of modes. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_tunnel_key: Add extack message for offload failureIdo Schimmel1-0/+1
For better error reporting to user space, add an extack message when tunnel_key action offload fails. Currently, the failure cannot be triggered, but add a message in case the action is extended in the future to support more than set/release modes. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_skbedit: Add extack messages for offload failureIdo Schimmel1-0/+7
For better error reporting to user space, add extack messages when skbedit action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit queue_mapping 1234 Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-185 [002] b..1. 31.802414: netlink_extack: msg=act_skbedit: Offload not supported when "queue_mapping" option is used tc-185 [002] ..... 31.802418: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action skbedit inheritdsfield Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-187 [002] b..1. 45.985145: netlink_extack: msg=act_skbedit: Offload not supported when "inheritdsfield" option is used tc-187 [002] ..... 45.985160: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_police: Add extack messages for offload failureIdo Schimmel1-3/+14
For better error reporting to user space, add extack messages when police action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-182 [000] b..1. 21.592969: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "reclassify" tc-182 [000] ..... 21.592982: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action police rate 100Mbit burst 10000 conform-exceed drop/continue Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-184 [000] b..1. 38.882579: netlink_extack: msg=act_police: Offload not supported when conform/exceed action is "continue" tc-184 [000] ..... 38.882593: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_pedit: Add extack message for offload failureIdo Schimmel1-0/+1
For better error reporting to user space, add an extack message when pedit action offload fails. Currently, the failure cannot be triggered, but add a message in case the action is extended in the future to support more than set/add commands. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_mpls: Add extack messages for offload failureIdo Schimmel1-0/+7
For better error reporting to user space, add extack messages when mpls action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action mpls dec_ttl Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-182 [000] b..1. 18.693915: netlink_extack: msg=act_mpls: Offload not supported when "dec_ttl" option is used tc-182 [000] ..... 18.693921: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_mirred: Add extack message for offload failureIdo Schimmel1-0/+1
For better error reporting to user space, add an extack message when mirred action offload fails. Currently, the failure cannot be triggered, but add a message in case the action is extended in the future to support more than ingress/egress mirror/redirect. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_gact: Add extack messages for offload failureIdo Schimmel1-0/+10
For better error reporting to user space, add extack messages when gact action offload fails. Example: # echo 1 > /sys/kernel/tracing/events/netlink/netlink_extack/enable # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action continue Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-181 [002] b..1. 105.493450: netlink_extack: msg=act_gact: Offload of "continue" action is not supported tc-181 [002] ..... 105.493466: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action reclassify Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-183 [002] b..1. 124.126477: netlink_extack: msg=act_gact: Offload of "reclassify" action is not supported tc-183 [002] ..... 124.126489: netlink_extack: msg=cls_matchall: Failed to setup flow action # tc filter add dev dummy0 ingress pref 1 proto all matchall skip_sw action pipe action drop Error: cls_matchall: Failed to setup flow action. We have an error talking to the kernel # cat /sys/kernel/tracing/trace_pipe tc-185 [002] b..1. 137.097791: netlink_extack: msg=act_gact: Offload of "pipe" action is not supported tc-185 [002] ..... 137.097804: netlink_extack: msg=cls_matchall: Failed to setup flow action Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: act_api: Add extack to offload_act_setup() callbackIdo Schimmel16-24/+44
The callback is used by various actions to populate the flow action structure prior to offload. Pass extack to this callback so that the various actions will be able to report accurate error messages to user space. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: flower: Take verbose flag into account when logging error messagesIdo Schimmel1-6/+6
The verbose flag was added in commit 81c7288b170a ("sched: cls: enable verbose logging") to avoid suppressing logging of error messages that occur "when the rule is not to be exclusively executed by the hardware". However, such error messages are currently suppressed when setup of flow action fails. Take the verbose flag into account to avoid suppressing error messages. This is done by using the extack pointer initialized by tc_cls_common_offload_init(), which performs the necessary checks. Before: # tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 # tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 After: # tc filter add dev dummy0 ingress pref 1 proto ip flower dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 # tc filter add dev dummy0 ingress pref 2 proto ip flower verbose dst_ip 198.51.100.1 action police rate 100Mbit burst 10000 Warning: cls_flower: Failed to setup flow action. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: matchall: Take verbose flag into account when logging error messagesIdo Schimmel1-10/+7
The verbose flag was added in commit 81c7288b170a ("sched: cls: enable verbose logging") to avoid suppressing logging of error messages that occur "when the rule is not to be exclusively executed by the hardware". However, such error messages are currently suppressed when setup of flow action fails. Take the verbose flag into account to avoid suppressing error messages. This is done by using the extack pointer initialized by tc_cls_common_offload_init(), which performs the necessary checks. Signed-off-by: Ido Schimmel <[email protected]> Reviewed-by: Petr Machata <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08netfilter: nf_log_syslog: Consolidate entry checksPhil Sutter1-8/+10
Every syslog logging callback has to perform the same check to cover for rogue containers, introduce a helper for clarity. Drop the FIXME as there is a viable solution since commit 2851940ffee31 ("netfilter: allow logging from non-init namespaces"). Suggested-by: Florian Westphal <[email protected]> Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-08netfilter: nf_log_syslog: Don't ignore unknown protocolsPhil Sutter1-0/+31
With netdev and bridge nfprotos, loggers may see arbitrary ethernet frames. Print at least basic info like interfaces and MAC header data. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-08netfilter: nf_log_syslog: Merge MAC header dumpersPhil Sutter1-66/+25
The functions for IPv4 and IPv6 were almost identical apart from extra SIT tunnel device handling in the latter. Signed-off-by: Phil Sutter <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2022-04-08flow_dissector: fix false-positive __read_overflow2_field() warningJakub Kicinski1-1/+1
Bounds checking is unhappy that we try to copy both Ethernet addresses but pass pointer to the first one. Luckily destination address is the first field so pass the pointer to the entire header, whatever. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: Kees Cook <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08net/sched: flower: fix parsing of ethertype following VLAN headerVlad Buslov2-5/+14
A tc flower filter matching TCA_FLOWER_KEY_VLAN_ETH_TYPE is expected to match the L2 ethertype following the first VLAN header, as confirmed by linked discussion with the maintainer. However, such rule also matches packets that have additional second VLAN header, even though filter has both eth_type and vlan_ethtype set to "ipv4". Looking at the code this seems to be mostly an artifact of the way flower uses flow dissector. First, even though looking at the uAPI eth_type and vlan_ethtype appear like a distinct fields, in flower they are all mapped to the same key->basic.n_proto. Second, flow dissector skips following VLAN header as no keys for FLOW_DISSECTOR_KEY_CVLAN are set and eventually assigns the value of n_proto to last parsed header. With these, such filters ignore any headers present between first VLAN header and first "non magic" header (ipv4 in this case) that doesn't result FLOW_DISSECT_RET_PROTO_AGAIN. Fix the issue by extending flow dissector VLAN key structure with new 'vlan_eth_type' field that matches first ethertype following previously parsed VLAN header. Modify flower classifier to set the new flow_dissector_key_vlan->vlan_eth_type with value obtained from TCA_FLOWER_KEY_VLAN_ETH_TYPE/TCA_FLOWER_KEY_CVLAN_ETH_TYPE uAPIs. Link: https://lore.kernel.org/all/Yjhgi48BpTGh6dig@nanopsycho/ Fixes: 9399ae9a6cb2 ("net_sched: flower: Add vlan support") Fixes: d64efd0926ba ("net/sched: flower: Add supprt for matching on QinQ vlan headers") Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08tls: hw: rx: use return value of tls_device_decrypted() to carry statusJakub Kicinski2-8/+4
Instead of tls_device poking into internals of the message return 1 from tls_device_decrypted() if the device handled the decryption. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08tls: rx: refactor decrypt_skb_update()Jakub Kicinski1-33/+33
Use early return and a jump label to remove two indentation levels. No functional changes. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08tls: rx: don't issue wake ups when data is decryptedJakub Kicinski1-2/+0
We inform the applications that data is available when the record is received. Decryption happens inline inside recvmsg or splice call. Generating another wakeup inside the decryption handler seems pointless as someone must be actively reading the socket if we are executing this code. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2022-04-08tls: rx: replace 'back' with 'offset'Jakub Kicinski1-5/+4
The padding length TLS 1.3 logic is searching for content_type from the end of text. IMHO the code is easier to parse if we calculate offset and decrement it rather than try to maintain positive offset from the end of the record called "back". Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: David S. Miller <[email protected]>