Age | Commit message (Collapse) | Author | Files | Lines |
|
Ensure the match happens in the right direction, previously the
destination used was the server, not the NAT host, as the comment
shows the code intended.
Additionally nf_nat_irc uses port 0 as a signal and there's no valid way
it can appear in a DCC message, so consider port 0 also forged.
Fixes: 869f37d8e48f ("[NETFILTER]: nf_conntrack/nf_nat: add IRC helper port")
Signed-off-by: David Leadbeater <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
hci_abort_conn() is a wrapper around a number of DISCONNECT and
CREATE_CONN_CANCEL commands that was being invoked from hci_request
request queues, which are now deprecated. There are two versions:
hci_abort_conn() which can be invoked from the hci_event thread, and
hci_abort_conn_sync() which can be invoked within a hci_sync cmd chain.
Signed-off-by: Brian Gix <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
|
|
The HCI_OP_READ_ENC_KEY_SIZE command is converted from using the
deprecated hci_request mechanism to use hci_send_cmd, with an
accompanying hci_cc_read_enc_key_size to handle it's return response.
Signed-off-by: Brian Gix <[email protected]>
Signed-off-by: Luiz Augusto von Dentz <[email protected]>
|
|
When the GSO splitting feature of sch_cake is enabled, GSO superpackets
will be broken up and the resulting segments enqueued in place of the
original skb. In this case, CAKE calls consume_skb() on the original skb,
but still returns NET_XMIT_SUCCESS. This can confuse parent qdiscs into
assuming the original skb still exists, when it really has been freed. Fix
this by adding the __NET_XMIT_STOLEN flag to the return value in this case.
Fixes: 0c850344d388 ("sch_cake: Conditionally split GSO segments")
Signed-off-by: Toke Høiland-Jørgensen <[email protected]>
Reported-by: [email protected] # ZDI-CAN-18231
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When bpf prog changes tcp-cc by calling bpf_setsockopt(TCP_CONGESTION),
it should not try to load module which may be a blocking operation.
This details was correct in the v1 [0] but missed by mistake in the
later revision in commit cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt()
to use the sockopt's lock_sock() and capable()"). This patch fixes it by
checking the has_current_bpf_ctx().
[0] https://lore.kernel.org/bpf/[email protected]/
Fixes: cb388e7ee3a8 ("bpf: net: Change do_tcp_setsockopt() to use the sockopt's lock_sock() and capable()")
Signed-off-by: Martin KaFai Lau <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
strp_init() is called just a few lines above this csk->sk_user_data
check, it also initializes strp->work etc., therefore, it is
unnecessary to call strp_done() to cancel the freshly initialized
work.
And if sk_user_data is already used by KCM, psock->strp should not be
touched, particularly strp->work state, so we need to move strp_init()
after the csk->sk_user_data check.
This also makes a lockdep warning reported by syzbot go away.
Reported-and-tested-by: [email protected]
Reported-by: [email protected]
Fixes: e5571240236c ("kcm: Check if sk_user_data already set in kcm_attach")
Fixes: dff8baa26117 ("kcm: Call strp_stop before strp_done in kcm_attach")
Cc: Tom Herbert <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Commit d678cbd2f867 ("xsk: Fix handling of invalid descriptors in XSK TX
batching API") fixed batch API usage against set of descriptors with
invalid ones but introduced a problem when AF_XDP SW rings are smaller
than HW ones. Mismatch of reported Tx'ed frames between HW generator and
user space app was observed. It turned out that backpressure mechanism
became a bottleneck when the amount of produced descriptors to CQ is
lower than what we grabbed from XSK Tx ring.
Say that 512 entries had been taken from XSK Tx ring but we had only 490
free entries in CQ. Then callsite (ZC driver) will produce only 490
entries onto HW Tx ring but 512 entries will be released from Tx ring
and this is what will be seen by the user space.
In order to fix this case, mix XSK Tx/CQ ring interractions by moving
around internal functions and changing call order:
* pull out xskq_prod_nb_free() from xskq_prod_reserve_addr_batch()
up to xsk_tx_peek_release_desc_batch();
** move xskq_cons_release_n() into xskq_cons_read_desc_batch()
After doing so, algorithm can be described as follows:
1. lookup Tx entries
2. use value from 1. to reserve space in CQ (*)
3. Read from Tx ring as much descriptors as value from 2
3a. release descriptors from XSK Tx ring (**)
4. Finally produce addresses to CQ
Fixes: d678cbd2f867 ("xsk: Fix handling of invalid descriptors in XSK TX batching API")
Signed-off-by: Magnus Karlsson <[email protected]>
Signed-off-by: Maciej Fijalkowski <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
splice back the hook list so nft_chain_release_hook() has a chance to
release the hooks.
BUG: memory leak
unreferenced object 0xffff88810180b100 (size 96):
comm "syz-executor133", pid 3619, jiffies 4294945714 (age 12.690s)
hex dump (first 32 bytes):
28 64 23 02 81 88 ff ff 28 64 23 02 81 88 ff ff (d#.....(d#.....
90 a8 aa 83 ff ff ff ff 00 00 b5 0f 81 88 ff ff ................
backtrace:
[<ffffffff83a8c59b>] kmalloc include/linux/slab.h:600 [inline]
[<ffffffff83a8c59b>] nft_netdev_hook_alloc+0x3b/0xc0 net/netfilter/nf_tables_api.c:1901
[<ffffffff83a9239a>] nft_chain_parse_netdev net/netfilter/nf_tables_api.c:1998 [inline]
[<ffffffff83a9239a>] nft_chain_parse_hook+0x33a/0x530 net/netfilter/nf_tables_api.c:2073
[<ffffffff83a9b14b>] nf_tables_addchain.constprop.0+0x10b/0x950 net/netfilter/nf_tables_api.c:2218
[<ffffffff83a9c41b>] nf_tables_newchain+0xa8b/0xc60 net/netfilter/nf_tables_api.c:2593
[<ffffffff83a3d6a6>] nfnetlink_rcv_batch+0xa46/0xd20 net/netfilter/nfnetlink.c:517
[<ffffffff83a3db79>] nfnetlink_rcv_skb_batch net/netfilter/nfnetlink.c:638 [inline]
[<ffffffff83a3db79>] nfnetlink_rcv+0x1f9/0x220 net/netfilter/nfnetlink.c:656
[<ffffffff83a13b17>] netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
[<ffffffff83a13b17>] netlink_unicast+0x397/0x4c0 net/netlink/af_netlink.c:1345
[<ffffffff83a13fd6>] netlink_sendmsg+0x396/0x710 net/netlink/af_netlink.c:1921
[<ffffffff83865ab6>] sock_sendmsg_nosec net/socket.c:714 [inline]
[<ffffffff83865ab6>] sock_sendmsg+0x56/0x80 net/socket.c:734
[<ffffffff8386601c>] ____sys_sendmsg+0x36c/0x390 net/socket.c:2482
[<ffffffff8386a918>] ___sys_sendmsg+0xa8/0x110 net/socket.c:2536
[<ffffffff8386aaa8>] __sys_sendmsg+0x88/0x100 net/socket.c:2565
[<ffffffff845e5955>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff845e5955>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
[<ffffffff84800087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: d54725cd11a5 ("netfilter: nf_tables: support for multiple devices per netdev hook")
Reported-by: [email protected]
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
The IPv6 path already drops dst in the daddr changed case, but the IPv4
path does not. This change makes the two code paths consistent.
Further, it is possible that there is already a metadata_dst allocated from
ingress that might already be attached to skbuff->dst while following
the bridge path. If it is not released before setting a new
metadata_dst, it will be leaked. This is similar to what is done in
bpf_set_tunnel_key() or ip6_route_input().
It is important to note that the memory being leaked is not the dst
being set in the bridge code, but rather memory allocated from some
other code path that is not being freed correctly before the skb dst is
overwritten.
An example of the leakage fixed by this commit found using kmemleak:
unreferenced object 0xffff888010112b00 (size 256):
comm "softirq", pid 0, jiffies 4294762496 (age 32.012s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 80 16 f1 83 ff ff ff ff ................
e1 4e f6 82 ff ff ff ff 00 00 00 00 00 00 00 00 .N..............
backtrace:
[<00000000d79567ea>] metadata_dst_alloc+0x1b/0xe0
[<00000000be113e13>] udp_tun_rx_dst+0x174/0x1f0
[<00000000a36848f4>] geneve_udp_encap_recv+0x350/0x7b0
[<00000000d4afb476>] udp_queue_rcv_one_skb+0x380/0x560
[<00000000ac064aea>] udp_unicast_rcv_skb+0x75/0x90
[<000000009a8ee8c5>] ip_protocol_deliver_rcu+0xd8/0x230
[<00000000ef4980bb>] ip_local_deliver_finish+0x7a/0xa0
[<00000000d7533c8c>] __netif_receive_skb_one_core+0x89/0xa0
[<00000000a879497d>] process_backlog+0x93/0x190
[<00000000e41ade9f>] __napi_poll+0x28/0x170
[<00000000b4c0906b>] net_rx_action+0x14f/0x2a0
[<00000000b20dd5d4>] __do_softirq+0xf4/0x305
[<000000003a7d7e15>] __irq_exit_rcu+0xc3/0x140
[<00000000968d39a2>] sysvec_apic_timer_interrupt+0x9e/0xc0
[<000000009e920794>] asm_sysvec_apic_timer_interrupt+0x16/0x20
[<000000008942add0>] native_safe_halt+0x13/0x20
Florian Westphal says: "Original code was likely fine because nothing
ever did set a skb->dst entry earlier than bridge in those days."
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Harsh Modi <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
__nf_ct_try_assign_helper() remains in place but it now requires a
template to configure the helper.
A toggle to disable automatic helper assignment was added by:
a9006892643a ("netfilter: nf_ct_helper: allow to disable automatic helper assignment")
in 2012 to address the issues described in "Secure use of iptables and
connection tracking helpers". Automatic conntrack helper assignment was
disabled by:
3bb398d925ec ("netfilter: nf_ct_helper: disable automatic helper assignment")
back in 2016.
This patch removes the sysctl and modparam toggles, users now have to
rely on explicit conntrack helper configuration via ruleset.
Update tools/testing/selftests/netfilter/nft_conntrack_helper.sh to
check that auto-assignment does not happen anymore.
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
xfrm_user_rcv_msg() already handles extack, we just need to pass it down.
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
'unsigned int' is better than 'unsigned'.
Signed-off-by: Xin Gao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This reverts commit 6005a8aecee8afeba826295321a612ab485c230e.
The assertion was intentionally removed in commit 043b8413e8c0 ("net:
devlink: remove redundant rtnl lock assert") and, contrary what is
described in the commit message, the comment reflects that: "Caller must
hold RTNL mutex or reference to dev...".
Signed-off-by: Vlad Buslov <[email protected]>
Tested-by: Leon Romanovsky <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
Stefan Schmidt says:
====================
ieee802154 for net 2022-08-29
- repeated word fix from Jilin Yuan.
- missed return code setting in the cc2520 driver by Li Qiong.
- fixing a potential race in by defering the workqueue destroy
in the adf7242 driver by Lin Ma.
- fixing a long standing problem in the mac802154 rx path to match
corretcly by Miquel Raynal.
* tag 'ieee802154-for-net-2022-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
ieee802154: cc2520: add rc code in cc2520_tx()
net: mac802154: Fix a condition in the receive path
net/ieee802154: fix repeated words in comments
ieee802154/adf7242: defer destroy_workqueue call
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In attach_default_qdiscs(), if a dev has multiple queues and queue 0 fails
to attach qdisc because there is no memory in attach_one_default_qdisc().
Then dev->qdisc will be noop_qdisc by default. But the other queues may be
able to successfully attach to default qdisc.
In this case, the fallback to noqueue process will be triggered. If the
original attached qdisc is not released and a new one is directly
attached, this will cause netdevice reference leaks.
The following is the bug log:
veth0: default qdisc (fq_codel) fail, fallback to noqueue
unregister_netdevice: waiting for veth0 to become free. Usage count = 32
leaked reference.
qdisc_alloc+0x12e/0x210
qdisc_create_dflt+0x62/0x140
attach_one_default_qdisc.constprop.41+0x44/0x70
dev_activate+0x128/0x290
__dev_open+0x12a/0x190
__dev_change_flags+0x1a2/0x1f0
dev_change_flags+0x23/0x60
do_setlink+0x332/0x1150
__rtnl_newlink+0x52f/0x8e0
rtnl_newlink+0x43/0x70
rtnetlink_rcv_msg+0x140/0x3b0
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x1bb/0x290
netlink_sendmsg+0x37c/0x4e0
sock_sendmsg+0x5f/0x70
____sys_sendmsg+0x208/0x280
Fix this bug by clearing any non-noop qdiscs that may have been assigned
before trying to re-attach.
Fixes: bf6dba76d278 ("net: sched: fallback to qdisc noqueue if default qdisc setup fail")
Signed-off-by: Wang Hai <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
There should be no reason to adjust old ktermios which is going to get
discarded anyway.
Reviewed-by: Andy Shevchenko <[email protected]>
Signed-off-by: Ilpo Järvinen <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
|
|
Follow-up the removal of unused internal api of port params made by
commit 42ded61aa75e ("devlink: Delete not used port parameters APIs")
and stub the commands and add extack message to tell the user what is
going on.
If later on port params are needed, could be easily re-introduced,
but until then it is a dead code.
Signed-off-by: Jiri Pirko <[email protected]>
Reviewed-by: Jakub Kicinski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Using TCQ_MIN_PRIO_BANDS instead of magic number in prio_tune().
Signed-off-by: Zhengchao Shao <[email protected]>
Acked-by: Cong Wang <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The actual presence check for the header is in
ethnl_parse_header_dev_get() but it's a few layers in,
and already has a ton of arguments so let's just pick
the low hanging fruit and check for missing header in
the default request handler.
Reviewed-by: Ido Schimmel <[email protected]>
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Strset needs ETHTOOL_A_STRINGSET_ID, use it as an example of
reporting attrs missing in nests.
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Devlink with its global attr policy has a lot of attribute
presence check, use the new ext ack reporting when they are
missing.
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
There is currently no way to report via extack in a structured way
that an attribute is missing. This leads to families resorting to
string messages.
Add a pair of attributes - @offset and @type for machine-readable
way of reporting missing attributes. The @offset points to the
nest which should have contained the attribute, @type is the
expected nla_type. The offset will be skipped if the attribute
is missing at the message level rather than inside a nest.
User space should be able to figure out which attribute enum
(AKA attribute space AKA attribute set) the nest pointed to by
@offset is using.
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The ext_ack writing code looks very "organically grown".
Move the calculation of the size and writing out to helpers.
This is more idiomatic and gives us the ability to return early
avoiding the long (and randomly ordered) "if" conditions.
Reviewed-by: Johannes Berg <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Paolo Abeni <[email protected]>
|
|
Consolidate alloclen and pagedlen calculation for zerocopy and normal
paged requests. The current non-zerocopy paged version can a bit
overallocate and unnecessary copy a small chunk of data into the linear
part.
Cc: Willem de Bruijn <[email protected]>
Link: https://lore.kernel.org/netdev/CA+FuTSf0+cJ9_N_xrHmCGX_KoVCWcE0YQBdtgEkzGvcLMSv7Qw@mail.gmail.com/
Signed-off-by: Pavel Begunkov <[email protected]>
Link: https://lore.kernel.org/r/b0e4edb7b91f171c7119891d3c61040b8c56596e.1661428921.git.asml.silence@gmail.com
Signed-off-by: Paolo Abeni <[email protected]>
|
|
The issue is the same to commit c2999f7fb05b ("net: sched: multiq: don't
call qdisc_put() while holding tree lock"). Qdiscs call qdisc_put() while
holding sch tree spinlock, which results sleeping-while-atomic BUG.
Fixes: c266f64dbfa2 ("net: sched: protect block state with mutex")
Signed-off-by: Zhengchao Shao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
|
|
On 32bit-UP u64_stats_fetch_begin() disables only preemption. If the
reader is in preemptible context and the writer side
(u64_stats_update_begin*()) runs in an interrupt context (IRQ or
softirq) then the writer can update the stats during the read operation.
This update remains undetected.
Use u64_stats_fetch_begin_irq() to ensure the stats fetch on 32bit-UP
are not interrupted by a writer. 32bit-SMP remains unaffected by this
change.
Cc: "David S. Miller" <[email protected]>
Cc: Catherine Sullivan <[email protected]>
Cc: David Awogbemila <[email protected]>
Cc: Dimitris Michailidis <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Hans Ulli Kroll <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Jeroen de Borst <[email protected]>
Cc: Johannes Berg <[email protected]>
Cc: Linus Walleij <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.
One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.
To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.
Signed-off-by: Jakub Kicinski <[email protected]>
Acked-by: Paul Moore <[email protected]> (NetLabel)
Signed-off-by: David S. Miller <[email protected]>
|
|
Add the missing unlock before return in the error handling case.
Fixes: ccdde7c74ffd ("wifi: mac80211: properly implement MLO key handling")
Signed-off-by: Sun Ke <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Upon reception, a packet must be categorized, either it's destination is
the host, or it is another host. A packet with no destination addressing
fields may be valid in two situations:
- the packet has no source field: only ACKs are built like that, we
consider the host as the destination.
- the packet has a valid source field: it is directed to the PAN
coordinator, as for know we don't have this information we consider we
are not the PAN coordinator.
There was likely a copy/paste error made during a previous cleanup
because the if clause is now containing exactly the same condition as in
the switch case, which can never be true. In the past the destination
address was used in the switch and the source address was used in the
if, which matches what the spec says.
Cc: [email protected]
Fixes: ae531b9475f6 ("ieee802154: use ieee802154_addr instead of *_sa variants")
Signed-off-by: Miquel Raynal <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Stefan Schmidt <[email protected]>
|
|
Allow specifying the xfrm interface if_id and link as part of a route
metadata using the lwtunnel infrastructure.
This allows for example using a single xfrm interface in collect_md
mode as the target of multiple routes each specifying a different if_id.
With the appropriate changes to iproute2, considering an xfrm device
ipsec1 in collect_md mode one can for example add a route specifying
an if_id like so:
ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1
In which case traffic routed to the device via this route would use
if_id in the xfrm interface policy lookup.
Or in the context of vrf, one can also specify the "link" property:
ip route add <SUBNET> dev ipsec1 encap xfrm if_id 1 link_dev eth15
Note: LWT_XFRM_LINK uses NLA_U32 similar to IFLA_XFRM_LINK even though
internally "link" is signed. This is consistent with other _LINK
attributes in other devices as well as in bpf and should not have an
effect as device indexes can't be negative.
Reviewed-by: Nicolas Dichtel <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Eyal Birger <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
This commit adds support for 'collect_md' mode on xfrm interfaces.
Each net can have one collect_md device, created by providing the
IFLA_XFRM_COLLECT_METADATA flag at creation. This device cannot be
altered and has no if_id or link device attributes.
On transmit to this device, the if_id is fetched from the attached dst
metadata on the skb. If exists, the link property is also fetched from
the metadata. The dst metadata type used is METADATA_XFRM which holds
these properties.
On the receive side, xfrmi_rcv_cb() populates a dst metadata for each
packet received and attaches it to the skb. The if_id used in this case is
fetched from the xfrm state, and the link is fetched from the incoming
device. This information can later be used by upper layers such as tc,
ebpf, and ip rules.
Because the skb is scrubed in xfrmi_rcv_cb(), the attachment of the dst
metadata is postponed until after scrubing. Similarly, xfrm_input() is
adapted to avoid dropping metadata dsts by only dropping 'valid'
(skb_valid_dst(skb) == true) dsts.
Policy matching on packets arriving from collect_md xfrmi devices is
done by using the xfrm state existing in the skb's sec_path.
The xfrm_if_cb.decode_cb() interface implemented by xfrmi_decode_session()
is changed to keep the details of the if_id extraction tucked away
in xfrm_interface.c.
Reviewed-by: Nicolas Dichtel <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: Eyal Birger <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Commit 23c7f8d7989e ("net: Fix esp GSO on inter address family
tunnels.") is incomplete. It passes to skb_eth_gso_segment the
protocol for the outer IP version, instead of the inner IP version, so
we end up calling inet_gso_segment on an inner IPv6 packet and
ipv6_gso_segment on an inner IPv4 packet and the packets are dropped.
This patch completes the fix by selecting the correct protocol based
on the inner mode's family.
Fixes: c35fe4106b92 ("xfrm: Add mode handlers for IPsec on layer 2")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Steffen Klassert <[email protected]>
|
|
We no longer allow "handle" to be zero, so there is no need to check
for that.
Signed-off-by: Dan Carpenter <[email protected]>
Link: https://lore.kernel.org/r/Ywd4NIoS4aiilnMv@kili
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
CRIU needs OVS_DP_ATTR_PER_CPU_PIDS to checkpoint/restore newest
openvswitch versions.
Add pids to generic datapath reply. Limit exported pids amount to
nr_cpu_ids.
Signed-off-by: Andrey Zhadchenko <[email protected]>
Acked-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
CRIU is preserving ifindexes of net devices after restoration. However,
current Open vSwitch API does not allow to target ifindex, so we cannot
correctly restore OVS configuration.
Add new OVS_DP_ATTR_IFINDEX for OVS_DP_CMD_NEW and use it as desired
ifindex.
Use OVS_VPORT_ATTR_IFINDEX during OVS_VPORT_CMD_NEW to specify new netdev
ifindex.
Signed-off-by: Andrey Zhadchenko <[email protected]>
Acked-by: Christian Brauner (Microsoft) <[email protected]>
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
ovs_dp_cmd_new()->ovs_dp_change()->ovs_dp_set_upcall_portids()
allocates array via kmalloc.
If for some reason new_vport() fails during ovs_dp_cmd_new()
dp->upcall_portids must be freed.
Add missing kfree.
Kmemleak example:
unreferenced object 0xffff88800c382500 (size 64):
comm "dump_state", pid 323, jiffies 4294955418 (age 104.347s)
hex dump (first 32 bytes):
5e c2 79 e4 1f 7a 38 c7 09 21 38 0c 80 88 ff ff ^.y..z8..!8.....
03 00 00 00 0a 00 00 00 14 00 00 00 28 00 00 00 ............(...
backtrace:
[<0000000071bebc9f>] ovs_dp_set_upcall_portids+0x38/0xa0
[<000000000187d8bd>] ovs_dp_change+0x63/0xe0
[<000000002397e446>] ovs_dp_cmd_new+0x1f0/0x380
[<00000000aa06f36e>] genl_family_rcv_msg_doit+0xea/0x150
[<000000008f583bc4>] genl_rcv_msg+0xdc/0x1e0
[<00000000fa10e377>] netlink_rcv_skb+0x50/0x100
[<000000004959cece>] genl_rcv+0x24/0x40
[<000000004699ac7f>] netlink_unicast+0x23e/0x360
[<00000000c153573e>] netlink_sendmsg+0x24e/0x4b0
[<000000006f4aa380>] sock_sendmsg+0x62/0x70
[<00000000d0068654>] ____sys_sendmsg+0x230/0x270
[<0000000012dacf7d>] ___sys_sendmsg+0x88/0xd0
[<0000000011776020>] __sys_sendmsg+0x59/0xa0
[<000000002e8f2dc1>] do_syscall_64+0x3b/0x90
[<000000003243e7cb>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: b83d23a2a38b ("openvswitch: Introduce per-cpu upcall dispatch")
Acked-by: Aaron Conole <[email protected]>
Signed-off-by: Andrey Zhadchenko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In genl_bind(), currently genl_lock and write cb_lock are taken
for iteration of genl_fam_idr and processing of static values
stored in struct genl_family. Take just read cb_lock for this task
as it is sufficient to guard the idr and the struct against
concurrent genl_register/unregister_family() calls.
This will allow to run genl command processing in genl_rcv() and
mnl_socket_setsockopt(.., NETLINK_ADD_MEMBERSHIP, ..) in parallel.
Reported-by: Vikas Gupta <[email protected]>
Signed-off-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Similar to devlink_compat_phys_port_name_get(), make sure that
devlink_compat_switch_id_get() is called with RTNL lock held. Comment
already says so, so put this in code as well.
Signed-off-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix handling of duplicate connection handle
- Fix handling of HCI vendor opcode
- Fix suspend performance regression
- Fix build errors
- Fix not handling shutdown condition on ISO sockets
- Fix double free issue
* tag 'for-net-2022-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: hci_sync: hold hdev->lock when cleanup hci_conn
Bluetooth: move from strlcpy with unused retval to strscpy
Bluetooth: hci_event: Fix checking conn for le_conn_complete_evt
Bluetooth: ISO: Fix not handling shutdown condition
Bluetooth: hci_sync: fix double mgmt_pending_free() in remove_adv_monitor()
Bluetooth: MGMT: Fix Get Device Flags
Bluetooth: L2CAP: Fix build errors in some archs
Bluetooth: hci_sync: Fix suspend performance regression
Bluetooth: hci_event: Fix vendor (unknown) opcode status handling
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Daniel borkmann says:
====================
The following pull-request contains BPF updates for your *net* tree.
We've added 11 non-merge commits during the last 14 day(s) which contain
a total of 13 files changed, 61 insertions(+), 24 deletions(-).
The main changes are:
1) Fix BPF verifier's precision tracking around BPF ring buffer, from Kumar Kartikeya Dwivedi.
2) Fix regression in tunnel key infra when passing FLOWI_FLAG_ANYSRC, from Eyal Birger.
3) Fix insufficient permissions for bpf_sys_bpf() helper, from YiFei Zhu.
4) Fix splat from hitting BUG when purging effective cgroup programs, from Pu Lehui.
5) Fix range tracking for array poke descriptors, from Daniel Borkmann.
6) Fix corrupted packets for XDP_SHARED_UMEM in aligned mode, from Magnus Karlsson.
7) Fix NULL pointer splat in BPF sockmap sk_msg_recvmsg(), from Liu Jian.
8) Add READ_ONCE() to bpf_jit_limit when reading from sysctl, from Kuniyuki Iwashima.
9) Add BPF selftest lru_bug check to s390x deny list, from Daniel Müller.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The memory allocated by using kzallloc_node and kcalloc has been cleared.
Therefore, the structure members of the new qdisc are 0. So there's no
need to explicitly assign a value of 0.
Signed-off-by: Zhengchao Shao <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Johannes berg says:
====================
Various updates:
* rtw88: operation, locking, warning, and code style fixes
* rtw89: small updates
* cfg80211/mac80211: more EHT/MLO (802.11be, WiFi 7) work
* brcmfmac: a couple of fixes
* misc cleanups etc.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
pull-request: wireless-2022-08-26
Here are a couple of fixes for the current cycle,
see the tag description below.
Just a couple of fixes:
* two potential leaks
* use-after-free in certain scan races
* warning in IBSS code
* error return from a debugfs file was wrong
* possible NULL-ptr-deref when station lookup fails
Please pull and let me know if there's any problem.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
The full 'unsigned int' is better than 'unsigned'.
Signed-off-by: Xin Gao <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
[fix indentation]
Signed-off-by: Johannes Berg <[email protected]>
|
|
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.
Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <[email protected]>
Signed-off-by: Johannes Berg <[email protected]>
|