|
Connlabels are included in conntrack netlink event messages only if
the IPCT_LABEL bit is set in the event cache (see
ctnetlink_conntrack_event()). Set it after initializing labels for a
new connection.
Found upon further system testing, where it was noticed that labels
were missing from the conntrack events.
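A minimal sketch of the idea (not the literal patch; the helper name below is
made up for illustration) is to record the label change in the event cache
right where the labels of the new, still-unconfirmed connection are set up:

  #include <net/netfilter/nf_conntrack_ecache.h>
  #include <net/netfilter/nf_conntrack_labels.h>
  #include <linux/openvswitch.h>

  static void ovs_ct_init_labels_sketch(struct nf_conn *ct,
                                        const struct ovs_key_ct_labels *labels)
  {
          struct nf_conn_labels *cl = nf_ct_labels_find(ct);

          if (cl)
                  memcpy(cl->bits, labels, OVS_CT_LABELS_LEN);

          /* The actual point of the fix: mark the labels as changed so
           * ctnetlink_conntrack_event() includes CTA_LABELS once the
           * connection is confirmed. */
          nf_conntrack_event_cache(IPCT_LABEL, ct);
  }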
Fixes: 193e30967897 ("openvswitch: Do not trigger events for unconfirmed connections.")
Signed-off-by: Jarno Rajahalme <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
sctp has changed to use rhlist for transport rhashtable since commit
7fda702f9315 ("sctp: use new rhlist interface on sctp transport
rhashtable").
But rhltable_insert_key, unlike rhashtable_lookup_insert_key, doesn't
check for a duplicate node when inserting one. This may result in
duplicate assocs/transports in the rhashtable, like:
  client (addr A, B)                  server (addr X, Y)

  connect to X    ---- INIT (1) ---------->
  connect to Y    ---- INIT (2) ---------->
                  <--- INIT_ACK (1) -------
                  <--- INIT_ACK (2) -------
After sending INIT (2), one transport is created and hashed into the
rhashtable. But when INIT_ACK (1) is received and the address params
are processed, another transport is created and hashed into the
rhashtable with the same addr Y and EP as the previous transport. This
confuses assoc/transport lookups.
This patch fixes it by returning an error if a duplicate node already
exists, before inserting the new one.
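A rough sketch of the approach (names and locking simplified; the hypothetical
helper below only illustrates the lookup-before-insert shape, the real code
does this under the appropriate address lock):

  static int sctp_hash_transport_sketch(struct sctp_transport *t,
                                        const void *key)
  {
          struct rhlist_head *tmp, *list;
          struct sctp_transport *pos;

          rcu_read_lock();
          list = rhltable_lookup(&sctp_transport_hashtable, key,
                                 sctp_hash_params);
          rhl_for_each_entry_rcu(pos, tmp, list, node)
                  if (pos->asoc->ep == t->asoc->ep) {
                          rcu_read_unlock();
                          return -EEXIST; /* duplicate, refuse to insert */
                  }
          rcu_read_unlock();

          return rhltable_insert_key(&sctp_transport_hashtable, key,
                                     &t->node, sctp_hash_params);
  }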
Fixes: 7fda702f9315 ("sctp: use new rhlist interface on sctp transport rhashtable")
Reported-by: Fabio M. Di Nitto <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds the reconf chunk event, based on the sctp event frame
in the rx path; it will call sctp_sf_do_reconf to process the reconf
chunk.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds a function to process an incoming reconf chunk: it
verifies the chunk, then traverses the params and processes each one
with the right function.
sctp_sf_do_reconf will be the processing function for the reconf chunk
event.
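A hedged sketch of the traversal idea (not the actual function; the
chunk_params/params_len variables and the per-parameter handling are
placeholders):

  struct sctp_paramhdr *param;
  void *pos = chunk_params;              /* first TLV in the reconf chunk */
  void *end = chunk_params + params_len; /* end of the chunk payload      */

  while (pos + sizeof(*param) <= end) {
          param = pos;

          switch (param->type) {
          case SCTP_PARAM_RESET_OUT_REQUEST:
                  /* process outgoing SSN reset request */
                  break;
          case SCTP_PARAM_RESET_IN_REQUEST:
                  /* process incoming SSN reset request */
                  break;
          case SCTP_PARAM_RESET_RESPONSE:
                  /* process re-configuration response */
                  break;
          default:
                  break;
          }

          pos += SCTP_PAD4(ntohs(param->length));
  }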
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds a function, sctp_verify_reconf, to do the length check
and multi-param check for sctp stream reconf according to RFC 6525
section 3.1.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch implements the Receiver-Side Procedures for the Incoming
SSN Reset Request Parameter described in RFC 6525 section 5.2.3.
It also moves the str_list endian conversion out of
sctp_make_strreset_req, so that sctp_make_strreset_req can be used more
conveniently to process an inreq.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch implements the Receiver-Side Procedures for the Outgoing
SSN Reset Request Parameter described in RFC 6525 section 5.2.2.
Note that some checks must come after the request_seq check, as even
when those checks fail, strreset_inseq still has to be increased by 1.
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds the Stream Reset Event described in RFC 6525
section 6.1.1.
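For reference, this is the notification layout given in RFC 6525
section 6.1.1 that the event follows (sketch of the uapi structure; the flags
distinguish incoming/outgoing resets and denied/failed requests):

  struct sctp_stream_reset_event {
          __u16 strreset_type;            /* SCTP_STREAM_RESET_EVENT */
          __u16 strreset_flags;
          __u32 strreset_length;
          sctp_assoc_t strreset_assoc_id;
          __u16 strreset_stream_list[];   /* streams that were reset */
  };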
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch defines the Re-configuration Response Parameter described
in RFC 6525 section 4.4. As the optional fields are only used for the
SSN/TSN Reset Request Parameter, a separate function is used to build
that variant.
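A sketch of the wire format being defined, per RFC 6525 section 4.4 (struct
and field names here are illustrative; the optional Next TSN fields only
appear in the response to an SSN/TSN reset request, which is why a separate
builder function is used):

  struct sctp_strreset_resp {
          struct sctp_paramhdr param_hdr; /* type 16: Re-configuration Response */
          __be32 response_seq;            /* Re-configuration Response Sequence Number */
          __be32 result;                  /* Success / Denied / Error ... */
  };

  struct sctp_strreset_resptsn {
          struct sctp_paramhdr param_hdr;
          __be32 response_seq;
          __be32 result;
          __be32 senders_next_tsn;        /* optional */
          __be32 receivers_next_tsn;      /* optional */
  };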
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
|
|
If ip6_dst_lookup_tail has acquired a dst and fails the IPv4-mapped
check, release the dst before returning an error.
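Roughly, the error path changes from an early return to a jump to the
existing release label (sketch, not the exact diff):

  if (ipv6_addr_v4mapped(&fl6->saddr) &&
      !(ipv6_addr_v4mapped(&fl6->daddr) || ipv6_addr_any(&fl6->daddr))) {
          err = -EAFNOSUPPORT;
          goto out_err_release;           /* was: return -EAFNOSUPPORT; */
  }
  ...
  out_err_release:
          dst_release(*dst);
          *dst = NULL;
          return err;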
Fixes: ec5e3b0a1d41 ("ipv6: Inhibit IPv4-mapped src address on the wire.")
Signed-off-by: Willem de Bruijn <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
A nested lock depth was added to the hashbin_delete() code, but it
doesn't actually work so well and results in tons of lockdep splats.
Fix the code instead to properly drop the lock around the operation
and just keep peeking the head of the hashbin queue.
Reported-by: Dmitry Vyukov <[email protected]>
Tested-by: Dmitry Vyukov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
sk_page_frag_refill() allocates either a compound page or an order-0
page. We can use page_ref_inc(), which is slightly faster than get_page().
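The substitution is tiny (sketch): since the page in the frag is the head of
the allocation, the compound-head handling in get_page() is unnecessary:

  /* before */
  get_page(pfrag->page);

  /* after: pfrag->page is not a tail page, so a plain reference
   * count increment is enough */
  page_ref_inc(pfrag->page);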
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Prevent a Linux sender from sending out a left-shifted sequence number
in response to a peer's shrunk receive window caused by losing the
least significant bits in window scaling.
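To see where the bits go (a worked example, not the patch): with a window
scale of 7, the advertised window travels as win >> 7, so up to 127 bytes of
precision are lost, and a window that merely lost its low bits can look
shrunk to the sender:

  /* rcv_wscale = 7 */
  u32 win       = 65535;          /* window we want to advertise       */
  u32 on_wire   = win >> 7;       /* 511: value carried in the header  */
  u32 peer_view = on_wire << 7;   /* 65408: what the peer reconstructs */
  /* 127 bytes lost; comparing values of different precision can make
   * the window falsely appear to have shrunk. */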
Cc: "David S. Miller" <[email protected]>
Cc: Alexey Kuznetsov <[email protected]>
Cc: James Morris <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Patrick McHardy <[email protected]>
Signed-off-by: Cheng Cui <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the function rds_ib_xmit_atomic(), ib_ring has not been allocated
successfully, so it is not necessary to unalloc it.
Cc: Joe Jin <[email protected]>
Cc: Junxiao Bi <[email protected]>
Signed-off-by: Zhu Yanjun <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The qdisc_stab_lock is used in qdisc_get_stab and qdisc_put_stab.
These two functions are invoked from qdisc_create, qdisc_change, and
qdisc_destroy, which all run fully under RTNL. That already ensures
only one task can access qdisc_stab_list at a time, so qdisc_stab_lock
is no longer necessary.
Signed-off-by: Gao Feng <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Change module filename from af-rxrpc.ko to rxrpc.ko so as to be consistent
with the other protocol drivers.
Also adjust the documentation to reflect this.
Further, there is no longer a standalone rxkad module, as it has been
merged into the rxrpc core, so get rid of references to that.
Reported-by: Marc Dionne <[email protected]>
Signed-off-by: David Howells <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When allocating rtnl dump messages, struct ifla_port_vsi is never dumped,
so we can save header plus payload in rtnl_port_size(). In fact, the
attribute IFLA_PORT_VSI_TYPE and struct ifla_port_vsi are not used
anywhere in the kernel. We only need to keep the nla policy should
applications in user space be filling this out. The same NLA_BINARY
issue exists as was fixed
in 364d5716a7ad ("rtnetlink: ifla_vf_policy: fix misuses of NLA_BINARY")
and others, but then again IFLA_PORT_VSI_TYPE is not used anywhere, so
just add a comment that it's unused.
Signed-off-by: Daniel Borkmann <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
added_by_external_learn fdb entries are added and expired by external
entities like a switchdev driver or external controllers. Ageing is
already disabled for such entries, hence don't indicate expiry for
them.
CC: Nikolay Aleksandrov <[email protected]>
CC: Jiri Pirko <[email protected]>
CC: Ido Schimmel <[email protected]>
Signed-off-by: Roopa Prabhu <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Long standing issue with JITed programs is that stack traces from
function tracing check whether a given address is kernel code
through {__,}kernel_text_address(), which checks for code in core
kernel, modules and dynamically allocated ftrace trampolines. But
what is still missing is BPF JITed programs (interpreted programs
are not an issue as __bpf_prog_run() will be attributed to them),
thus when a stack trace is triggered, the code walking the stack
won't see any of the JITed ones. The same for address correlation
done from user space via reading /proc/kallsyms. This is read by
tools like perf, but the latter is also useful for permanent live
tracing with eBPF itself in combination with stack maps when other
eBPF types are part of the callchain. See offwaketime example on
dumping stack from a map.
This work tries to tackle that issue by making the addresses and
symbols known to the kernel. The lookup from *kernel_text_address()
is implemented through a latched RB tree that can be read under RCU in
the fast path, and that is also shared for symbol/size/offset lookup
for a specific given address in kallsyms. The slow-path iteration
through all symbols in the seq file is done via an RCU list, which
holds a tiny fraction of all exported ksyms, usually below 0.1 percent.
Function symbols are exported as bpf_prog_<tag>, in order to aid
debugging and attribution. This facility is currently enabled for
root-only when bpf_jit_kallsyms is set to 1, and disabled if hardening
is active in any mode. The rationale behind this is that still a lot
of systems ship with world read permissions on kallsyms thus addresses
should not get suddenly exposed for them. If that situation gets
much better in future, we always have the option to change the
default on this. Likewise, unprivileged programs are not allowed
to add entries there either, but that is less of a concern as most
such program types relevant in this context are root-only anyway.
If enabled, call graphs and stack traces will then show a correct
attribution; one example is illustrated below, where the trace is
now visible in tooling such as perf script --kallsyms=/proc/kallsyms
and friends.
Before:
7fff8166889d bpf_clone_redirect+0x80007f0020ed (/lib/modules/4.9.0-rc8+/build/vmlinux)
f5d80 __sendmsg_nocancel+0xffff006451f1a007 (/usr/lib64/libc-2.18.so)
After:
7fff816688b7 bpf_clone_redirect+0x80007f002107 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fffa0575728 bpf_prog_33c45a467c9e061a+0x8000600020fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fffa07ef1fc cls_bpf_classify+0x8000600020dc (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff81678b68 tc_classify+0x80007f002078 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8164d40b __netif_receive_skb_core+0x80007f0025fb (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8164d718 __netif_receive_skb+0x80007f002018 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8164e565 process_backlog+0x80007f002095 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8164dc71 net_rx_action+0x80007f002231 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff81767461 __softirqentry_text_start+0x80007f0020d1 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff817658ac do_softirq_own_stack+0x80007f00201c (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff810a2c20 do_softirq+0x80007f002050 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff810a2cb5 __local_bh_enable_ip+0x80007f002085 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8168d452 ip_finish_output2+0x80007f002152 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8168ea3d ip_finish_output+0x80007f00217d (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff8168f2af ip_output+0x80007f00203f (/lib/modules/4.9.0-rc8+/build/vmlinux)
[...]
7fff81005854 do_syscall_64+0x80007f002054 (/lib/modules/4.9.0-rc8+/build/vmlinux)
7fff817649eb return_from_SYSCALL_64+0x80007f002000 (/lib/modules/4.9.0-rc8+/build/vmlinux)
f5d80 __sendmsg_nocancel+0xffff01c484812007 (/usr/lib64/libc-2.18.so)
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Cc: [email protected]
Signed-off-by: David S. Miller <[email protected]>
|
|
All map types and prog types are registered to the BPF core through
bpf_register_map_type() and bpf_register_prog_type() during init and
remain unchanged thereafter. As by design we don't (and never will)
have any pluggable code that can register to that at any later point
in time, let's mark all the existing bpf_{map,prog}_type_list objects
in the tree as __ro_after_init, so they can be moved to a read-only
section from then onwards.
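The change is mechanical; a sketch of what one such registration object looks
like afterwards (the array map is shown as an example):

  static struct bpf_map_type_list array_type __ro_after_init = {
          .ops    = &array_ops,
          .type   = BPF_MAP_TYPE_ARRAY,
  };

  static int __init register_array_map(void)
  {
          bpf_register_map_type(&array_type);
          return 0;
  }
  late_initcall(register_array_map);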
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Alexei Starovoitov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the current DCCP implementation an skb for a DCCP_PKT_REQUEST packet
is forcibly freed via __kfree_skb in dccp_rcv_state_process if
dccp_v6_conn_request successfully returns.
However, if IPV6_RECVPKTINFO is set on a socket, the address of the skb
is saved to ireq->pktopts and the ref count for skb is incremented in
dccp_v6_conn_request, so skb is still in use. Nevertheless, it gets freed
in dccp_rcv_state_process.
Fix by calling consume_skb instead of doing goto discard and therefore
calling __kfree_skb.
Similar fixes for TCP:
fb7e2399ec17f1004c0e0ccfd17439f8759ede01 [TCP]: skb is unexpectedly freed.
0aea76d35c9651d55bbaf746e7914e5f9ae5a25d tcp: SYN packets are now
simply consumed
Signed-off-by: Andrey Konovalov <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fixes: efa5356b0d97 ("bridge: per vlan dst_metadata netlink support")
Signed-off-by: Roopa Prabhu <[email protected]>
Reviewed-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
There are two problems with the function tipc_sk_reinit. Firstly
it's doing a manual walk over an rhashtable. This is broken as
an rhashtable can be resized and if you manually walk over it
during a resize then you may miss entries.
Secondly it's missing memory barriers as previously the code used
spinlocks which provide the barriers implicitly.
This patch fixes both problems.
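A simplified sketch of the safe pattern for the walk (the real function also
takes the per-socket lock while updating each entry):

  struct rhashtable_iter iter;
  struct tipc_sock *tsk;

  rhashtable_walk_enter(&tn->sk_rht, &iter);
  do {
          rhashtable_walk_start(&iter);

          while ((tsk = rhashtable_walk_next(&iter)) && !IS_ERR(tsk)) {
                  /* re-initialize this socket's reference */
          }

          rhashtable_walk_stop(&iter);
  } while (tsk == ERR_PTR(-EAGAIN));      /* table resized, walk again */
  rhashtable_walk_exit(&iter);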
Fixes: 07f6c4bc048a ("tipc: convert tipc reference table to...")
Signed-off-by: Herbert Xu <[email protected]>
Acked-by: Ying Xue <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
BPF classifier support for the "in hw" offloading flags.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Amir Vadai <[email protected]>
Acked-by: Jakub Kicinski <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
U32 support for the "in hw" offloading flags.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Amir Vadai <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Matchall support for the "in hw" offloading flags.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Amir Vadai <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Flower support for the "in hw" offloading flags.
Signed-off-by: Or Gerlitz <[email protected]>
Reviewed-by: Amir Vadai <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The classifier flags are not dumped to user space; do that.
Signed-off-by: Or Gerlitz <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Acked-by: Yotam Gigi <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Dump the classifier flags only if non-zero, and make sure to check
the return status of the handler that puts them into the netlink msg.
Signed-off-by: Or Gerlitz <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 6664498280cf ("packet: call fanout_release, while UNREGISTERING a
netdev"), unfortunately, introduced the following issues.
1. calling mutex_lock(&fanout_mutex) (fanout_release()) from inside an
RCU read-side critical section. rcu_read_lock most often disables
preemption, which prohibits calling sleeping functions.
[ ] include/linux/rcupdate.h:560 Illegal context switch in RCU read-side critical section!
[ ]
[ ] rcu_scheduler_active = 1, debug_locks = 0
[ ] 4 locks held by ovs-vswitchd/1969:
[ ] #0: (cb_lock){++++++}, at: [<ffffffff8158a6c9>] genl_rcv+0x19/0x40
[ ] #1: (ovs_mutex){+.+.+.}, at: [<ffffffffa04878ca>] ovs_vport_cmd_del+0x4a/0x100 [openvswitch]
[ ] #2: (rtnl_mutex){+.+.+.}, at: [<ffffffff81564157>] rtnl_lock+0x17/0x20
[ ] #3: (rcu_read_lock){......}, at: [<ffffffff81614165>] packet_notifier+0x5/0x3f0
[ ]
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff810c9077>] lockdep_rcu_suspicious+0x107/0x110
[ ] [<ffffffff810a2da7>] ___might_sleep+0x57/0x210
[ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
[ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
[ ] [<ffffffff810de93f>] ? vprintk_default+0x1f/0x30
[ ] [<ffffffff81186e88>] ? printk+0x4d/0x4f
[ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
2. calling mutex_lock(&fanout_mutex) inside spin_lock(&po->bind_lock).
"sleeping function called from invalid context"
[ ] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
[ ] in_atomic(): 1, irqs_disabled(): 0, pid: 1969, name: ovs-vswitchd
[ ] INFO: lockdep is turned off.
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff810a2f52>] ___might_sleep+0x202/0x210
[ ] [<ffffffff810a2fd0>] __might_sleep+0x70/0x90
[ ] [<ffffffff8162e80c>] mutex_lock_nested+0x3c/0x3a0
[ ] [<ffffffff816106dd>] fanout_release+0x1d/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
3. calling dev_remove_pack(&fanout->prot_hook) from inside
spin_lock(&po->bind_lock) or an RCU read-side critical section.
dev_remove_pack() -> synchronize_net(), which might sleep.
[ ] BUG: scheduling while atomic: ovs-vswitchd/1969/0x00000002
[ ] INFO: lockdep is turned off.
[ ] Call Trace:
[ ] [<ffffffff813770c1>] dump_stack+0x85/0xc4
[ ] [<ffffffff81186274>] __schedule_bug+0x64/0x73
[ ] [<ffffffff8162b8cb>] __schedule+0x6b/0xd10
[ ] [<ffffffff8162c5db>] schedule+0x6b/0x80
[ ] [<ffffffff81630b1d>] schedule_timeout+0x38d/0x410
[ ] [<ffffffff810ea3fd>] synchronize_sched_expedited+0x53d/0x810
[ ] [<ffffffff810ea6de>] synchronize_rcu_expedited+0xe/0x10
[ ] [<ffffffff8154eab5>] synchronize_net+0x35/0x50
[ ] [<ffffffff8154eae3>] dev_remove_pack+0x13/0x20
[ ] [<ffffffff8161077e>] fanout_release+0xbe/0xe0
[ ] [<ffffffff81614459>] packet_notifier+0x2f9/0x3f0
4. fanout_release() races with calls from a different CPU.
To fix the above problems, remove the call to fanout_release() under
rcu_read_lock(). Instead, call __dev_remove_pack(&fanout->prot_hook), so
netdev_run_todo() will be happy that the &dev->ptype_specific list is
empty. In order to achieve this, I moved dev_{add,remove}_pack() out of
fanout_{add,release} into __fanout_{link,unlink}. So, a call to
{,__}unregister_prot_hook() will make sure fanout->prot_hook is removed
as well.
Fixes: 6664498280cf ("packet: call fanout_release, while UNREGISTERING a netdev")
Reported-by: Eric Dumazet <[email protected]>
Signed-off-by: Anoob Soman <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2017-02-16
1) Make struct xfrm_input_afinfo const, nothing writes to it.
From Florian Westphal.
2) Remove all places that write to the afinfo policy backend
and make the struct const then.
From Florian Westphal.
3) Prepare for packet consuming gro callbacks and add
ESP GRO handlers. ESP packets can be decapsulated
at the GRO layer then. It saves a round through
the stack for each ESP packet.
Please note that this has a merge conflict between commit
63fca65d0863 ("net: add confirm_neigh method to dst_ops")
from net-next and
3d7d25a68ea5 ("xfrm: policy: remove garbage_collect callback")
a2817d8b279b ("xfrm: policy: remove family field")
from ipsec-next.
The conflict can be solved as it is done in linux-next.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
|
|
This reverts commits:
6a25478077d987edc5e2f880590a2bc5fcab4441
9dbbfb0ab6680c6a85609041011484e6658e7d3c
40137906c5f55c252194ef5834130383e639536f
It's too risky to put in this late in the release
cycle. We'll put these changes into the next merge
window instead.
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit 91572088e3fd ("net: use core MTU range checking in core net
infra") changed the openvswitch internal device to use the core net
infra for controlling the MTU range, but failed to actually set the
max_mtu as described in the commit message, which now defaults to
ETH_DATA_LEN.
This patch fixes this by setting max_mtu to ETH_MAX_MTU after the
ether_setup() call.
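A sketch of where the one-liner lands in the internal device setup (assuming
the setup callback shown, do_setup(), as in vport-internal_dev.c):

  static void do_setup(struct net_device *netdev)
  {
          ether_setup(netdev);

          /* ether_setup() leaves max_mtu at ETH_DATA_LEN; restore the
           * ability to configure large MTUs. */
          netdev->max_mtu = ETH_MAX_MTU;

          /* ... rest of the internal device setup ... */
  }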
Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
Signed-off-by: Jarno Rajahalme <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
When setting a neigh related sysctl parameter, we always send a
NETEVENT_DELAY_PROBE_TIME_UPDATE netevent. For instance, when
executing
sysctl net.ipv6.neigh.wlp3s0.retrans_time_ms=2000
a NETEVENT_DELAY_PROBE_TIME_UPDATE netevent is generated.
This is caused by commit 2a4501ae18b5 ("neigh: Send a
notification when DELAY_PROBE_TIME changes"). According to the
commit's description, it was intended to generate such an event
when setting the "delay_first_probe_time" sysctl parameter.
In order to fix this, only generate this event when actually
setting the "delay_first_probe_time" sysctl parameter. This fix
should not have any unintended side-effects, because all but one
registered netevent callbacks check for other netevent event
types (the registered callbacks were obtained by grepping for
"register_netevent_notifier"). The only callback that uses the
NETEVENT_DELAY_PROBE_TIME_UPDATE event is
mlxsw_sp_router_netevent_event() (in
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c): in case
of this event, it only accesses the DELAY_PROBE_TIME of the
passed neigh_parms.
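Conceptually the change is the following (hedged sketch; the helper below is
purely illustrative, the real code routes this through the neigh proc
handlers):

  /* Only fire the netevent for the parameter it was meant for. */
  static void neigh_sysctl_notify_sketch(struct neigh_parms *p, int index,
                                         bool write)
  {
          if (write && index == NEIGH_VAR_DELAY_PROBE_TIME)
                  call_netevent_notifiers(NETEVENT_DELAY_PROBE_TIME_UPDATE, p);
  }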
Fixes: 2a4501ae18b5 ("neigh: Send a notification when DELAY_PROBE_TIME changes")
Signed-off-by: Marcus Huewe <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds GRO infrastructure and callbacks for ESP on
ipv4 and ipv6.
In case the GRO layer detects an ESP packet, the
esp{4,6}_gro_receive() function does a xfrm state lookup
and calls the xfrm input layer if it finds a matching state.
The packet will be decapsulated and reinjected into layer 2.
Signed-off-by: Steffen Klassert <[email protected]>
|
|
We need to keep per-packet offloading information across
the layers. So we extend the sec_path to carry these for
the input and output offload codepath.
Signed-off-by: Steffen Klassert <[email protected]>
|
|
We need it in the ESP offload handlers, so export it.
Signed-off-by: Steffen Klassert <[email protected]>
|
|
The upcoming IPsec ESP gro callbacks will consume the skb,
so prepare for that.
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Add a skb_gro_flush_final helper to prepare for consuming skbs in
call_gro_receive. In a followup patch we will extend this helper to not
touch the skb if the skb is consumed by a gro callback. We need this to
handle the upcoming IPsec ESP callbacks, as they reinject the skb into
napi_gro_receive asynchronously. The helper is used in all gro_receive
functions that can call the ESP gro handlers.
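A sketch of the helper's initial shape (assuming this definition; the
followup mentioned above will extend it):

  static inline void skb_gro_flush_final(struct sk_buff *skb,
                                         struct sk_buff **pp, int flush)
  {
          /* For now, just fold the flush decision into the GRO cb. */
          NAPI_GRO_CB(skb)->flush |= (flush != 0);
  }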
Signed-off-by: Steffen Klassert <[email protected]>
|
|
Add a new helper to set the secpath on an skb.
This avoids code duplication, as it is needed
in multiple places.
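The helper is essentially the following (sketch, assuming the existing
secpath_dup()/secpath_put() refcounting):

  struct sec_path *secpath_set(struct sk_buff *skb)
  {
          struct sec_path *sp;

          sp = secpath_dup(skb->sp);
          if (!sp)
                  return NULL;

          if (skb->sp)
                  secpath_put(skb->sp);
          skb->sp = sp;

          return sp;
  }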
Signed-off-by: Steffen Klassert <[email protected]>
|
|
tcp_rcv_established() can now run in process context.
We need to disable BH while acquiring the tcp probe spinlock,
or risk a deadlock.
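In terms of the lock calls, that means (sketch):

  /* before: fine while only ever called from softirq context */
  spin_lock(&tcp_probe.lock);
  ...
  spin_unlock(&tcp_probe.lock);

  /* after: tcp_rcv_established() may also run in process context,
   * so block BH to avoid a softirq-vs-process deadlock on the lock */
  spin_lock_bh(&tcp_probe.lock);
  ...
  spin_unlock_bh(&tcp_probe.lock);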
Fixes: 5413d1babe8f ("net: do not block BH while processing socket backlog")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Ricardo Nabinger Sanchez <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Multiple threads can call fanout_add() at the same time.
We need to grab fanout_mutex earlier to avoid races that could
lead to one thread freeing po->rollover that was set by another thread.
Do the same in fanout_release(), for peace of mind, and to help us
find lockdep issues earlier.
Fixes: dc99f600698d ("packet: Add fanout support.")
Fixes: 0648ab70afe6 ("packet: rollover prepare: per-socket state")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Fixes the following sparse warning:
net/sched/act_api.c:532:5: warning:
symbol 'nla_memdup_cookie' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In commit 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
I tried to avoid skb allocation for the 0-length case, but missed
a NULL pointer check in the non-EOR case.
Fixes: 98e3862ca2b1 ("kcm: fix 0-length case for kcm_sendmsg()")
Reported-by: Dmitry Vyukov <[email protected]>
Cc: Tom Herbert <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Acked-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We can simplify the logic for entries pointing to the bridge by
converging the fdb_delete_by functions. This allows us to use the same
function for both cases, since the fdb's dst is set to NULL if it is
pointing to the bridge, and thus we can always check for a port match.
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In order to avoid new errors, add checks to the br_fdb_find and
fdb_find_rcu functions: the first requires hash_lock, the second
obviously RCU.
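The checks are the usual lockdep/RCU assertions; a sketch of what is meant:

  /* br_fdb_find(): callers must hold the bridge hash_lock */
  lockdep_assert_held(&br->hash_lock);

  /* fdb_find_rcu(): callers must be in an RCU read-side section */
  WARN_ON_ONCE(!rcu_read_lock_held());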
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Before this patch we had 3 different fdb searching functions, which
was confusing. This patch reduces them all to one - fdb_find_rcu() -
with two flavors: br_fdb_find(), which requires hash_lock, and
br_fdb_find_rcu(), which requires RCU. This makes it clear what needs
to be used; we also remove two abusers of __br_fdb_get which called it
under hash_lock, and replace them with br_fdb_find().
Signed-off-by: Nikolay Aleksandrov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
This patch adds a check on the type of the source address for the case
where the destination address is in6addr_any. If the source is an
IPv4-mapped IPv6 source address, the destination is changed to
::ffff:127.0.0.1, and otherwise the destination is changed to ::1. This
is done in three locations to handle UDP calls to either connect() or
sendmsg() and TCP calls to connect(). Note that udpv6_sendmsg() delays
handling an in6addr_any destination until very late, so the patch only
needs to handle the case where the source is an IPv4-mapped IPv6
address.
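A hedged sketch of the substitution applied at those call sites (the helper
name is invented for illustration):

  /* Pick a loopback destination that matches the source address family
   * when the destination is in6addr_any. */
  static void ipv6_fixup_any_daddr(struct in6_addr *daddr,
                                   const struct in6_addr *saddr)
  {
          if (!ipv6_addr_any(daddr))
                  return;

          if (ipv6_addr_v4mapped(saddr))
                  /* IPv4-mapped source -> ::ffff:127.0.0.1 */
                  ipv6_addr_set_v4mapped(htonl(INADDR_LOOPBACK), daddr);
          else
                  /* any other source -> ::1 */
                  *daddr = in6addr_loopback;
  }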
Signed-off-by: Jonathan T. Leighton <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|