Age | Commit message (Collapse) | Author | Files | Lines |
|
assigning a dummy value of 'clock_id' to avoid cancellation of the cycle
timer before its initialization was a temporary solution, and we still
need to handle the case where act_gate timer parameters are changed by
commands like the following one:
# tc action replace action gate <parameters>
the fix consists in the following items:
1) remove the workaround assignment of 'clock_id', and init the list of
entries before the first error path after IDR atomic check/allocation
2) validate 'clock_id' earlier: there is no need to do IDR atomic
check/allocation if we know that 'clock_id' is a bad value
3) use a dedicated function, 'gate_setup_timer()', to ensure that the
timer is cancelled and re-initialized on action overwrite, and also
ensure we initialize the timer in the error path of tcf_gate_init()
v3: improve comment in the error path of tcf_gate_init() (thanks to
Vladimir Oltean)
v2: avoid 'goto' in gate_setup_timer (thanks to Cong Wang)
CC: Ivan Vecera <[email protected]>
Fixes: a01c245438c5 ("net/sched: fix a couple of splats in the error path of tfc_gate_init()")
Fixes: a51c328df310 ("net: qos: introduce a gate control flow action")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Vladimir Oltean <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
it is possible to see a KASAN use-after-free, immediately followed by a
NULL dereference crash, with the following command:
# tc action add action gate index 3 cycle-time 100000000ns \
> cycle-time-ext 100000000ns clockid CLOCK_TAI
BUG: KASAN: use-after-free in tcf_action_init_1+0x8eb/0x960
Write of size 1 at addr ffff88810a5908bc by task tc/883
CPU: 0 PID: 883 Comm: tc Not tainted 5.7.0+ #188
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
Call Trace:
dump_stack+0x75/0xa0
print_address_description.constprop.6+0x1a/0x220
kasan_report.cold.9+0x37/0x7c
tcf_action_init_1+0x8eb/0x960
tcf_action_init+0x157/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
[...]
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 0 PID: 883 Comm: tc Tainted: G B 5.7.0+ #188
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:tcf_action_fill_size+0xa3/0xf0
[....]
RSP: 0018:ffff88813a48f250 EFLAGS: 00010212
RAX: dffffc0000000000 RBX: 0000000000000094 RCX: ffffffffa47c3eb6
RDX: 000000000000000e RSI: 0000000000000008 RDI: 0000000000000070
RBP: ffff88810a590800 R08: 0000000000000004 R09: ffffed1027491e03
R10: 0000000000000003 R11: ffffed1027491e03 R12: 0000000000000000
R13: 0000000000000000 R14: dffffc0000000000 R15: ffff88810a590800
FS: 00007f62cae8ce40(0000) GS:ffff888147c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f62c9d20a10 CR3: 000000013a52a000 CR4: 0000000000340ef0
Call Trace:
tcf_action_init+0x172/0x2a0
tcf_action_add+0xd9/0x2f0
tc_ctl_action+0x2a3/0x39d
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x120/0x380
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
this is caused by the test on 'cycletime_ext', that is still unassigned
when the action is newly created. This makes the action .init() return 0
without calling tcf_idr_insert(), hence the UAF + crash.
rework the logic that prevents zero values of cycle-time, as follows:
1) 'tcfg_cycletime_ext' seems to be unused in the action software path,
and it was already possible by other means to obtain non-zero
cycletime and zero cycletime-ext. So, removing that test should not
cause any damage.
2) while at it, we must prevent overwriting configuration data with wrong
ones: use a temporary variable for 'tcfg_cycletime', and validate it
preserving the original semantic (that allowed computing the cycle
time as the sum of all intervals, when not specified by
TCA_GATE_CYCLE_TIME).
3) remove the test on 'tcfg_cycletime', no more useful, and avoid
returning -EFAULT, which did not seem an appropriate return value for
a wrong netlink attribute.
v3: fix uninitialized 'cycletime' (thanks to Vladimir Oltean)
v2: remove useless 'return;' at the end of void gate_get_start_time()
Fixes: a51c328df310 ("net: qos: introduce a gate control flow action")
CC: Ivan Vecera <[email protected]>
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Vladimir Oltean <[email protected]>
Tested-by: Vladimir Oltean <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the datapath, the ip_tunnel_lookup() is used and it internally uses
fallback tunnel device pointer, which is fb_tunnel_dev.
This pointer variable should be set to NULL when a fb interface is deleted.
But there is no routine to set fb_tunnel_dev pointer to NULL.
So, this pointer will be still used after interface is deleted and
it eventually results in the use-after-free problem.
Test commands:
ip netns add A
ip netns add B
ip link add eth0 type veth peer name eth1
ip link set eth0 netns A
ip link set eth1 netns B
ip netns exec A ip link set lo up
ip netns exec A ip link set eth0 up
ip netns exec A ip link add gre1 type gre local 10.0.0.1 \
remote 10.0.0.2
ip netns exec A ip link set gre1 up
ip netns exec A ip a a 10.0.100.1/24 dev gre1
ip netns exec A ip a a 10.0.0.1/24 dev eth0
ip netns exec B ip link set lo up
ip netns exec B ip link set eth1 up
ip netns exec B ip link add gre1 type gre local 10.0.0.2 \
remote 10.0.0.1
ip netns exec B ip link set gre1 up
ip netns exec B ip a a 10.0.100.2/24 dev gre1
ip netns exec B ip a a 10.0.0.2/24 dev eth1
ip netns exec A hping3 10.0.100.2 -2 --flood -d 60000 &
ip netns del B
Splat looks like:
[ 77.793450][ C3] ==================================================================
[ 77.794702][ C3] BUG: KASAN: use-after-free in ip_tunnel_lookup+0xcc4/0xf30
[ 77.795573][ C3] Read of size 4 at addr ffff888060bd9c84 by task hping3/2905
[ 77.796398][ C3]
[ 77.796664][ C3] CPU: 3 PID: 2905 Comm: hping3 Not tainted 5.8.0-rc1+ #616
[ 77.797474][ C3] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 77.798453][ C3] Call Trace:
[ 77.798815][ C3] <IRQ>
[ 77.799142][ C3] dump_stack+0x9d/0xdb
[ 77.799605][ C3] print_address_description.constprop.7+0x2cc/0x450
[ 77.800365][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.800908][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.801517][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.802145][ C3] kasan_report+0x154/0x190
[ 77.802821][ C3] ? ip_tunnel_lookup+0xcc4/0xf30
[ 77.803503][ C3] ip_tunnel_lookup+0xcc4/0xf30
[ 77.804165][ C3] __ipgre_rcv+0x1ab/0xaa0 [ip_gre]
[ 77.804862][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 77.805621][ C3] gre_rcv+0x304/0x1910 [ip_gre]
[ 77.806293][ C3] ? lock_acquire+0x1a9/0x870
[ 77.806925][ C3] ? gre_rcv+0xfe/0x354 [gre]
[ 77.807559][ C3] ? erspan_xmit+0x2e60/0x2e60 [ip_gre]
[ 77.808305][ C3] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 77.809032][ C3] ? rcu_read_lock_held+0x90/0xa0
[ 77.809713][ C3] gre_rcv+0x1b8/0x354 [gre]
[ ... ]
Suggested-by: Eric Dumazet <[email protected]>
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
In the datapath, the ip6gre_tunnel_lookup() is used and it internally uses
fallback tunnel device pointer, which is fb_tunnel_dev.
This pointer variable should be set to NULL when a fb interface is deleted.
But there is no routine to set fb_tunnel_dev pointer to NULL.
So, this pointer will be still used after interface is deleted and
it eventually results in the use-after-free problem.
Test commands:
ip netns add A
ip netns add B
ip link add eth0 type veth peer name eth1
ip link set eth0 netns A
ip link set eth1 netns B
ip netns exec A ip link set lo up
ip netns exec A ip link set eth0 up
ip netns exec A ip link add ip6gre1 type ip6gre local fc:0::1 \
remote fc:0::2
ip netns exec A ip -6 a a fc:100::1/64 dev ip6gre1
ip netns exec A ip link set ip6gre1 up
ip netns exec A ip -6 a a fc:0::1/64 dev eth0
ip netns exec A ip link set ip6gre0 up
ip netns exec B ip link set lo up
ip netns exec B ip link set eth1 up
ip netns exec B ip link add ip6gre1 type ip6gre local fc:0::2 \
remote fc:0::1
ip netns exec B ip -6 a a fc:100::2/64 dev ip6gre1
ip netns exec B ip link set ip6gre1 up
ip netns exec B ip -6 a a fc:0::2/64 dev eth1
ip netns exec B ip link set ip6gre0 up
ip netns exec A ping fc:100::2 -s 60000 &
ip netns del B
Splat looks like:
[ 73.087285][ C1] BUG: KASAN: use-after-free in ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.088361][ C1] Read of size 4 at addr ffff888040559218 by task ping/1429
[ 73.089317][ C1]
[ 73.089638][ C1] CPU: 1 PID: 1429 Comm: ping Not tainted 5.7.0+ #602
[ 73.090531][ C1] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 73.091725][ C1] Call Trace:
[ 73.092160][ C1] <IRQ>
[ 73.092556][ C1] dump_stack+0x96/0xdb
[ 73.093122][ C1] print_address_description.constprop.6+0x2cc/0x450
[ 73.094016][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.094894][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.095767][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.096619][ C1] kasan_report+0x154/0x190
[ 73.097209][ C1] ? ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.097989][ C1] ip6gre_tunnel_lookup+0x1064/0x13f0 [ip6_gre]
[ 73.098750][ C1] ? gre_del_protocol+0x60/0x60 [gre]
[ 73.099500][ C1] gre_rcv+0x1c5/0x1450 [ip6_gre]
[ 73.100199][ C1] ? ip6gre_header+0xf00/0xf00 [ip6_gre]
[ 73.100985][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 73.101830][ C1] ? ip6_input_finish+0x5/0xf0
[ 73.102483][ C1] ip6_protocol_deliver_rcu+0xcbb/0x1510
[ 73.103296][ C1] ip6_input_finish+0x5b/0xf0
[ 73.103920][ C1] ip6_input+0xcd/0x2c0
[ 73.104473][ C1] ? ip6_input_finish+0xf0/0xf0
[ 73.105115][ C1] ? rcu_read_lock_held+0x90/0xa0
[ 73.105783][ C1] ? rcu_read_lock_sched_held+0xc0/0xc0
[ 73.106548][ C1] ipv6_rcv+0x1f1/0x300
[ ... ]
Suggested-by: Eric Dumazet <[email protected]>
Fixes: c12b395a4664 ("gre: Support GRE over IPv6")
Signed-off-by: Taehee Yoo <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
I got a memleak report when doing some fuzz test:
unreferenced object 0xffff888112584000 (size 13599):
comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
hex dump (first 32 bytes):
74 61 70 30 00 00 00 00 00 00 00 00 00 00 00 00 tap0............
00 ee d9 19 81 88 ff ff 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000002f60ba65>] __kmalloc_node+0x309/0x3a0
[<0000000075b211ec>] kvmalloc_node+0x7f/0xc0
[<00000000d3a97396>] alloc_netdev_mqs+0x76/0xfc0
[<00000000609c3655>] __tun_chr_ioctl+0x1456/0x3d70
[<000000001127ca24>] ksys_ioctl+0xe5/0x130
[<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
[<00000000e1023498>] do_syscall_64+0x56/0xa0
[<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff888111845cc0 (size 8):
comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
hex dump (first 8 bytes):
74 61 70 30 00 88 ff ff tap0....
backtrace:
[<000000004c159777>] kstrdup+0x35/0x70
[<00000000d8b496ad>] kstrdup_const+0x3d/0x50
[<00000000494e884a>] kvasprintf_const+0xf1/0x180
[<0000000097880a2b>] kobject_set_name_vargs+0x56/0x140
[<000000008fbdfc7b>] dev_set_name+0xab/0xe0
[<000000005b99e3b4>] netdev_register_kobject+0xc0/0x390
[<00000000602704fe>] register_netdevice+0xb61/0x1250
[<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
[<000000001127ca24>] ksys_ioctl+0xe5/0x130
[<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
[<00000000e1023498>] do_syscall_64+0x56/0xa0
[<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
unreferenced object 0xffff88811886d800 (size 512):
comm "ip", pid 3048, jiffies 4294911734 (age 343.491s)
hex dump (first 32 bytes):
00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N..........
ff ff ff ff ff ff ff ff c0 66 3d a3 ff ff ff ff .........f=.....
backtrace:
[<0000000050315800>] device_add+0x61e/0x1950
[<0000000021008dfb>] netdev_register_kobject+0x17e/0x390
[<00000000602704fe>] register_netdevice+0xb61/0x1250
[<000000002b7ca244>] __tun_chr_ioctl+0x1cd1/0x3d70
[<000000001127ca24>] ksys_ioctl+0xe5/0x130
[<00000000b7d5e66a>] __x64_sys_ioctl+0x6f/0xb0
[<00000000e1023498>] do_syscall_64+0x56/0xa0
[<000000009ec0eb12>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
If call_netdevice_notifiers() failed, then rollback_registered()
calls netdev_unregister_kobject() which holds the kobject. The
reference cannot be put because the netdev won't be add to todo
list, so it will leads a memleak, we need put the reference to
avoid memleak.
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Yang Yingliang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-17
The following pull-request contains BPF updates for your *net* tree.
We've added 10 non-merge commits during the last 2 day(s) which contain
a total of 14 files changed, 158 insertions(+), 59 deletions(-).
The main changes are:
1) Important fix for bpf_probe_read_kernel_str() return value, from Andrii.
2) [gs]etsockopt fix for large optlen, from Stanislav.
3) devmap allocation fix, from Toke.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
In commit 34cc0b338a61 we only handled the frame_sz in convert_to_xdp_frame().
This patch will also handle frame_sz in xdp_convert_zc_to_xdp_frame().
Fixes: 34cc0b338a61 ("xdp: Xdp_frame add member frame_sz and handle in convert_to_xdp_frame")
Signed-off-by: Hangbin Liu <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Jesper Dangaard Brouer <[email protected]>
Acked-by: John Fastabend <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
Back in 2013, we made a change that broke fast retransmit
for non SACK flows.
Indeed, for these flows, a sender needs to receive three duplicate
ACK before starting fast retransmit. Sending ACK with different
receive window do not count.
Even if enabling SACK is strongly recommended these days,
there still are some cases where it has to be disabled.
Not increasing the window seems better than having to
rely on RTO.
After the fix, following packetdrill test gives :
// Initialize connection
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
+0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7>
+0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 8>
+0 < . 1:1(0) ack 1 win 514
+0 accept(3, ..., ...) = 4
+0 < . 1:1001(1000) ack 1 win 514
// Quick ack
+0 > . 1:1(0) ack 1001 win 264
+0 < . 2001:3001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 3001:4001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 4001:5001(1000) ack 1 win 514
// DUPACK : Normally we should not change the window
+0 > . 1:1(0) ack 1001 win 264
+0 < . 1001:2001(1000) ack 1 win 514
// Hole is repaired.
+0 > . 1:1(0) ack 5001 win 272
Fixes: 4e4f1fc22681 ("tcp: properly increase rcv_ssthresh for ofo packets")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Venkat Venkatsubra <[email protected]>
Acked-by: Neal Cardwell <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
socket malloced by sock_create_kern() should be release before return
in the error handling, otherwise it cause memory leak.
unreferenced object 0xffff88810910c000 (size 1216):
comm "00000003_test_m", pid 12238, jiffies 4295050289 (age 54.237s)
hex dump (first 32 bytes):
01 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 00 00 2f 30 0a 81 88 ff ff ........./0.....
backtrace:
[<00000000e877f89f>] sock_alloc_inode+0x18/0x1c0
[<0000000093d1dd51>] alloc_inode+0x63/0x1d0
[<000000005673fec6>] new_inode_pseudo+0x14/0xe0
[<00000000b5db6be8>] sock_alloc+0x3c/0x260
[<00000000e7e3cbb2>] __sock_create+0x89/0x620
[<0000000023e48593>] mptcp_subflow_create_socket+0xc0/0x5e0
[<00000000419795e4>] __mptcp_socket_create+0x1ad/0x3f0
[<00000000b2f942e8>] mptcp_stream_connect+0x281/0x4f0
[<00000000c80cd5cc>] __sys_connect_file+0x14d/0x190
[<00000000dc761f11>] __sys_connect+0x128/0x160
[<000000008b14e764>] __x64_sys_connect+0x6f/0xb0
[<000000007b4f93bd>] do_syscall_64+0xa1/0x530
[<00000000d3e770b6>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 2303f994b3e1 ("mptcp: Associate MPTCP context with TCP socket")
Signed-off-by: Wei Yongjun <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently, nf_flow_table_offload_add/del_cb are exported by nf_flow_table
module, therefore modules using them will have hard-dependency
on nf_flow_table and will require loading it all the time.
This can lead to an unnecessary overhead on systems that do not
use this API.
To relax the hard-dependency between the modules, we unexport these
functions and make them static inline.
Fixes: 978703f42549 ("netfilter: flowtable: Add API for registering to flow table events")
Signed-off-by: Alaa Hleihel <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Currently, tcf_ct_flow_table_restore_skb is exported by act_ct
module, therefore modules using it will have hard-dependency
on act_ct and will require loading it all the time.
This can lead to an unnecessary overhead on systems that do not
use hardware connection tracking action (ct_metadata action) in
the first place.
To relax the hard-dependency between the modules, we unexport this
function and make it a static inline one.
Fixes: 30b0cf90c6dd ("net/sched: act_ct: Support restoring conntrack info on skbs")
Signed-off-by: Alaa Hleihel <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Commit a84d01647989 ("mld: fix memory leak in mld_del_delrec()") fixed
the memory leak of MLD, but missing the ipv6_mc_destroy_dev() path, in
which mca_sources are leaked after ma_put().
Using ip6_mc_clear_src() to take care of the missing free.
BUG: memory leak
unreferenced object 0xffff8881113d3180 (size 64):
comm "syz-executor071", pid 389, jiffies 4294887985 (age 17.943s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 ff 02 00 00 00 00 00 00 ................
00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 ................
backtrace:
[<000000002cbc483c>] kmalloc include/linux/slab.h:555 [inline]
[<000000002cbc483c>] kzalloc include/linux/slab.h:669 [inline]
[<000000002cbc483c>] ip6_mc_add1_src net/ipv6/mcast.c:2237 [inline]
[<000000002cbc483c>] ip6_mc_add_src+0x7f5/0xbb0 net/ipv6/mcast.c:2357
[<0000000058b8b1ff>] ip6_mc_source+0xe0c/0x1530 net/ipv6/mcast.c:449
[<000000000bfc4fb5>] do_ipv6_setsockopt.isra.12+0x1b2c/0x3b30 net/ipv6/ipv6_sockglue.c:754
[<00000000e4e7a722>] ipv6_setsockopt+0xda/0x150 net/ipv6/ipv6_sockglue.c:950
[<0000000029260d9a>] rawv6_setsockopt+0x45/0x100 net/ipv6/raw.c:1081
[<000000005c1b46f9>] __sys_setsockopt+0x131/0x210 net/socket.c:2132
[<000000008491f7db>] __do_sys_setsockopt net/socket.c:2148 [inline]
[<000000008491f7db>] __se_sys_setsockopt net/socket.c:2145 [inline]
[<000000008491f7db>] __x64_sys_setsockopt+0xba/0x150 net/socket.c:2145
[<00000000c7bc11c5>] do_syscall_64+0xa1/0x530 arch/x86/entry/common.c:295
[<000000005fb7a3f3>] entry_SYSCALL_64_after_hwframe+0x49/0xb3
Fixes: 1666d49e1d41 ("mld: do not remove mld souce list info when set link down")
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Wang Hai <[email protected]>
Acked-by: Hangbin Liu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) Fix bogus EEXIST on element insertions to the rbtree with timeouts,
from Stefano Brivio.
2) Preempt BUG splat in the pipapo element insertion path, also from
Stefano.
3) Release filter from the ctnetlink error path.
4) Release flowtable hooks from the deletion path.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Use list_first_entry_or_null to simplify the code.
Signed-off-by: Geliang Tang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
We have defined MPTCP_PM_ADDR_MAX in pm_netlink.c, so drop this duplicate macro.
Fixes: 1b1c7a0ef7f3 ("mptcp: Add path manager interface")
Signed-off-by: Geliang Tang <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
The parent field of a struct device may be NULL. The macro
ibdev_to_node() should check for that.
Signed-off-by: Ka-Cheong Poon <[email protected]>
Acked-by: Santosh Shilimkar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pull networking fixes from David Miller:
1) Fix cfg80211 deadlock, from Johannes Berg.
2) RXRPC fails to send norigications, from David Howells.
3) MPTCP RM_ADDR parsing has an off by one pointer error, fix from
Geliang Tang.
4) Fix crash when using MSG_PEEK with sockmap, from Anny Hu.
5) The ucc_geth driver needs __netdev_watchdog_up exported, from
Valentin Longchamp.
6) Fix hashtable memory leak in dccp, from Wang Hai.
7) Fix how nexthops are marked as FDB nexthops, from David Ahern.
8) Fix mptcp races between shutdown and recvmsg, from Paolo Abeni.
9) Fix crashes in tipc_disc_rcv(), from Tuong Lien.
10) Fix link speed reporting in iavf driver, from Brett Creeley.
11) When a channel is used for XSK and then reused again later for XSK,
we forget to clear out the relevant data structures in mlx5 which
causes all kinds of problems. Fix from Maxim Mikityanskiy.
12) Fix memory leak in genetlink, from Cong Wang.
13) Disallow sockmap attachments to UDP sockets, it simply won't work.
From Lorenz Bauer.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits)
net: ethernet: ti: ale: fix allmulti for nu type ale
net: ethernet: ti: am65-cpsw-nuss: fix ale parameters init
net: atm: Remove the error message according to the atomic context
bpf: Undo internal BPF_PROBE_MEM in BPF insns dump
libbpf: Support pre-initializing .bss global variables
tools/bpftool: Fix skeleton codegen
bpf: Fix memlock accounting for sock_hash
bpf: sockmap: Don't attach programs to UDP sockets
bpf: tcp: Recv() should return 0 when the peer socket is closed
ibmvnic: Flush existing work items before device removal
genetlink: clean up family attributes allocations
net: ipa: header pad field only valid for AP->modem endpoint
net: ipa: program upper nibbles of sequencer type
net: ipa: fix modem LAN RX endpoint id
net: ipa: program metadata mask differently
ionic: add pcie_print_link_status
rxrpc: Fix race between incoming ACK parser and retransmitter
net/mlx5: E-Switch, Fix some error pointer dereferences
net/mlx5: Don't fail driver on failure to create debugfs
net/mlx5e: CT: Fix ipv6 nat header rewrite actions
...
|
|
Alexei Starovoitov says:
====================
pull-request: bpf 2020-06-12
The following pull-request contains BPF updates for your *net* tree.
We've added 26 non-merge commits during the last 10 day(s) which contain
a total of 27 files changed, 348 insertions(+), 93 deletions(-).
The main changes are:
1) sock_hash accounting fix, from Andrey.
2) libbpf fix and probe_mem sanitizing, from Andrii.
3) sock_hash fixes, from Jakub.
4) devmap_val fix, from Jesper.
5) load_bytes_relative fix, from YiFei.
====================
Signed-off-by: David S. Miller <[email protected]>
|
|
Looking into the context (atomic!) and the error message should be dropped.
Signed-off-by: Liao Pingfang <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull more Kbuild updates from Masahiro Yamada:
- fix build rules in binderfs sample
- fix build errors when Kbuild recurses to the top Makefile
- covert '---help---' in Kconfig to 'help'
* tag 'kbuild-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
treewide: replace '---help---' in Kconfig files with 'help'
kbuild: fix broken builds because of GZIP,BZIP2,LZOP variables
samples: binderfs: really compile this sample and fix build issues
|
|
Pull 9p update from Dominique Martinet:
"Another very quiet cycle... Only one commit: increase the size of the
ring used for xen transport"
* tag '9p-for-5.8' of git://github.com/martinetd/linux:
9p/xen: increase XEN_9PFS_RING_ORDER
|
|
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.
This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.
There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'
In order to convert all of them to 1 tab + 'help', I ran the
following commend:
$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada <[email protected]>
|
|
Add missed bpf_map_charge_init() in sock_hash_alloc() and
correspondingly bpf_map_charge_finish() on ENOMEM.
It was found accidentally while working on unrelated selftest that
checks "map->memory.pages > 0" is true for all map types.
Before:
# bpftool m l
...
3692: sockhash name m_sockhash flags 0x0
key 4B value 4B max_entries 8 memlock 0B
After:
# bpftool m l
...
84: sockmap name m_sockmap flags 0x0
key 4B value 4B max_entries 8 memlock 4096B
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Andrey Ignatov <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
The stream parser infrastructure isn't set up to deal with UDP
sockets, so we mustn't try to attach programs to them.
I remember making this change at some point, but I must have lost
it while rebasing or something similar.
Fixes: 7b98cd42b049 ("bpf: sockmap: Add UDP support")
Signed-off-by: Lorenz Bauer <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
If the peer is closed, we will never get more data, so
tcp_bpf_wait_data will get stuck forever. In case we passed
MSG_DONTWAIT to recv(), we get EAGAIN but we should actually get
0.
>From man 2 recv:
RETURN VALUE
When a stream socket peer has performed an orderly shutdown, the
return value will be 0 (the traditional "end-of-file" return).
This patch makes tcp_bpf_wait_data always return 1 when the peer
socket has been shutdown. Either we have data available, and it would
have returned 1 anyway, or there isn't, in which case we'll call
tcp_recvmsg which does the right thing in this situation.
Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Sabrina Dubroca <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
Link: https://lore.kernel.org/bpf/26038a28c21fea5d04d4bd4744c5686d3f2e5504.1591784177.git.sd@queasysnail.net
|
|
genl_family_rcv_msg_attrs_parse() and genl_family_rcv_msg_attrs_free()
take a boolean parameter to determine whether allocate/free the family
attrs. This is unnecessary as we can just check family->parallel_ops.
More importantly, callers would not need to worry about pairing these
parameters correctly after this patch.
And this fixes a memory leak, as after commit c36f05559104
("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()")
we call genl_family_rcv_msg_attrs_parse() for both parallel and
non-parallel cases.
Fixes: c36f05559104 ("genetlink: fix memory leaks in genl_family_rcv_msg_dumpit()")
Reported-by: Ido Schimmel <[email protected]>
Signed-off-by: Cong Wang <[email protected]>
Reviewed-by: Ido Schimmel <[email protected]>
Tested-by: Ido Schimmel <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
After looking up for the flowtable hooks that need to be removed,
release the hook objects in the deletion list. The error path needs to
released these hook objects too.
Fixes: abadb2f865d7 ("netfilter: nf_tables: delete devices from flowtable")
Reported-by: [email protected]
Signed-off-by: Pablo Neira Ayuso <[email protected]>
|
|
There's a race between the retransmission code and the received ACK parser.
The problem is that the retransmission loop has to drop the lock under
which it is iterating through the transmission buffer in order to transmit
a packet, but whilst the lock is dropped, the ACK parser can crank the Tx
window round and discard the packets from the buffer.
The retransmission code then updated the annotations for the wrong packet
and a later retransmission thought it had to retransmit a packet that
wasn't there, leading to a NULL pointer dereference.
Fix this by:
(1) Moving the annotation change to before we drop the lock prior to
transmission. This means we can't vary the annotation depending on
the outcome of the transmission, but that's fine - we'll retransmit
again later if it failed now.
(2) Skipping the packet if the skb pointer is NULL.
The following oops was seen:
BUG: kernel NULL pointer dereference, address: 000000000000002d
Workqueue: krxrpcd rxrpc_process_call
RIP: 0010:rxrpc_get_skb+0x14/0x8a
...
Call Trace:
rxrpc_resend+0x331/0x41e
? get_vtime_delta+0x13/0x20
rxrpc_process_call+0x3c0/0x4ac
process_one_work+0x18f/0x27f
worker_thread+0x1a3/0x247
? create_worker+0x17d/0x17d
kthread+0xe6/0xeb
? kthread_delayed_work_timer_fn+0x83/0x83
ret_from_fork+0x1f/0x30
Fixes: 248f219cb8bc ("rxrpc: Rewrite the data and ack handling code")
Signed-off-by: David Howells <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Propagate sock_alloc_send_skb error code, not set it to
EAGAIN unconditionally, when fail to allocate skb, which
might cause that user space unnecessary loops.
Fixes: 35fcde7f8deb ("xsk: support for Tx")
Signed-off-by: Li RongQing <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Acked-by: Björn Töpel <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
|
|
When a bearer is enabled, we create a 'tipc_discoverer' object to store
the bearer related data along with a timer and a preformatted discovery
message buffer for later probing... However, this is only carried after
the bearer was set 'up', that left a race condition resulting in kernel
panic.
It occurs when a discovery message from a peer node is received and
processed in bottom half (since the bearer is 'up' already) just before
the discoverer object is created but is now accessed in order to update
the preformatted buffer (with a new trial address, ...) so leads to the
NULL pointer dereference.
We solve the problem by simply moving the bearer 'up' setting to later,
so make sure everything is ready prior to any message receiving.
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Tuong Lien <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
syzbot found the following issue:
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...
This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send <= maxnagle' check became true.
We resolve the issue by explicitly checking if Nagle is enabled for the
socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We
also reinforce the function to against such a negative values if any.
Reported-by: [email protected]
Fixes: c0bceb97db9e ("tipc: add smart nagle feature")
Acked-by: Jon Maloy <[email protected]>
Signed-off-by: Tuong Lien <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
Pull NFS client updates from Anna Schumaker:
"New features and improvements:
- Sunrpc receive buffer sizes only change when establishing a GSS credentials
- Add more sunrpc tracepoints
- Improve on tracepoints to capture internal NFS I/O errors
Other bugfixes and cleanups:
- Move a dprintk() to after a call to nfs_alloc_fattr()
- Fix off-by-one issues in rpc_ntop6
- Fix a few coccicheck warnings
- Use the correct SPDX license identifiers
- Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
- Replace zero-length array with flexible array
- Remove duplicate headers
- Set invalid blocks after NFSv4 writes to update space_used attribute
- Fix direct WRITE throughput regression"
* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
NFS: Fix direct WRITE throughput regression
SUNRPC: rpc_xprt lifetime events should record xprt->state
xprtrdma: Make xprt_rdma_slot_table_entries static
nfs: set invalid blocks after NFSv4 writes
NFS: remove redundant initialization of variable result
sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
NFS: Add a tracepoint in nfs_set_pgio_error()
NFS: Trace short NFS READs
NFS: nfs_xdr_status should record the procedure name
SUNRPC: Set SOFTCONN when destroying GSS contexts
SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
SUNRPC: trace RPC client lifetime events
SUNRPC: Trace transport lifetime events
SUNRPC: Split the xdr_buf event class
SUNRPC: Add tracepoint to rpc_call_rpcerror()
SUNRPC: Update the RPC_SHOW_SOCKET() macro
SUNRPC: Update the rpc_show_task_flags() macro
SUNRPC: Trace GSS context lifetimes
SUNRPC: receive buffer size estimation values almost never change
...
|
|
Fix the following sparse warning:
net/sunrpc/xprtrdma/transport.c:71:14: warning: symbol 'xprt_rdma_slot_table_entries'
was not declared. Should it be static?
Reported-by: Hulk Robot <[email protected]>
Signed-off-by: Zou Wei <[email protected]>
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
sysfs
When I cat parameter
'/sys/module/sunrpc/parameters/auth_hashtable_size', it displays as
follows. It is better to add a newline for easy reading.
[root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/auth_hashtable_size
16[root@hulk-202 ~]#
Signed-off-by: Xiongfeng Wang <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Move the RPC_TASK_SOFTCONN flag into rpc_call_null_helper(). The
only minor behavior change is that it is now also set when
destroying GSS contexts.
This gives a better guarantee that gss_send_destroy_context() will
not hang for long if a connection cannot be established.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Clean up.
All of rpc_call_null_helper() call sites assert RPC_TASK_SOFT, so
move that setting into rpc_call_null_helper() itself.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Clean up.
Commit a52458b48af1 ("NFS/NFSD/SUNRPC: replace generic creds with
'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS
unconditionally. Therefore there's no need for
rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
The "create" tracepoint records parts of the rpc_create arguments,
and the shutdown tracepoint records when the rpc_clnt is about to
signal pending tasks and destroy auths.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:
- Enable tracing of xprt lifetime events for the socket transport
types
- Expose the different types of disconnect to help run down
issues with lingering connections
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Add a tracepoint in another common exit point for failing RPCs.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Avoid unnecessary cache sloshing by placing the buffer size
estimation update logic behind an atomic bit flag.
The size of GSS information included in each wrapped Reply does
not change during the lifetime of a GSS context. Therefore, the
au_rslack and au_ralign fields need to be updated only once after
establishing a fresh GSS credential.
Thus a slack size update must occur after a cred is created,
duplicated, renewed, or expires. I'm not sure I have this exactly
right. A trace point is introduced to track updates to these
variables to enable troubleshooting the problem if I missed a spot.
Signed-off-by: Chuck Lever <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
|
|
Pull nfsd updates from Bruce Fields:
"Highlights:
- Keep nfsd clients from unnecessarily breaking their own
delegations.
Note this requires a small kthreadd addition. The result is Tejun
Heo's suggestion (see link), and he was OK with this going through
my tree.
- Patch nfsd/clients/ to display filenames, and to fix byte-order
when displaying stateid's.
- fix a module loading/unloading bug, from Neil Brown.
- A big series from Chuck Lever with RPC/RDMA and tracing
improvements, and lay some groundwork for RPC-over-TLS"
Link: https://lore.kernel.org/r/[email protected]
* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
sunrpc: use kmemdup_nul() in gssp_stringify()
nfsd: safer handling of corrupted c_type
nfsd4: make drc_slab global, not per-net
SUNRPC: Remove unreachable error condition in rpcb_getport_async()
nfsd: Fix svc_xprt refcnt leak when setup callback client failed
sunrpc: clean up properly in gss_mech_unregister()
sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
sunrpc: check that domain table is empty at module unload.
NFSD: Fix improperly-formatted Doxygen comments
NFSD: Squash an annoying compiler warning
SUNRPC: Clean up request deferral tracepoints
NFSD: Add tracepoints for monitoring NFSD callbacks
NFSD: Add tracepoints to the NFSD state management code
NFSD: Add tracepoints to NFSD's duplicate reply cache
SUNRPC: svc_show_status() macro should have enum definitions
SUNRPC: Restructure svc_udp_recvfrom()
SUNRPC: Refactor svc_recvfrom()
SUNRPC: Clean up svc_release_skb() functions
SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
SUNRPC: Replace dprintk() call sites in TCP receive path
...
|
|
Added a check in the switch case on start_header that checks for
the existence of the header, and in the case that MAC is not set
and the caller requests for MAC, -EFAULT. If the caller requests
for NET then MAC's existence is completely ignored.
There is no function to check NET header's existence and as far
as cgroup_skb/egress is concerned it should always be set.
Removed for ptr >= the start of header, considering offset is
bounded unsigned and should always be true. len <= end - mac is
redundant to ptr + len <= end.
Fixes: 3eee1f75f2b9 ("bpf: fix bpf_skb_load_bytes_relative pkt length check")
Signed-off-by: YiFei Zhu <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
Reviewed-by: Stanislav Fomichev <[email protected]>
Link: https://lore.kernel.org/bpf/76bb820ddb6a95f59a772ecbd8c8a336f646b362.1591812755.git.zhuyifei@google.com
|
|
If a listening MPTCP socket has unaccepted sockets at close
time, the related msks are freed via mptcp_sock_destruct(),
which in turn does not invoke the proto->destroy() method
nor the mptcp_token_destroy() function.
Due to the above, the child msk socket is not removed from
the token container, leading to later UaF.
Address the issue explicitly removing the token even in the
above error path.
Fixes: 79c0949e9a09 ("mptcp: Add key generation and token tree")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull sysctl fixes from Al Viro:
"Fixups to regressions in sysctl series"
* 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysctl: reject gigantic reads/write to sysctl files
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
trace: fix an incorrect __user annotation on stack_trace_sysctl
random: fix an incorrect __user annotation on proc_do_entropy
net/sysctl: remove leftover __user annotations on neigh_proc_dointvec*
net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/will/linux
Pull READ/WRITE_ONCE rework from Will Deacon:
"This the READ_ONCE rework I've been working on for a while, which
bumps the minimum GCC version and improves code-gen on arm64 when
stack protector is enabled"
[ Side note: I'm _really_ tempted to raise the minimum gcc version to
4.9, so that we can just say that we require _Generic() support.
That would allow us to more cleanly handle a lot of the cases where we
depend on very complex macros with 'sizeof' or __builtin_choose_expr()
with __builtin_types_compatible_p() etc.
This branch has a workaround for sparse not handling _Generic(),
either, but that was already fixed in the sparse development branch,
so it's really just gcc-4.9 that we'd require. - Linus ]
* 'rwonce/rework' of git://git.kernel.org/pub/scm/linux/kernel/git/will/linux:
compiler_types.h: Use unoptimized __unqual_scalar_typeof for sparse
compiler_types.h: Optimize __unqual_scalar_typeof compilation time
compiler.h: Enforce that READ_ONCE_NOCHECK() access size is sizeof(long)
compiler-types.h: Include naked type in __pick_integer_type() match
READ_ONCE: Fix comment describing 2x32-bit atomicity
gcov: Remove old GCC 3.4 support
arm64: barrier: Use '__unqual_scalar_typeof' for acquire/release macros
locking/barriers: Use '__unqual_scalar_typeof' for load-acquire macros
READ_ONCE: Drop pointer qualifiers when reading from scalar types
READ_ONCE: Enforce atomicity for {READ,WRITE}_ONCE() memory accesses
READ_ONCE: Simplify implementations of {READ,WRITE}_ONCE()
arm64: csum: Disable KASAN for do_csum()
fault_inject: Don't rely on "return value" from WRITE_ONCE()
net: tls: Avoid assigning 'const' pointer to non-const pointer
netfilter: Avoid assigning 'const' pointer to non-const pointer
compiler/gcc: Raise minimum GCC version for kernel builds to 4.8
|
|
The msk sk_shutdown flag is set by a workqueue, possibly
introducing some delay in user-space notification. If the last
subflow carries some data with the fin packet, the user space
can wake-up before RCV_SHUTDOWN is set. If it executes unblocking
recvmsg(), it may return with an error instead of eof.
Address the issue explicitly checking for eof in recvmsg(), when
no data is found.
Fixes: 59832e246515 ("mptcp: subflow: check parent mptcp socket on subflow state change")
Signed-off-by: Paolo Abeni <[email protected]>
Reviewed-by: Matthieu Baerts <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|
|
fdb nexthops are marked with a flag. For standalone nexthops, a flag was
added to the nh_info struct. For groups that flag was added to struct
nexthop when it should have been added to the group information. Fix
by removing the flag from the nexthop struct and adding a flag to nh_group
that mirrors nh_info and is really only a caching of the individual types.
Add a helper, nexthop_is_fdb, for use by the vxlan code and fixup the
internal code to use the flag from either nh_info or nh_group.
v2
- propagate fdb_nh in remove_nh_grp_entry
Fixes: 38428d68719c ("nexthop: support for fdb ecmp nexthops")
Cc: Roopa Prabhu <[email protected]>
Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
|