Age | Commit message (Collapse) | Author | Files | Lines |
|
Bartosz Golaszewski says:
====================
net: stmmac: introduce devres helpers for stmmac platform drivers
The goal of this series is two-fold: to make the API for stmmac platforms more
logically correct (by providing functions that acquire resources with release
counterparts that undo only their actions and nothing more) and to provide
devres variants of commonly use registration functions that allows to
significantly simplify the platform drivers.
The current pattern for stmmac platform drivers is to call
stmmac_probe_config_dt(), possibly the platform's init() callback and then
call stmmac_drv_probe(). The resources allocated by these calls will then
be released by calling stmmac_pltfr_remove(). This goes against the commonly
accepted way of providing each function that allocated a resource with a
function that frees it.
First: provide wrappers around platform's init() and exit() callbacks that
allow users to skip checking if the callbacks exist manually.
Second: provide stmmac_pltfr_probe() which calls the platform init() callback
and then calls stmmac_drv_probe() together with a variant of
stmmac_pltfr_remove() that DOES NOT call stmmac_remove_config_dt(). For now
this variant is called stmmac_pltfr_remove_no_dt() but once all users of
the old stmmac_pltfr_remove() are converted to the devres helper, it will be
renamed back to stmmac_pltfr_remove() and the no_dt function removed.
Finally use the devres helpers in dwmac-qco-ethqos to show how much simplier
the driver's probe() becomes.
This series obviously just starts the conversion process and other platform
drivers will need to be converted once the helpers land in net/.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Use the devres variant of stmmac_pltfr_probe() and finally drop the
remove() callback entirely.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Provide a devres variant of stmmac_pltfr_probe() which allows users to
skip calling stmmac_pltfr_remove() at driver detach.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Significantly simplify the driver's probe() function by using the devres
variant of stmmac_probe_config_dt(). This allows to drop the goto jumps
entirely.
The remove_new() callback now needs to be switched to
stmmac_pltfr_remove_no_dt().
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Provide a devres variant of stmmac_probe_config_dt() that allows users to
skip calling stmmac_remove_config_dt() at driver detach.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Add a variant of stmmac_pltfr_remove() that only frees resources
allocated by stmmac_pltfr_probe() and - unlike stmmac_pltfr_remove() -
does not call stmmac_remove_config_dt().
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Shrink the code and remove labels by using the new stmmac_pltfr_probe()
function.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Implement stmmac_pltfr_probe() which is the logical API counterpart
for stmmac_pltfr_remove(). It calls the platform's init() callback and
then probes the stmmac device.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Shrink the code in dwmac-generic by using the new stmmac_pltfr_exit()
helper.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Provide a helper wrapper around calling the platform's exit() callback.
This allows users to skip checking if the callback exists.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Shrink the code in dwmac-generic by using the new stmmac_pltfr_init()
helper.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Provide a helper wrapper around calling the platform's init() callback.
This allows users to skip checking if the callback exists.
Signed-off-by: Bartosz Golaszewski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-06-22 (ice)
This series contains updates to ice driver only.
Jake adds a slight wait on control queue send to reduce wait time for
responses that occur within normal times.
Maciej allows for hot-swapping XDP programs.
Przemek removes unnecessary checks when enabling SR-IOV and freeing
allocated memory.
Christophe Jaillet converts a managed memory allocation to a regular one.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: use ice_down_up() where applicable
ice: Remove managed memory usage in ice_get_fw_log_cfg()
ice: remove null checks before devm_kfree() calls
ice: clean up freeing SR-IOV VFs
ice: allow hot-swapping XDP programs
ice: reduce initial wait for control queue messages
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Even when tcp_splice_read() reads all it was asked for, for blocking
sockets it'll release and immediately regrab the socket lock, loop
around and break on the while check.
Check tss.len right after we adjust it, and return if we're done.
That saves us one release_sock(); lock_sock(); pair per successful
blocking splice read.
Signed-off-by: Pavel Begunkov <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Link: https://lore.kernel.org/r/80736a2cc6d478c383ea565ba825eaf4d1abd876.1687523671.git.asml.silence@gmail.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
syzkaller reported use-after-free in __gtp_encap_destroy(). [0]
It shows the same process freed sk and touched it illegally.
Commit e198987e7dd7 ("gtp: fix suspicious RCU usage") added lock_sock()
and release_sock() in __gtp_encap_destroy() to protect sk->sk_user_data,
but release_sock() is called after sock_put() releases the last refcnt.
[0]:
BUG: KASAN: slab-use-after-free in instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
BUG: KASAN: slab-use-after-free in atomic_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:541 [inline]
BUG: KASAN: slab-use-after-free in queued_spin_lock include/asm-generic/qspinlock.h:111 [inline]
BUG: KASAN: slab-use-after-free in do_raw_spin_lock include/linux/spinlock.h:186 [inline]
BUG: KASAN: slab-use-after-free in __raw_spin_lock_bh include/linux/spinlock_api_smp.h:127 [inline]
BUG: KASAN: slab-use-after-free in _raw_spin_lock_bh+0x75/0xe0 kernel/locking/spinlock.c:178
Write of size 4 at addr ffff88800dbef398 by task syz-executor.2/2401
CPU: 1 PID: 2401 Comm: syz-executor.2 Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x72/0xa0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:351 [inline]
print_report+0xcc/0x620 mm/kasan/report.c:462
kasan_report+0xb2/0xe0 mm/kasan/report.c:572
check_region_inline mm/kasan/generic.c:181 [inline]
kasan_check_range+0x39/0x1c0 mm/kasan/generic.c:187
instrument_atomic_read_write include/linux/instrumented.h:96 [inline]
atomic_try_cmpxchg_acquire include/linux/atomic/atomic-instrumented.h:541 [inline]
queued_spin_lock include/asm-generic/qspinlock.h:111 [inline]
do_raw_spin_lock include/linux/spinlock.h:186 [inline]
__raw_spin_lock_bh include/linux/spinlock_api_smp.h:127 [inline]
_raw_spin_lock_bh+0x75/0xe0 kernel/locking/spinlock.c:178
spin_lock_bh include/linux/spinlock.h:355 [inline]
release_sock+0x1f/0x1a0 net/core/sock.c:3526
gtp_encap_disable_sock drivers/net/gtp.c:651 [inline]
gtp_encap_disable+0xb9/0x220 drivers/net/gtp.c:664
gtp_dev_uninit+0x19/0x50 drivers/net/gtp.c:728
unregister_netdevice_many_notify+0x97e/0x1520 net/core/dev.c:10841
rtnl_delete_link net/core/rtnetlink.c:3216 [inline]
rtnl_dellink+0x3c0/0xb30 net/core/rtnetlink.c:3268
rtnetlink_rcv_msg+0x450/0xb10 net/core/rtnetlink.c:6423
netlink_rcv_skb+0x15d/0x450 net/netlink/af_netlink.c:2548
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0x700/0x930 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x91c/0xe30 net/netlink/af_netlink.c:1913
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg+0x1b7/0x200 net/socket.c:747
____sys_sendmsg+0x75a/0x990 net/socket.c:2493
___sys_sendmsg+0x11d/0x1c0 net/socket.c:2547
__sys_sendmsg+0xfe/0x1d0 net/socket.c:2576
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f1168b1fe5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f1167edccc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004bbf80 RCX: 00007f1168b1fe5d
RDX: 0000000000000000 RSI: 00000000200002c0 RDI: 0000000000000003
RBP: 00000000004bbf80 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f1168b80530 R15: 0000000000000000
</TASK>
Allocated by task 1483:
kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
__kasan_slab_alloc+0x59/0x70 mm/kasan/common.c:328
kasan_slab_alloc include/linux/kasan.h:186 [inline]
slab_post_alloc_hook mm/slab.h:711 [inline]
slab_alloc_node mm/slub.c:3451 [inline]
slab_alloc mm/slub.c:3459 [inline]
__kmem_cache_alloc_lru mm/slub.c:3466 [inline]
kmem_cache_alloc+0x16d/0x340 mm/slub.c:3475
sk_prot_alloc+0x5f/0x280 net/core/sock.c:2073
sk_alloc+0x34/0x6c0 net/core/sock.c:2132
inet6_create net/ipv6/af_inet6.c:192 [inline]
inet6_create+0x2c7/0xf20 net/ipv6/af_inet6.c:119
__sock_create+0x2a1/0x530 net/socket.c:1535
sock_create net/socket.c:1586 [inline]
__sys_socket_create net/socket.c:1623 [inline]
__sys_socket_create net/socket.c:1608 [inline]
__sys_socket+0x137/0x250 net/socket.c:1651
__do_sys_socket net/socket.c:1664 [inline]
__se_sys_socket net/socket.c:1662 [inline]
__x64_sys_socket+0x72/0xb0 net/socket.c:1662
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
Freed by task 2401:
kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2e/0x50 mm/kasan/generic.c:521
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free mm/kasan/common.c:200 [inline]
__kasan_slab_free+0x10c/0x1b0 mm/kasan/common.c:244
kasan_slab_free include/linux/kasan.h:162 [inline]
slab_free_hook mm/slub.c:1781 [inline]
slab_free_freelist_hook mm/slub.c:1807 [inline]
slab_free mm/slub.c:3786 [inline]
kmem_cache_free+0xb4/0x490 mm/slub.c:3808
sk_prot_free net/core/sock.c:2113 [inline]
__sk_destruct+0x500/0x720 net/core/sock.c:2207
sk_destruct+0xc1/0xe0 net/core/sock.c:2222
__sk_free+0xed/0x3d0 net/core/sock.c:2233
sk_free+0x7c/0xa0 net/core/sock.c:2244
sock_put include/net/sock.h:1981 [inline]
__gtp_encap_destroy+0x165/0x1b0 drivers/net/gtp.c:634
gtp_encap_disable_sock drivers/net/gtp.c:651 [inline]
gtp_encap_disable+0xb9/0x220 drivers/net/gtp.c:664
gtp_dev_uninit+0x19/0x50 drivers/net/gtp.c:728
unregister_netdevice_many_notify+0x97e/0x1520 net/core/dev.c:10841
rtnl_delete_link net/core/rtnetlink.c:3216 [inline]
rtnl_dellink+0x3c0/0xb30 net/core/rtnetlink.c:3268
rtnetlink_rcv_msg+0x450/0xb10 net/core/rtnetlink.c:6423
netlink_rcv_skb+0x15d/0x450 net/netlink/af_netlink.c:2548
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0x700/0x930 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x91c/0xe30 net/netlink/af_netlink.c:1913
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg+0x1b7/0x200 net/socket.c:747
____sys_sendmsg+0x75a/0x990 net/socket.c:2493
___sys_sendmsg+0x11d/0x1c0 net/socket.c:2547
__sys_sendmsg+0xfe/0x1d0 net/socket.c:2576
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
The buggy address belongs to the object at ffff88800dbef300
which belongs to the cache UDPv6 of size 1344
The buggy address is located 152 bytes inside of
freed 1344-byte region [ffff88800dbef300, ffff88800dbef840)
The buggy address belongs to the physical page:
page:00000000d31bfed5 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88800dbeed40 pfn:0xdbe8
head:00000000d31bfed5 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff888008ee0801
flags: 0x100000000010200(slab|head|node=0|zone=1)
page_type: 0xffffffff()
raw: 0100000000010200 ffff88800c7a3000 dead000000000122 0000000000000000
raw: ffff88800dbeed40 0000000080160015 00000001ffffffff ffff888008ee0801
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88800dbef280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88800dbef300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88800dbef380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88800dbef400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff88800dbef480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: e198987e7dd7 ("gtp: fix suspicious RCU usage")
Reported-by: syzkaller <[email protected]>
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Pablo Neira Ayuso <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
On systems where netdevsim is built-in or loaded before the test
starts, kci_test_ipsec_offload doesn't remove the netdevsim device it
created during the test.
Fixes: e05b2d141fef ("netdevsim: move netdev creation/destruction to dev probe")
Signed-off-by: Sabrina Dubroca <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Link: https://lore.kernel.org/r/e1cb94f4f82f4eca4a444feec4488a1323396357.1687466906.git.sd@queasysnail.net
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
syzkaller hit a WARN_ON_ONCE(!scm->pid) in scm_pidfd_recv().
In unix_stream_read_generic(), if there is no skb in the queue, we could
bail out the do-while loop without calling scm_set_cred():
1. No skb in the queue
2. sk is non-blocking
or
shutdown(sk, RCV_SHUTDOWN) is called concurrently
or
peer calls close()
If the socket is configured with SO_PASSCRED or SO_PASSPIDFD, scm_recv()
would populate cmsg with garbage.
Let's not call scm_recv() unless there is skb to receive.
WARNING: CPU: 1 PID: 3245 at include/net/scm.h:138 scm_pidfd_recv include/net/scm.h:138 [inline]
WARNING: CPU: 1 PID: 3245 at include/net/scm.h:138 scm_recv.constprop.0+0x754/0x850 include/net/scm.h:177
Modules linked in:
CPU: 1 PID: 3245 Comm: syz-executor.1 Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:scm_pidfd_recv include/net/scm.h:138 [inline]
RIP: 0010:scm_recv.constprop.0+0x754/0x850 include/net/scm.h:177
Code: 67 fd e9 55 fd ff ff e8 4a 70 67 fd e9 7f fd ff ff e8 40 70 67 fd e9 3e fb ff ff e8 36 70 67 fd e9 02 fd ff ff e8 8c 3a 20 fd <0f> 0b e9 fe fb ff ff e8 50 70 67 fd e9 2e f9 ff ff e8 46 70 67 fd
RSP: 0018:ffffc90009af7660 EFLAGS: 00010216
RAX: 00000000000000a1 RBX: ffff888041e58a80 RCX: ffffc90003852000
RDX: 0000000000040000 RSI: ffffffff842675b4 RDI: 0000000000000007
RBP: ffffc90009af7810 R08: 0000000000000007 R09: 0000000000000013
R10: 00000000000000f8 R11: 0000000000000001 R12: ffffc90009af7db0
R13: 0000000000000000 R14: ffff888041e58a88 R15: 1ffff9200135eecc
FS: 00007f6b7113f640(0000) GS:ffff88806cf00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6b7111de38 CR3: 0000000012a6e002 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<TASK>
unix_stream_read_generic+0x5fe/0x1f50 net/unix/af_unix.c:2830
unix_stream_recvmsg+0x194/0x1c0 net/unix/af_unix.c:2880
sock_recvmsg_nosec net/socket.c:1019 [inline]
sock_recvmsg+0x188/0x1d0 net/socket.c:1040
____sys_recvmsg+0x210/0x610 net/socket.c:2712
___sys_recvmsg+0xff/0x190 net/socket.c:2754
do_recvmmsg+0x25d/0x6c0 net/socket.c:2848
__sys_recvmmsg net/socket.c:2927 [inline]
__do_sys_recvmmsg net/socket.c:2950 [inline]
__se_sys_recvmmsg net/socket.c:2943 [inline]
__x64_sys_recvmmsg+0x224/0x290 net/socket.c:2943
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f6b71da2e5d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48
RSP: 002b:00007f6b7113ecc8 EFLAGS: 00000246 ORIG_RAX: 000000000000012b
RAX: ffffffffffffffda RBX: 00000000004bc050 RCX: 00007f6b71da2e5d
RDX: 0000000000000007 RSI: 0000000020006600 RDI: 000000000000000b
RBP: 00000000004bc050 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000120 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000006e R14: 00007f6b71e03530 R15: 0000000000000000
</TASK>
Fixes: 5e2ff6704a27 ("scm: add SO_PASSPIDFD and SCM_PIDFD")
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzkaller <[email protected]>
Signed-off-by: Kuniyuki Iwashima <[email protected]>
Reviewed-by: Alexander Mikhalitsyn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
In blamed commit, I missed that get_dist_table() was allocating
memory using GFP_KERNEL, and acquiring qdisc lock to perform
the swap of newly allocated table with current one.
In this patch, get_dist_table() is allocating memory and
copy user data before we acquire the qdisc lock.
Then we perform swap operations while being protected by the lock.
Note that after this patch netem_change() no longer can do partial changes.
If an error is returned, qdisc conf is left unchanged.
Fixes: 2174a08db80d ("sch_netem: acquire qdisc lock in netem_change()")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Acked-by: Jamal Hadi Salim <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2023-06-22 (iavf)
This series contains updates to iavf driver only.
Przemek defers removing, previous, primary MAC address until after
getting result of adding its replacement. He also does some cleanup by
removing unused functions and making applicable functions static.
* '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
iavf: make functions static where possible
iavf: remove some unused functions and pointless wrappers
iavf: fix err handling for MAC replace
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The referenced patch is causing build errors when ETHERNET=y and
FDDI=m. While we work out the preferred patch(es), revert this patch
to make the pain go away.
Fixes: 128272336120 ("s390/net: lcs: use IS_ENABLED() for kconfig detection")
Signed-off-by: Randy Dunlap <[email protected]>
Reported-by: kernel test robot <[email protected]>
Link: lore.kernel.org/r/[email protected]
Cc: Alexandra Winter <[email protected]>
Cc: Wenjia Zhang <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Alexander Gordeev <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Sven Schnelle <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Linux provides phy_set_bits() helper so let's drop brcm_phy_setbits() and
use phy_set_bits() in its place.
Signed-off-by: Giulio Benetti <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
igc: TX timestamping fixes
This is the fixes part of the series intended to add support for using
the 4 timestamp registers present in i225/i226.
Moving the timestamp handling to be inline with the interrupt handling
has the advantage of improving the TX timestamping retrieval latency,
here are some numbers using ntpperf:
Before:
$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37
| responses | TX timestamp offset (ns)
rate clients | lost invalid basic xleave | min mean max stddev
1000 100 0.00% 0.00% 0.00% 100.00% -56 +9 +52 19
1500 150 0.00% 0.00% 0.00% 100.00% -40 +30 +75 22
2250 225 0.00% 0.00% 0.00% 100.00% -11 +29 +72 15
3375 337 0.00% 0.00% 0.00% 100.00% -18 +40 +88 22
5062 506 0.00% 0.00% 0.00% 100.00% -19 +23 +77 15
7593 759 0.00% 0.00% 0.00% 100.00% +7 +47 +5168 43
11389 1138 0.00% 0.00% 0.00% 100.00% -11 +41 +5240 39
17083 1708 0.00% 0.00% 0.00% 100.00% +19 +60 +5288 50
25624 2562 0.00% 0.00% 0.00% 100.00% +1 +56 +5368 58
38436 3843 0.00% 0.00% 0.00% 100.00% -84 +12 +8847 66
57654 5765 0.00% 0.00% 100.00% 0.00%
86481 8648 0.00% 0.00% 100.00% 0.00%
129721 12972 0.00% 0.00% 100.00% 0.00%
194581 16384 0.00% 0.00% 100.00% 0.00%
291871 16384 27.35% 0.00% 72.65% 0.00%
437806 16384 50.05% 0.00% 49.95% 0.00%
After:
$ sudo ./ntpperf -i enp3s0 -m 10:22:22:22:22:21 -d 192.168.1.3 -s 172.18.0.0/16 -I -H -o -37
| responses | TX timestamp offset (ns)
rate clients | lost invalid basic xleave | min mean max stddev
1000 100 0.00% 0.00% 0.00% 100.00% -44 +0 +61 19
1500 150 0.00% 0.00% 0.00% 100.00% -6 +39 +81 16
2250 225 0.00% 0.00% 0.00% 100.00% -22 +25 +69 15
3375 337 0.00% 0.00% 0.00% 100.00% -28 +15 +56 14
5062 506 0.00% 0.00% 0.00% 100.00% +7 +78 +143 27
7593 759 0.00% 0.00% 0.00% 100.00% -54 +24 +144 47
11389 1138 0.00% 0.00% 0.00% 100.00% -90 -33 +28 21
17083 1708 0.00% 0.00% 0.00% 100.00% -50 -2 +35 14
25624 2562 0.00% 0.00% 0.00% 100.00% -62 +7 +66 23
38436 3843 0.00% 0.00% 0.00% 100.00% -33 +30 +5395 36
57654 5765 0.00% 0.00% 100.00% 0.00%
86481 8648 0.00% 0.00% 100.00% 0.00%
129721 12972 0.00% 0.00% 100.00% 0.00%
194581 16384 19.50% 0.00% 80.50% 0.00%
291871 16384 35.81% 0.00% 64.19% 0.00%
437806 16384 55.40% 0.00% 44.60% 0.00%
During this series, and to show that as is always the case, things are
never easy as they should be, a hardware issue was found, and it took
some time to find the workaround(s). The bug and workaround are better
explained in patch 4/4.
Note: the workaround has a simpler alternative, but it would involve
adding support for the other timestamp registers, and only using the
TXSTMP{H/L}_0 as a way to clear the interrupt. But I feel bad about
throwing this kind of resources away. Didn't test this extensively but
it should work.
Also, as Marc Kleine-Budde suggested, after some consensus is reached
on this series, most parts of it will be proposed for igb.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
igc: Work around HW bug causing missing timestamps
igc: Retrieve TX timestamp during interrupt handling
igc: Check if hardware TX timestamping is enabled earlier
igc: Fix race condition in PTP tx code
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2023-06-23
We've added 49 non-merge commits during the last 24 day(s) which contain
a total of 70 files changed, 1935 insertions(+), 442 deletions(-).
The main changes are:
1) Extend bpf_fib_lookup helper to allow passing the route table ID,
from Louis DeLosSantos.
2) Fix regsafe() in verifier to call check_ids() for scalar registers,
from Eduard Zingerman.
3) Extend the set of cpumask kfuncs with bpf_cpumask_first_and()
and a rework of bpf_cpumask_any*() kfuncs. Additionally,
add selftests, from David Vernet.
4) Fix socket lookup BPF helpers for tc/XDP to respect VRF bindings,
from Gilad Sever.
5) Change bpf_link_put() to use workqueue unconditionally to fix it
under PREEMPT_RT, from Sebastian Andrzej Siewior.
6) Follow-ups to address issues in the bpf_refcount shared ownership
implementation, from Dave Marchevsky.
7) A few general refactorings to BPF map and program creation permissions
checks which were part of the BPF token series, from Andrii Nakryiko.
8) Various fixes for benchmark framework and add a new benchmark
for BPF memory allocator to BPF selftests, from Hou Tao.
9) Documentation improvements around iterators and trusted pointers,
from Anton Protopopov.
10) Small cleanup in verifier to improve allocated object check,
from Daniel T. Lee.
11) Improve performance of bpf_xdp_pointer() by avoiding access
to shared_info when XDP packet does not have frags,
from Jesper Dangaard Brouer.
12) Silence a harmless syzbot-reported warning in btf_type_id_size(),
from Yonghong Song.
13) Remove duplicate bpfilter_umh_cleanup in favor of umd_cleanup_helper,
from Jarkko Sakkinen.
14) Fix BPF selftests build for resolve_btfids under custom HOSTCFLAGS,
from Viktor Malik.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (49 commits)
bpf, docs: Document existing macros instead of deprecated
bpf, docs: BPF Iterator Document
selftests/bpf: Fix compilation failure for prog vrf_socket_lookup
selftests/bpf: Add vrf_socket_lookup tests
bpf: Fix bpf socket lookup from tc/xdp to respect socket VRF bindings
bpf: Call __bpf_sk_lookup()/__bpf_skc_lookup() directly via TC hookpoint
bpf: Factor out socket lookup functions for the TC hookpoint.
selftests/bpf: Set the default value of consumer_cnt as 0
selftests/bpf: Ensure that next_cpu() returns a valid CPU number
selftests/bpf: Output the correct error code for pthread APIs
selftests/bpf: Use producer_cnt to allocate local counter array
xsk: Remove unused inline function xsk_buff_discard()
bpf: Keep BPF_PROG_LOAD permission checks clear of validations
bpf: Centralize permissions checks for all BPF map types
bpf: Inline map creation logic in map_create() function
bpf: Move unprivileged checks into map_create() and bpf_prog_load()
bpf: Remove in_atomic() from bpf_link_put().
selftests/bpf: Verify that check_ids() is used for scalars in regsafe()
bpf: Verify scalar ids mapping in regsafe() using check_ids()
selftests/bpf: Check if mark_chain_precision() follows scalar ids
...
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
This is a small step towards a model where GUP itself would not expand
the stack, and any user that needs GUP to not look up existing mappings,
but actually expand on them, would have to do so manually before-hand,
and with the mm lock held for writing.
It turns out that execve() already did almost exactly that, except it
didn't take the mm lock at all (it's single-threaded so no locking
technically needed, but it could cause lockdep errors). And it only did
it for the CONFIG_STACK_GROWSUP case, since in that case GUP has
obviously never expanded the stack downwards.
So just make that CONFIG_STACK_GROWSUP case do the right thing with
locking, and enable it generally. This will eventually help GUP, and in
the meantime avoids a special case and the lockdep issue.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make calls to extend_vma() and find_extend_vma() fail if the write lock
is required.
To avoid making this a flag-day event, this still allows the old
read-locking case for the trivial situations, and passes in a flag to
say "is it write-locked". That way write-lockers can say "yes, I'm
being careful", and legacy users will continue to work in all the common
cases until they have been fully converted to the new world order.
Co-Developed-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Liam R. Howlett <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is one of the simple cases, except there's no pt_regs pointer.
Which is fine, as lock_mm_and_find_vma() is set up to work fine with a
NULL pt_regs.
Powerpc already enabled LOCK_MM_AND_FIND_VMA for the main CPU faulting,
so we can just use the helper without any extra work.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This does the simple pattern conversion of alpha, arc, csky, hexagon,
loongarch, nios2, sh, sparc32, and xtensa to the lock_mm_and_find_vma()
helper. They all have the regular fault handling pattern without odd
special cases.
The remaining architectures all have something that keeps us from a
straightforward conversion: ia64 and parisc have stacks that can grow
both up as well as down (and ia64 has special address region checks).
And m68k, microblaze, openrisc, sparc64, and um end up having extra
rules about only expanding the stack down a limited amount below the
user space stack pointer. That is something that x86 used to do too
(long long ago), and it probably could just be skipped, but it still
makes the conversion less than trivial.
Note that this conversion was done manually and with the exception of
alpha without any build testing, because I have a fairly limited cross-
building environment. The cases are all simple, and I went through the
changes several times, but...
Signed-off-by: Linus Torvalds <[email protected]>
|
|
arm has an additional check for address < FIRST_USER_ADDRESS before
expanding the stack. Since FIRST_USER_ADDRESS is defined everywhere
(generally as 0), move that check to the generic expand_downwards().
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Ben Hutchings <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Signed-off-by: Michael Ellerman <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This converts arm64 to use the new page fault helper. It was very
straightforward, but still needed a fix for the "obvious" conversion I
initially did. Thanks to Suren for the fix and testing.
Fixed-and-tested-by: Suren Baghdasaryan <[email protected]>
Unnecessary-code-removal-by: Liam R. Howlett <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is done as a separate patch from introducing the new
lock_mm_and_find_vma() helper, because while it's an obvious change,
it's not what x86 used to do in this area.
We already abort the page fault on fatal signals anyway, so why should
we wait for the mmap lock only to then abort later? With the new helper
function that returns without the lock held on failure anyway, this is
particularly easy and straightforward.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
.. and make x86 use it.
This basically extracts the existing x86 "find and expand faulting vma"
code, but extends it to also take the mmap lock for writing in case we
actually do need to expand the vma.
We've historically short-circuited that case, and have some rather ugly
special logic to serialize the stack segment expansion (since we only
hold the mmap lock for reading) that doesn't match the normal VM
locking.
That slight violation of locking worked well, right up until it didn't:
the maple tree code really does want proper locking even for simple
extension of an existing vma.
So extract the code for "look up the vma of the fault" from x86, fix it
up to do the necessary write locking, and make it available as a helper
function for other architectures that can use the common helper.
Note: I say "common helper", but it really only handles the normal
stack-grows-down case. Which is all architectures except for PA-RISC
and IA64. So some rare architectures can't use the helper, but if they
care they'll just need to open-code this logic.
It's also worth pointing out that this code really would like to have an
optimistic "mmap_upgrade_trylock()" to make it quicker to go from a
read-lock (for the common case) to taking the write lock (for having to
extend the vma) in the normal single-threaded situation where there is
no other locking activity.
But that _is_ all the very uncommon special case, so while it would be
nice to have such an operation, it probably doesn't matter in reality.
I did put in the skeleton code for such a possible future expansion,
even if it only acts as pseudo-documentation for what we're doing.
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a potential OOB read at fast_imageblit, for
"colortab[(*src >> 4)]" can become a negative value due to
"const char *s = image->data, *src".
This change makes sure the index for colortab always positive
or zero.
Similar commit:
https://patchwork.kernel.org/patch/11746067
Potential bug report:
https://groups.google.com/g/syzkaller-bugs/c/9ubBXKeKXf4/m/k-QXy4UgAAAJ
Signed-off-by: Zhang Shurong <[email protected]>
Cc: [email protected]
Signed-off-by: Helge Deller <[email protected]>
|
|
The series 2022/2023 reports slightly longer vendor/product strings
and shares USB ids. Technically the reply size is the USB HID packet
size (64 bytes) but all the supported commands do not use more than 8
bytes and replies reporting back strings do not use more then 24 bytes
(vendor and product are in one string in the newer devices now). The
rest of the reply is always filled with '\0'. Also update comments
and documentation accordingly.
Signed-off-by: Wilken Gottwalt <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Guenter Roeck <[email protected]>
|
|
In the absence of other debug facilities dumping user code around the
unhandled exception address may help debugging the issue.
Signed-off-by: Max Filippov <[email protected]>
|
|
Commit a3eb95484f27 ("spi: dt-bindings: atmel,at91rm9200-spi: add sam9x7
compatible") adding sam9x7 compatible did not make any sense as it added
new compatible into middle of existing compatible list. The intention
was probably to add new set of compatibles with sam9x7 as first one.
Fixes: a3eb95484f27 ("spi: dt-bindings: atmel,at91rm9200-spi: add sam9x7 compatible")
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Link: https://lore.kernel.org/r/Message-Id: <[email protected]>
Signed-off-by: Mark Brown <[email protected]>
|
|
Commit d4313a68ec91 ("fbdev/media: Use GPIO descriptors for VIA GPIO")
moves via-gpio.h from include/linux to drivers/video/fbdev/via, but misses
to adjust the file entry for the VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER
DRIVER section.
Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about a
broken reference.
Remove the file entry in VIA UNICHROME(PRO)/CHROME9 FRAMEBUFFER DRIVER, as
the new location of the header is already covered by the file entry
drivers/video/fbdev/via/.
Signed-off-by: Lukas Bulwahn <[email protected]>
Fixes: d4313a68ec91 ("fbdev/media: Use GPIO descriptors for VIA GPIO")
Signed-off-by: Helge Deller <[email protected]>
|
|
Petr Machata says:
====================
mlxsw: Maintain candidate RIFs
The mlxsw driver currently makes the assumption that the user applies
configuration in a bottom-up manner. Thus netdevices need to be added to
the bridge before IP addresses are configured on that bridge or SVI added
on top of it. Enslaving a netdevice to another netdevice that already has
uppers is in fact forbidden by mlxsw for this reason. Despite this safety,
it is rather easy to get into situations where the offloaded configuration
is just plain wrong.
As an example, take a front panel port, configure an IP address: it gets a
RIF. Now enslave the port to the bridge, and the RIF is gone. Remove the
port from the bridge again, but the RIF never comes back. There is a number
of similar situations, where changing the configuration there and back
utterly breaks the offload.
The situation is going to be made better by implementing a range of replays
and post-hoc offloads.
This patch set lays the ground for replay of next hops. The particular
issue that it deals with is that currently, driver-specific bookkeeping for
next hops is hooked off RIF objects, which come and go across the lifetime
of a netdevice. We would rather keep these objects at an entity that
mirrors the lifetime of the netdevice itself. That way they are at hand and
can be offloaded when a RIF is eventually created.
To that end, with this patchset, mlxsw keeps a hash table of CRIFs:
candidate RIFs, persistent handles for netdevices that mlxsw deems
potentially interesting. The lifetime of a CRIF matches that of the
underlying netdevice, and thus a RIF can always assume a CRIF exists. A
CRIF is where next hops are kept, and when RIF is created, these next hops
can be easily offloaded. (Previously only the next hops created after the
RIF was created were offloaded.)
- Patches #1 and #2 are minor adjustments.
- In patches #3 and #4, add CRIF bookkeeping.
- In patch #5, link CRIFs to RIFs such that given a netdevice-backed RIF,
the corresponding CRIF is easy to look up.
- Patch #6 is a clean-up allowed by the previous patches
- Patches #7 and #8 move next hop tracking to CRIFs
No observable effects are intended as of yet. This will be useful once
there is support for RIF creation for netdevices that become mlxsw uppers,
which will come in following patch sets.
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Move the list of next hops from struct mlxsw_sp_rif to mlxsw_sp_crif. The
reason is that eventually, next hops for mlxsw uppers should be offloaded
and unoffloaded on demand as a netdevice becomes an upper, or stops being
one. Currently, next hops are tracked at RIFs, but RIFs do not exist when a
netdevice is not an mlxsw uppers. CRIFs are kept track of throughout the
netdevice lifetime.
Correspondingly, track at each next hop not its RIF, but its CRIF (from
which a RIF can always be deduced).
Note that now that next hops are tracked at a CRIF, it is not necessary to
move each over to a new RIF when it is necessary to edit a RIF. Therefore
drop mlxsw_sp_nexthop_rif_migrate() and have mlxsw_sp_rif_migrate_destroy()
call mlxsw_sp_nexthop_rif_update() directly.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/e7c1c0a7dd13883b0f09aeda12c4fcf4d63a70e3.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Nexthop finalization consists of two steps: the part where the offload is
removed, because the backing RIF is now gone; and the part where the
association to the RIF is severed.
Extract from mlxsw_sp_nexthop_type_fini() a helper that covers the
unoffloading part, mlxsw_sp_nexthop_type_rif_gone(), so that it can later
be called independently.
Note that this swaps around the ordering of mlxsw_sp_nexthop_ipip_fini()
vs. mlxsw_sp_nexthop_rif_fini(). The current ordering is more of a
historical happenstance than a conscious decision. The two cleanups do not
depend on each other, and this change should have no observable effects.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/7134559534c5f5c4807c3a1569fae56f8887e763.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
A previous patch added a pointer to loopback CRIF to the router data
structure. That makes the loopback RIF index redundant, as everything
necessary can be derived from the CRIF. Drop the field and adjust the code
accordingly.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/8637bf959bc5b6c9d5184b9bd8a0cd53c5132835.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
When a RIF is about to be created, the registration of the netdevice that
it should be associated with must have been seen in the past, and a CRIF
created. Therefore make this a hard requirement by looking up the CRIF
during RIF creation, and complaining loudly when there isn't one.
This then allows to keep a link between a RIF and its corresponding
CRIF (and back, as the relationship is one-to-at-most-one), which do.
The CRIF will later be useful as the objects tracked there will be
offloaded lazily as a result of RIF creation.
CRIFs are created when an "interesting" netdevice is registered, and
destroyed after such device is unregistered. CRIFs are supposed to already
exist when a RIF creation request arises, and exist at least as long as
that RIF exists. This makes for a simple invariant: it is always safe to
dereference CRIF pointer from "its" RIF.
To guarantee this, CRIFs cannot be removed immediately when the UNREGISTER
event is delivered. The reason is that if a RIF's netdevices has an IPv6
address, removal of this address is notified in an atomic block. To remove
the RIF, the IPv6 removal handler schedules a work item. It must be safe
for this work item to access the associated CRIF as well.
Thus when a netdevice that backs the CRIF is removed, if it still has a
RIF, do not actually free the CRIF, only toggle its can_destroy flag, which
this patch adds. Later on, mlxsw_sp_rif_destroy() collects the CRIF.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/68c8e33afa6b8c03c431b435e1685ffdff752e63.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
CRIFs are generally not maintained for loopback RIFs. However, the RIF for
the default VRF is used for offloading of blackhole nexthops. Nexthops
expect to have a valid CRIF. Therefore in this patch, add code to maintain
CRIF for the loopback RIF as well.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/7f2b2fcc98770167ed1254a904c3f7f585ba43f0.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
CRIFs are objects that mlxsw maintains for netdevices that may not have an
associated RIF (i.e. they may not have been instantiated in the ASIC), but
if indeed they do not, it is quite possible they will in the future. These
netdevices are candidate RIFs, hence CRIFs. Netdevices for which CRIFs are
created include e.g. bridges, LAGs, or front panel ports. The idea is that
next hops would be kept at CRIFs, not RIFs, and thus it would be easier to
offload and unoffload the entities that have been added before the RIF was
created.
In this patch, add the code for low-level CRIF maintenance: create and
destroy, and keep in a table keyed by the netdevice pointer for easy
recall.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/186d44e399c475159da20689f2c540719f2d1ed0.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The current function, mlxsw_sp_router_ul_rif_get(), is a wrapper around the
function mentioned in the subject. As such it forms an external interface
of the router code.
In future patches we will want to maintain connection between RIFs and the
CRIFs (introduced in the next patch) that back them. That will not hold
for the VRF-based loopback netdevices, so the whole CRIF business can be
kept hidden from the rest of mlxsw.
But for the main VRF loopback RIF we do want to keep the RIF-CRIF
connection, because that RIF is used for blackhole next hops, and the next
hop code can be kept simpler for assuming rif->crif is valid.
Hence, instead, call mlxsw_sp_ul_rif_get() to create the main VRF loopback
RIF. This being an internal function will take the CRIF argument anyway.
Furthermore, the function does not lock, which is not necessary at this
point in code yet.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/7a39a011a02a84164cd7f5da7985ec5b2ae01ba5.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
The extack will be handy in later patches.
Signed-off-by: Petr Machata <[email protected]>
Reviewed-by: Danielle Ratson <[email protected]>
Link: https://lore.kernel.org/r/e87ba300121010d580b80a281877573a7b1377ca.1687438411.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
Drivers must not assume in their ndo_start_xmit() that
skbs have their mac_header set. skb->data is all what is needed.
bonding seems to be one of the last offender as caught by syzbot:
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 skb_mac_offset include/linux/skbuff.h:2913 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 __bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
WARNING: CPU: 1 PID: 12155 at include/linux/skbuff.h:2907 bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Modules linked in:
CPU: 1 PID: 12155 Comm: syz-executor.3 Not tainted 6.1.30-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/25/2023
RIP: 0010:skb_mac_header include/linux/skbuff.h:2907 [inline]
RIP: 0010:skb_mac_offset include/linux/skbuff.h:2913 [inline]
RIP: 0010:bond_xmit_hash drivers/net/bonding/bond_main.c:4170 [inline]
RIP: 0010:bond_xmit_3ad_xor_slave_get drivers/net/bonding/bond_main.c:5149 [inline]
RIP: 0010:bond_3ad_xor_xmit drivers/net/bonding/bond_main.c:5186 [inline]
RIP: 0010:__bond_start_xmit drivers/net/bonding/bond_main.c:5442 [inline]
RIP: 0010:bond_start_xmit+0x14ab/0x19d0 drivers/net/bonding/bond_main.c:5470
Code: 8b 7c 24 30 e8 76 dd 1a 01 48 85 c0 74 0d 48 89 c3 e8 29 67 2e fe e9 15 ef ff ff e8 1f 67 2e fe e9 10 ef ff ff e8 15 67 2e fe <0f> 0b e9 45 f8 ff ff e8 09 67 2e fe e9 dc fa ff ff e8 ff 66 2e fe
RSP: 0018:ffffc90002fff6e0 EFLAGS: 00010283
RAX: ffffffff835874db RBX: 000000000000ffff RCX: 0000000000040000
RDX: ffffc90004dcf000 RSI: 00000000000000b5 RDI: 00000000000000b6
RBP: ffffc90002fff8b8 R08: ffffffff83586d16 R09: ffffffff83586584
R10: 0000000000000007 R11: ffff8881599fc780 R12: ffff88811b6a7b7e
R13: 1ffff110236d4f6f R14: ffff88811b6a7ac0 R15: 1ffff110236d4f76
FS: 00007f2e9eb47700(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2e421000 CR3: 000000010e6d4000 CR4: 00000000003526e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
[<ffffffff8471a49f>] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
[<ffffffff8471a49f>] __dev_direct_xmit+0x4ef/0x850 net/core/dev.c:4380
[<ffffffff851d845b>] dev_direct_xmit include/linux/netdevice.h:3043 [inline]
[<ffffffff851d845b>] packet_direct_xmit+0x18b/0x300 net/packet/af_packet.c:284
[<ffffffff851c7472>] packet_snd net/packet/af_packet.c:3112 [inline]
[<ffffffff851c7472>] packet_sendmsg+0x4a22/0x64d0 net/packet/af_packet.c:3143
[<ffffffff8467a4b2>] sock_sendmsg_nosec net/socket.c:716 [inline]
[<ffffffff8467a4b2>] sock_sendmsg net/socket.c:736 [inline]
[<ffffffff8467a4b2>] __sys_sendto+0x472/0x5f0 net/socket.c:2139
[<ffffffff8467a715>] __do_sys_sendto net/socket.c:2151 [inline]
[<ffffffff8467a715>] __se_sys_sendto net/socket.c:2147 [inline]
[<ffffffff8467a715>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
[<ffffffff8553071f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff8553071f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
[<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 7b8fc0103bb5 ("bonding: add a vlan+srcmac tx hashing option")
Reported-by: syzbot <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Jarod Wilson <[email protected]>
Cc: Moshe Tal <[email protected]>
Cc: Jussi Maki <[email protected]>
Cc: Jay Vosburgh <[email protected]>
Cc: Andy Gospodarek <[email protected]>
Cc: Vladimir Oltean <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|
|
With support for Ethernet PHY LEDs having been added, while
unregistering a MDIO bus and its child device liks PHYs there may be
"late" accesses to the MDIO bus. One typical use case is setting the PHY
LEDs brightness to OFF for instance.
We need to ensure that the MDIO bus controller remains entirely
functional since it runs off the main GENET adapter clock.
Cc: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Fixes: 9a4e79697009 ("net: bcmgenet: utilize generic Broadcom UniMAC MDIO controller driver")
Signed-off-by: Florian Fainelli <[email protected]>
Reviewed-by: Andrew Lunn <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
|