aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-03-24netfilter: nf_socket: Fix out of bounds access in nf_sk_lookup_slow_v{4,6}Subash Abhinov Kasiviswanathan2-4/+8
skb_header_pointer will copy data into a buffer if data is non linear, otherwise it will return a pointer in the linear section of the data. nf_sk_lookup_slow_v{4,6} always copies data of size udphdr but later accesses memory within the size of tcphdr (th->doff) in case of TCP packets. This causes a crash when running with KASAN with the following call stack - BUG: KASAN: stack-out-of-bounds in xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178 Read of size 2 at addr ffffffe3d417a87c by task syz-executor/28971 CPU: 2 PID: 28971 Comm: syz-executor Tainted: G B W O 4.9.65+ #1 Call trace: [<ffffff9467e8d390>] dump_backtrace+0x0/0x428 arch/arm64/kernel/traps.c:76 [<ffffff9467e8d7e0>] show_stack+0x28/0x38 arch/arm64/kernel/traps.c:226 [<ffffff946842d9b8>] __dump_stack lib/dump_stack.c:15 [inline] [<ffffff946842d9b8>] dump_stack+0xd4/0x124 lib/dump_stack.c:51 [<ffffff946811d4b0>] print_address_description+0x68/0x258 mm/kasan/report.c:248 [<ffffff946811d8c8>] kasan_report_error mm/kasan/report.c:347 [inline] [<ffffff946811d8c8>] kasan_report.part.2+0x228/0x2f0 mm/kasan/report.c:371 [<ffffff946811df44>] kasan_report+0x5c/0x70 mm/kasan/report.c:372 [<ffffff946811bebc>] check_memory_region_inline mm/kasan/kasan.c:308 [inline] [<ffffff946811bebc>] __asan_load2+0x84/0x98 mm/kasan/kasan.c:739 [<ffffff94694d6f04>] __tcp_hdrlen include/linux/tcp.h:35 [inline] [<ffffff94694d6f04>] xt_socket_lookup_slow_v4+0x524/0x718 net/netfilter/xt_socket.c:178 Fix this by copying data into appropriate size headers based on protocol. Fixes: a583636a83ea ("inet: refactor inet[6]_lookup functions to take skb") Signed-off-by: Tejaswi Tanikella <[email protected]> Signed-off-by: Subash Abhinov Kasiviswanathan <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-22netfilter: nf_tables: do not hold reference on netdevice from preparation phasePablo Neira Ayuso1-15/+4
The netfilter netdevice event handler hold the nfnl_lock mutex, this avoids races with a device going away while such device is being attached to hooks from the netlink control plane. Therefore, either control plane bails out with ENOENT or netdevice event path waits until the hook that is attached to net_device is registered. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-22netfilter: nf_tables: cache device name in flowtable objectPablo Neira Ayuso2-6/+13
Devices going away have to grab the nfnl_lock from the netdev event path to avoid races with control plane updates. However, netlink dumps in netfilter do not hold nfnl_lock mutex. Cache the device name into the objects to avoid an use-after-free situation for a device that is going away. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-22netfilter: drop template ct when conntrack is skipped.Paolo Abeni1-1/+13
The ipv4 nf_ct code currently skips the nf_conntrak_in() call for fragmented packets. As a results later matches/target can end up manipulating template ct entry instead of 'real' ones. Exploiting the above, syzbot found a way to trigger the following splat: WARNING: CPU: 1 PID: 4242 at net/netfilter/xt_cluster.c:55 xt_cluster_mt+0x6c1/0x840 net/netfilter/xt_cluster.c:127 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4242 Comm: syzkaller027971 Not tainted 4.16.0-rc2+ #243 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 panic+0x1e4/0x41c kernel/panic.c:183 __warn+0x1dc/0x200 kernel/panic.c:547 report_bug+0x211/0x2d0 lib/bug.c:184 fixup_bug.part.11+0x37/0x80 arch/x86/kernel/traps.c:178 fixup_bug arch/x86/kernel/traps.c:247 [inline] do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315 invalid_op+0x58/0x80 arch/x86/entry/entry_64.S:957 RIP: 0010:xt_cluster_hash net/netfilter/xt_cluster.c:55 [inline] RIP: 0010:xt_cluster_mt+0x6c1/0x840 net/netfilter/xt_cluster.c:127 RSP: 0018:ffff8801d2f6f2d0 EFLAGS: 00010293 RAX: ffff8801af700540 RBX: 0000000000000000 RCX: ffffffff84a2d1e1 RDX: 0000000000000000 RSI: ffff8801d2f6f478 RDI: ffff8801cafd336a RBP: ffff8801d2f6f2e8 R08: 0000000000000000 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801b03b3d18 R13: ffff8801cafd3300 R14: dffffc0000000000 R15: ffff8801d2f6f478 ipt_do_table+0xa91/0x19b0 net/ipv4/netfilter/ip_tables.c:296 iptable_filter_hook+0x65/0x80 net/ipv4/netfilter/iptable_filter.c:41 nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline] nf_hook_slow+0xba/0x1a0 net/netfilter/core.c:483 nf_hook include/linux/netfilter.h:243 [inline] NF_HOOK include/linux/netfilter.h:286 [inline] raw_send_hdrinc.isra.17+0xf39/0x1880 net/ipv4/raw.c:432 raw_sendmsg+0x14cd/0x26b0 net/ipv4/raw.c:669 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763 sock_sendmsg_nosec net/socket.c:629 [inline] sock_sendmsg+0xca/0x110 net/socket.c:639 SYSC_sendto+0x361/0x5c0 net/socket.c:1748 SyS_sendto+0x40/0x50 net/socket.c:1716 do_syscall_64+0x280/0x940 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x42/0xb7 RIP: 0033:0x441b49 RSP: 002b:00007ffff5ca8b18 EFLAGS: 00000216 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000441b49 RDX: 0000000000000030 RSI: 0000000020ff7000 RDI: 0000000000000003 RBP: 00000000006cc018 R08: 000000002066354c R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000216 R12: 0000000000403470 R13: 0000000000403500 R14: 0000000000000000 R15: 0000000000000000 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Instead of adding checks for template ct on every target/match manipulating skb->_nfct, simply drop the template ct when skipping nf_conntrack_in(). Fixes: 7b4fdf77a450ec ("netfilter: don't track fragmented packets") Reported-and-tested-by: [email protected] Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-20netfilter: nf_tables: add missing netlink attrs to policiesFlorian Westphal1-0/+3
Fixes: 8aeff920dcc9 ("netfilter: nf_tables: add stateful object reference to set elements") Fixes: f25ad2e907f1 ("netfilter: nf_tables: prepare for expressions associated to set elements") Fixes: 1a94e38d254b ("netfilter: nf_tables: add NFTA_RULE_ID attribute") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-20netfilter: nf_tables: permit second nat hook if colliding hook is going awayFlorian Westphal1-1/+63
Sergei Trofimovich reported that restoring an nft ruleset doesn't work anymore unless old rule content is flushed first. The problem stems from a recent change designed to prevent multiple nat hooks at the same hook point locations and nftables transaction model. A 'flush ruleset' won't take effect until the entire transaction has completed. So, if one has a nft.rules file that contains a 'flush ruleset', followed by a nat hook register request, then 'nft -f file' will work, but running 'nft -f file' again will fail with -EBUSY. Reason is that nftables will place the flush/removal requests in the transaction list, but it will not act on the removal until after all new rules are in place. The netfilter core will therefore get request to register a new nat hook before the old one is removed -- this now fails as the netfilter core can't know the existing hook is staged for removal. To fix this, we can search the transaction log when a hook collision is detected. The collision is okay if 1. there is a delete request pending for the nat hook that is already registered. 2. there is no second add request for a matching nat hook. This is required to only apply the exception once. Fixes: f92b40a8b2645 ("netfilter: core: only allow one nat hook per hook point") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-20netfilter: nf_tables: meter: pick a set backend that supports updatesFlorian Westphal2-2/+5
in nftables, 'meter' can be used to instantiate a hash-table at run time: rule add filter forward iif "internal" meter hostacct { ip saddr counter} nft list meter ip filter hostacct table ip filter { meter hostacct { type ipv4_addr elements = { 192.168.0.1 : counter packets 8 bytes 2672, .. because elemets get added on the fly, the kernel must chose a set backend type that implements the ->update() function, otherwise rule insertion fails with EOPNOTSUPP. Therefore, skip set types that lack ->update, and also make sure we do not discard a (bad) candidate when we did yet find any candidate at all. This could happen when userspace prefers low memory footprint -- the set implementation currently checked might not be a fit at all. Make sure we pick it anyway (!bops). In case next candidate is a better fix, it will be chosen instead. But in case nothing else is found we at least have a non-ideal match rather than no match at all. Fixes: 6c03ae210ce3 ("netfilter: nft_set_hash: add non-resizable hashtable implementation") Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-03-19Merge branch 'phy-relax-error-checking'David S. Miller2-4/+12
Grygorii Strashko says: ==================== net: phy: relax error checking when creating sysfs link netdev->phydev Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per one netdevice, as result such drivers will produce warning during system boot and fail to connect second phy to netdevice when PHYLIB framework will try to create sysfs link netdev->phydev for second PHY in phy_attach_direct(), because sysfs link with the same name has been created already for the first PHY. As result, second CPSW external port will became unusable. This regression was introduced by commits: 5568363f0cb3 ("net: phy: Create sysfs reciprocal links for attached_dev/phydev" a3995460491d ("net: phy: Relax error checking on sysfs_create_link()" Patch 1: exports sysfs_create_link_nowarn() function as preparation for Patch 2. Patch 2: relaxes error checking when PHYLIB framework is creating sysfs link netdev->phydev in phy_attach_direct(), suppresses warning by using sysfs_create_link_nowarn() and adds error message instead, so links creation failure is not fatal any more and system can continue working, which fixes TI CPSW issue and makes boot logs accessible in case of NFS boot, for example. This can be stable material 4.13+. Changes in v2: - commit messages updated. v1: https://patchwork.ozlabs.org/cover/886058/ ==================== Signed-off-by: David S. Miller <[email protected]>
2018-03-19net: phy: relax error checking when creating sysfs link netdev->phydevGrygorii Strashko1-4/+11
Some ethernet drivers (like TI CPSW) may connect and manage >1 Net PHYs per one netdevice, as result such drivers will produce warning during system boot and fail to connect second phy to netdevice when PHYLIB framework will try to create sysfs link netdev->phydev for second PHY in phy_attach_direct(), because sysfs link with the same name has been created already for the first PHY. As result, second CPSW external port will became unusable. Fix it by relaxing error checking when PHYLIB framework is creating sysfs link netdev->phydev in phy_attach_direct(), suppressing warning by using sysfs_create_link_nowarn() and adding error message instead. After this change links (phy->netdev and netdev->phy) creation failure is not fatal any more and system can continue working, which fixes TI CPSW issue. Cc: Florian Fainelli <[email protected]> Cc: Andrew Lunn <[email protected]> Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()") Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-19sysfs: symlink: export sysfs_create_link_nowarn()Grygorii Strashko1-0/+1
The sysfs_create_link_nowarn() is going to be used in phylib framework in subsequent patch which can be built as module. Hence, export sysfs_create_link_nowarn() to avoid build errors. Cc: Florian Fainelli <[email protected]> Cc: Andrew Lunn <[email protected]> Fixes: a3995460491d ("net: phy: Relax error checking on sysfs_create_link()") Signed-off-by: Grygorii Strashko <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-18net: fec: Fix unbalanced PM runtime callsFlorian Fainelli1-0/+2
When unbinding/removing the driver, we will run into the following warnings: [ 259.655198] fec 400d1000.ethernet: 400d1000.ethernet supply phy not found, using dummy regulator [ 259.665065] fec 400d1000.ethernet: Unbalanced pm_runtime_enable! [ 259.672770] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Invalid MAC address: 00:00:00:00:00:00 [ 259.683062] fec 400d1000.ethernet (unnamed net_device) (uninitialized): Using random MAC address: f2:3e:93:b7:29:c1 [ 259.696239] libphy: fec_enet_mii_bus: probed Avoid these warnings by balancing the runtime PM calls during fec_drv_remove(). Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17vmxnet3: use correct flag to indicate LRO featureRonak Doshi2-4/+4
'Commit 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)")' introduced a flag "lro" in structure vmxnet3_adapter which is used to indicate whether LRO is enabled or not. However, the patch did not set the flag and hence it was never exercised. So, when LRO is enabled, it resulted in poor TCP performance due to delayed acks. This issue is seen with packets which are larger than the mss getting a delayed ack rather than an immediate ack, thus resulting in high latency. This patch removes the lro flag and directly uses device features against NETIF_F_LRO to check if lro is enabled. Fixes: 45dac1d6ea04 ("vmxnet3: Changes for vmxnet3 adapter version 2 (fwd)") Reported-by: Rachel Lunnon <[email protected]> Signed-off-by: Ronak Doshi <[email protected]> Acked-by: Shrikrishna Khare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17vmxnet3: avoid xmit reset due to a race in vmxnet3Ronak Doshi2-7/+10
The field txNumDeferred is used by the driver to keep track of the number of packets it has pushed to the emulation. The driver increments it on pushing the packet to the emulation and the emulation resets it to 0 at the end of the transmit. There is a possibility of a race either when (a) ESX is under heavy load or (b) workload inside VM is of low packet rate. This race results in xmit hangs when network coalescing is disabled. This change creates a local copy of txNumDeferred and uses it to perform ring arithmetic. Reported-by: Noriho Tanaka <[email protected]> Signed-off-by: Ronak Doshi <[email protected]> Acked-by: Shrikrishna Khare <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17Merge branch 'tcf_foo_init-NULL-deref'David S. Miller5-8/+13
Davide Caratti says: ==================== net/sched: fix NULL dereference in the error path of .init() with several TC actions it's possible to see NULL pointer dereference, when the .init() function calls tcf_idr_alloc(), fails at some point and then calls tcf_idr_release(): this series fixes all them introducing non-NULL tests in the .cleanup() function. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-03-17net/sched: fix NULL dereference on the error path of tcf_skbmod_init()Davide Caratti1-1/+2
when the following command # tc action replace action skbmod swap mac index 100 is run for the first time, and tcf_skbmod_init() fails to allocate struct tcf_skbmod_params, tcf_skbmod_cleanup() calls kfree_rcu(NULL), thus causing the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 IP: __call_rcu+0x23/0x2b0 PGD 8000000034057067 P4D 8000000034057067 PUD 74937067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_skbmod(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec crct10dif_pclmul mbcache jbd2 crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep pcbc snd_seq snd_seq_device snd_pcm aesni_intel snd_timer crypto_simd glue_helper snd cryptd virtio_balloon joydev soundcore pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_net virtio_blk ata_piix libata crc32c_intel virtio_pci serio_raw virtio_ring virtio i2c_core floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_skbmod] CPU: 3 PID: 3144 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__call_rcu+0x23/0x2b0 RSP: 0018:ffffbd2e403e7798 EFLAGS: 00010246 RAX: ffffffffc0872080 RBX: ffff981d34bff780 RCX: 00000000ffffffff RDX: ffffffff922a5f00 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000001 R09: 000000000000021f R10: 000000003d003000 R11: 0000000000aaaaaa R12: 0000000000000000 R13: ffffffff922a5f00 R14: 0000000000000001 R15: ffff981d3b698c2c FS: 00007f3678292740(0000) GS:ffff981d3fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000007c57a006 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_skbmod_init+0x1d1/0x210 [act_skbmod] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f36776a3ba0 RSP: 002b:00007fff4703b618 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fff4703b740 RCX: 00007f36776a3ba0 RDX: 0000000000000000 RSI: 00007fff4703b690 RDI: 0000000000000003 RBP: 000000005aaaba36 R08: 0000000000000002 R09: 0000000000000000 R10: 00007fff4703b0a0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff4703b754 R14: 0000000000000001 R15: 0000000000669f60 Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 RIP: __call_rcu+0x23/0x2b0 RSP: ffffbd2e403e7798 CR2: 0000000000000008 Fix it in tcf_skbmod_cleanup(), ensuring that kfree_rcu(p, ...) is called only when p is not NULL. Fixes: 86da71b57383 ("net_sched: Introduce skbmod action") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net/sched: fix NULL dereference in the error path of tcf_sample_init()Davide Caratti1-1/+2
when the following command # tc action add action sample rate 100 group 100 index 100 is run for the first time, and psample_group_get(100) fails to create a new group, tcf_sample_cleanup() calls psample_group_put(NULL), thus causing the following error: BUG: unable to handle kernel NULL pointer dereference at 000000000000001c IP: psample_group_put+0x15/0x71 [psample] PGD 8000000075775067 P4D 8000000075775067 PUD 7453c067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_sample(E) psample ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core mbcache jbd2 crct10dif_pclmul snd_hwdep crc32_pclmul snd_seq ghash_clmulni_intel pcbc snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev pcspkr i2c_piix4 soundcore virtio_balloon nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_net ata_piix virtio_console virtio_blk libata serio_raw crc32c_intel virtio_pci i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_tunnel_key] CPU: 2 PID: 5740 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:psample_group_put+0x15/0x71 [psample] RSP: 0018:ffffb8a80032f7d0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000024 RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffffffffc06d93c0 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044 R10: 00000000bd003000 R11: ffff979fba04aa59 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ffff979fbba3f22c FS: 00007f7638112740(0000) GS:ffff979fbfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001c CR3: 00000000734ea001 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_sample_init+0x125/0x1d0 [act_sample] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f7637523ba0 RSP: 002b:00007fff0473ef58 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fff0473f080 RCX: 00007f7637523ba0 RDX: 0000000000000000 RSI: 00007fff0473efd0 RDI: 0000000000000003 RBP: 000000005aaaac80 R08: 0000000000000002 R09: 0000000000000000 R10: 00007fff0473e9e0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fff0473f094 R14: 0000000000000001 R15: 0000000000669f60 Code: be 02 00 00 00 48 89 df e8 a9 fe ff ff e9 7c ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb 48 c7 c7 c0 93 6d c0 e8 db 20 8c ef <83> 6b 1c 01 74 10 48 c7 c7 c0 93 6d c0 ff 14 25 e8 83 83 b0 5b RIP: psample_group_put+0x15/0x71 [psample] RSP: ffffb8a80032f7d0 CR2: 000000000000001c Fix it in tcf_sample_cleanup(), ensuring that calls to psample_group_put(p) are done only when p is not NULL. Fixes: cadb9c9fdbc6 ("net/sched: act_sample: Fix error path in init") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net/sched: fix NULL dereference in the error path of tunnel_key_init()Davide Caratti1-4/+5
when the following command # tc action add action tunnel_key unset index 100 is run for the first time, and tunnel_key_init() fails to allocate struct tcf_tunnel_key_params, tunnel_key_release() dereferences NULL pointers. This causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: tunnel_key_release+0xd/0x40 [act_tunnel_key] PGD 8000000033787067 P4D 8000000033787067 PUD 74646067 PMD 0 Oops: 0000 [#1] SMP PTI Modules linked in: act_tunnel_key(E) act_csum ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul snd_hda_codec_generic ghash_clmulni_intel snd_hda_intel pcbc snd_hda_codec snd_hda_core snd_hwdep snd_seq aesni_intel snd_seq_device crypto_simd glue_helper snd_pcm cryptd joydev snd_timer pcspkr virtio_balloon snd i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_net virtio_blk drm virtio_console crc32c_intel ata_piix serio_raw i2c_core virtio_pci libata virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod CPU: 2 PID: 3101 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:tunnel_key_release+0xd/0x40 [act_tunnel_key] RSP: 0018:ffffba46803b7768 EFLAGS: 00010286 RAX: ffffffffc09010a0 RBX: 0000000000000000 RCX: 0000000000000024 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff99ee336d7480 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044 R10: 0000000000000220 R11: ffff99ee79d73131 R12: 0000000000000000 R13: ffff99ee32d67610 R14: ffff99ee7671dc38 R15: 00000000fffffff4 FS: 00007febcb2cd740(0000) GS:ffff99ee7fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 000000007c8e4005 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tunnel_key_init+0xd9/0x460 [act_tunnel_key] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7febca6deba0 RSP: 002b:00007ffe7b0dd128 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffe7b0dd250 RCX: 00007febca6deba0 RDX: 0000000000000000 RSI: 00007ffe7b0dd1a0 RDI: 0000000000000003 RBP: 000000005aaa90cb R08: 0000000000000002 R09: 0000000000000000 R10: 00007ffe7b0dcba0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffe7b0dd264 R14: 0000000000000001 R15: 0000000000669f60 Code: 44 00 00 8b 0d b5 23 00 00 48 8b 87 48 10 00 00 48 8b 3c c8 e9 a5 e5 d8 c3 0f 1f 44 00 00 0f 1f 44 00 00 53 48 8b 9f b0 00 00 00 <83> 7b 10 01 74 0b 48 89 df 31 f6 5b e9 f2 fa 7f c3 48 8b 7b 18 RIP: tunnel_key_release+0xd/0x40 [act_tunnel_key] RSP: ffffba46803b7768 CR2: 0000000000000010 Fix this in tunnel_key_release(), ensuring 'param' is not NULL before dereferencing it. Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net/sched: fix NULL dereference in the error path of tcf_csum_init()Davide Caratti1-1/+2
when the following command # tc action add action csum udp continue index 100 is run for the first time, and tcf_csum_init() fails allocating struct tcf_csum, tcf_csum_cleanup() calls kfree_rcu(NULL,...). This causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: __call_rcu+0x23/0x2b0 PGD 80000000740b4067 P4D 80000000740b4067 PUD 32e7f067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_csum(E) act_vlan ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer aesni_intel crypto_simd glue_helper cryptd snd joydev pcspkr virtio_balloon i2c_piix4 soundcore nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console ata_piix crc32c_intel libata virtio_pci serio_raw i2c_core virtio_ring virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan] CPU: 2 PID: 5763 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__call_rcu+0x23/0x2b0 RSP: 0018:ffffb275803e77c0 EFLAGS: 00010246 RAX: ffffffffc057b080 RBX: ffff9674bc6f5240 RCX: 00000000ffffffff RDX: ffffffff928a5f00 RSI: 0000000000000008 RDI: 0000000000000008 RBP: 0000000000000008 R08: 0000000000000001 R09: 0000000000000044 R10: 0000000000000220 R11: ffff9674b9ab4821 R12: 0000000000000000 R13: ffffffff928a5f00 R14: 0000000000000000 R15: 0000000000000001 FS: 00007fa6368d8740(0000) GS:ffff9674bfd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000010 CR3: 0000000073dec001 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_csum_init+0xfb/0x180 [act_csum] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7fa635ce9ba0 RSP: 002b:00007ffc185b0fc8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007ffc185b10f0 RCX: 00007fa635ce9ba0 RDX: 0000000000000000 RSI: 00007ffc185b1040 RDI: 0000000000000003 RBP: 000000005aaa85e0 R08: 0000000000000002 R09: 0000000000000000 R10: 00007ffc185b0a20 R11: 0000000000000246 R12: 0000000000000000 R13: 00007ffc185b1104 R14: 0000000000000001 R15: 0000000000669f60 Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 RIP: __call_rcu+0x23/0x2b0 RSP: ffffb275803e77c0 CR2: 0000000000000010 fix this in tcf_csum_cleanup(), ensuring that kfree_rcu(param, ...) is called only when param is not NULL. Fixes: 9c5f69bbd75a ("net/sched: act_csum: don't use spinlock in the fast path") Signed-off-by: Davide Caratti <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net/sched: fix NULL dereference in the error path of tcf_vlan_init()Davide Caratti1-1/+2
when the following command # tc actions replace action vlan pop index 100 is run for the first time, and tcf_vlan_init() fails allocating struct tcf_vlan_params, tcf_vlan_cleanup() calls kfree_rcu(NULL, ...). This causes the following error: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 IP: __call_rcu+0x23/0x2b0 PGD 80000000760a2067 P4D 80000000760a2067 PUD 742c1067 PMD 0 Oops: 0002 [#1] SMP PTI Modules linked in: act_vlan(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 snd_hda_codec_generic snd_hda_intel mbcache snd_hda_codec jbd2 snd_hda_core crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd snd_timer glue_helper snd cryptd joydev soundcore virtio_balloon pcspkr i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm virtio_console virtio_blk virtio_net ata_piix crc32c_intel libata virtio_pci i2c_core virtio_ring serio_raw virtio floppy dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_vlan] CPU: 3 PID: 3119 Comm: tc Tainted: G E 4.16.0-rc4.act_vlan.orig+ #403 Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011 RIP: 0010:__call_rcu+0x23/0x2b0 RSP: 0018:ffffaac3005fb798 EFLAGS: 00010246 RAX: ffffffffc0704080 RBX: ffff97f2b4bbe900 RCX: 00000000ffffffff RDX: ffffffffabca5f00 RSI: 0000000000000010 RDI: 0000000000000010 RBP: 0000000000000010 R08: 0000000000000001 R09: 0000000000000044 R10: 00000000fd003000 R11: ffff97f2faab5b91 R12: 0000000000000000 R13: ffffffffabca5f00 R14: ffff97f2fb80202c R15: 00000000fffffff4 FS: 00007f68f75b4740(0000) GS:ffff97f2ffd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000018 CR3: 0000000072b52001 CR4: 00000000001606e0 Call Trace: __tcf_idr_release+0x79/0xf0 tcf_vlan_init+0x168/0x270 [act_vlan] tcf_action_init_1+0x2cc/0x430 tcf_action_init+0xd3/0x1b0 tc_ctl_action+0x18b/0x240 rtnetlink_rcv_msg+0x29c/0x310 ? _cond_resched+0x15/0x30 ? __kmalloc_node_track_caller+0x1b9/0x270 ? rtnl_calcit.isra.28+0x100/0x100 netlink_rcv_skb+0xd2/0x110 netlink_unicast+0x17c/0x230 netlink_sendmsg+0x2cd/0x3c0 sock_sendmsg+0x30/0x40 ___sys_sendmsg+0x27a/0x290 ? filemap_map_pages+0x34a/0x3a0 ? __handle_mm_fault+0xbfd/0xe20 __sys_sendmsg+0x51/0x90 do_syscall_64+0x6e/0x1a0 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x7f68f69c5ba0 RSP: 002b:00007fffd79c1118 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fffd79c1240 RCX: 00007f68f69c5ba0 RDX: 0000000000000000 RSI: 00007fffd79c1190 RDI: 0000000000000003 RBP: 000000005aaa708e R08: 0000000000000002 R09: 0000000000000000 R10: 00007fffd79c0ba0 R11: 0000000000000246 R12: 0000000000000000 R13: 00007fffd79c1254 R14: 0000000000000001 R15: 0000000000669f60 Code: 5d e9 42 da ff ff 66 90 0f 1f 44 00 00 41 57 41 56 41 55 49 89 d5 41 54 55 48 89 fd 53 48 83 ec 08 40 f6 c7 07 0f 85 19 02 00 00 <48> 89 75 08 48 c7 45 00 00 00 00 00 9c 58 0f 1f 44 00 00 49 89 RIP: __call_rcu+0x23/0x2b0 RSP: ffffaac3005fb798 CR2: 0000000000000018 fix this in tcf_vlan_cleanup(), ensuring that kfree_rcu(p, ...) is called only when p is not NULL. Fixes: 4c5b9d9642c8 ("act_vlan: VLAN action rewrite to use RCU lock/unlock and update") Acked-by: Jiri Pirko <[email protected]> Acked-by: Manish Kurup <[email protected]> Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net: ethernet: ti: cpsw: add check for in-band mode setting with RGMII PHY ↵SZ Lin (林上智)1-1/+2
interface According to AM335x TRM[1] 14.3.6.2, AM437x TRM[2] 15.3.6.2 and DRA7 TRM[3] 24.11.4.8.7.3.3, in-band mode in EXT_EN(bit18) register is only available when PHY is configured in RGMII mode with 10Mbps speed. It will cause some networking issues without RGMII mode, such as carrier sense errors and low throughput. TI also mentioned this issue in their forum[4]. This patch adds the check mechanism for PHY interface with RGMII interface type, the in-band mode can only be set in RGMII mode with 10Mbps speed. References: [1]: https://www.ti.com/lit/ug/spruh73p/spruh73p.pdf [2]: http://www.ti.com/lit/ug/spruhl7h/spruhl7h.pdf [3]: http://www.ti.com/lit/ug/spruic2b/spruic2b.pdf [4]: https://e2e.ti.com/support/arm/sitara_arm/f/791/p/640765/2392155 Suggested-by: Holsety Chen (陳憲輝) <[email protected]> Signed-off-by: SZ Lin (林上智) <[email protected]> Signed-off-by: Schuyler Patton <[email protected]> Reviewed-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net: hns: Fix ethtool private flagsMatthias Brugger4-4/+6
The driver implementation returns support for private flags, while no private flags are present. When asked for the number of private flags it returns the number of statistic flag names. Fix this by returning EOPNOTSUPP for not implemented ethtool flags. Signed-off-by: Matthias Brugger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17mlxsw: spectrum_buffers: Set a minimum quota for CPU port trafficIdo Schimmel1-6/+6
In commit 9ffcc3725f09 ("mlxsw: spectrum: Allow packets to be trapped from any PG") I fixed a problem where packets could not be trapped to the CPU due to exceeded shared buffer quotas. The mentioned commit explains the problem in detail. The problem was fixed by assigning a minimum quota for the CPU port and the traffic class used for scheduling traffic to the CPU. However, commit 117b0dad2d54 ("mlxsw: Create a different trap group list for each device") assigned different traffic classes to different packet types and rendered the fix useless. Fix the problem by assigning a minimum quota for the CPU port and all the traffic classes that are currently in use. Fixes: 117b0dad2d54 ("mlxsw: Create a different trap group list for each device") Signed-off-by: Ido Schimmel <[email protected]> Reported-by: Eddie Shklaer <[email protected]> Tested-by: Eddie Shklaer <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-17net: sched: fix uses after freeEric Dumazet1-9/+13
syzbot reported one use-after-free in pfifo_fast_enqueue() [1] Issue here is that we can not reuse skb after a successful skb_array_produce() since another cpu might have consumed it already. I believe a similar problem exists in try_bulk_dequeue_skb_slow() in case we put an skb into qdisc_enqueue_skb_bad_txq() for lockless qdisc. [1] BUG: KASAN: use-after-free in qdisc_pkt_len include/net/sch_generic.h:610 [inline] BUG: KASAN: use-after-free in qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline] BUG: KASAN: use-after-free in pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639 Read of size 4 at addr ffff8801cede37e8 by task syzkaller717588/5543 CPU: 1 PID: 5543 Comm: syzkaller717588 Not tainted 4.16.0-rc4+ #265 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x194/0x24d lib/dump_stack.c:53 print_address_description+0x73/0x250 mm/kasan/report.c:256 kasan_report_error mm/kasan/report.c:354 [inline] kasan_report+0x23c/0x360 mm/kasan/report.c:412 __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432 qdisc_pkt_len include/net/sch_generic.h:610 [inline] qdisc_qstats_cpu_backlog_inc include/net/sch_generic.h:712 [inline] pfifo_fast_enqueue+0x4bc/0x5e0 net/sched/sch_generic.c:639 __dev_xmit_skb net/core/dev.c:3216 [inline] Fixes: c5ad119fb6c0 ("net: sched: pfifo_fast use skb_array") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: [email protected] Cc: John Fastabend <[email protected]> Cc: Jamal Hadi Salim <[email protected]> Cc: Cong Wang <[email protected]> Cc: Jiri Pirko <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16net: dsa: mv88e6xxx: Fix binding documentation for MDIO bussesAndrew Lunn1-23/+25
The MDIO busses are switch properties and so should be inside the switch node. Fix the examples in the binding document. Reported-by: 尤晓杰 <[email protected]> Signed-off-by: Andrew Lunn <[email protected]> Fixes: a3c53be55c95 ("net: dsa: mv88e6xxx: Support multiple MDIO busses") Signed-off-by: David S. Miller <[email protected]>
2018-03-16skbuff: Fix not waking applications when errors are enqueuedVinicius Costa Gomes1-1/+1
When errors are enqueued to the error queue via sock_queue_err_skb() function, it is possible that the waiting application is not notified. Calling 'sk->sk_data_ready()' would not notify applications that selected only POLLERR events in poll() (for example). Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: Randy E. Witt <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Vinicius Costa Gomes <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16netlink: avoid a double skb free in genlmsg_mcast()Nicolas Dichtel1-1/+1
nlmsg_multicast() consumes always the skb, thus the original skb must be freed only when this function is called with a clone. Fixes: cb9f7a9a5c96 ("netlink: ensure to loop over all netns in genlmsg_multicast_allns()") Reported-by: Ben Hutchings <[email protected]> Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16qede: Fix qedr link updateMichal Kalderon1-2/+2
Link updates were not reported to qedr correctly. Leading to cases where a link could be down, but qedr would see it as up. In addition, once qede was loaded, link state would be up, regardless of the actual link state. Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16Merge branch 'qed-iWARP-related-fixes'David S. Miller1-1/+16
Michal Kalderon says: ==================== qed: iWARP related fixes This series contains two fixes related to iWARP flow. ==================== Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Ariel Elior <[email protected]>
2018-03-16qed: Fix non TCP packets should be dropped on iWARP ll2 connectionMichal Kalderon1-0/+15
FW workaround. The iWARP LL2 connection did not expect TCP packets to arrive on it's connection. The fix drops any non-tcp packets Fixes b5c29ca ("qed: iWARP CM - setup a ll2 connection for handling SYN packets") Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16qed: Fix MPA unalign flow in case header is split across two packets.Michal Kalderon1-1/+1
There is a corner case in the MPA unalign flow where a FPDU header is split over two tcp segments. The length of the first fragment in this case was not initialized properly and should be '1' Fixes: c7d1d839 ("qed: Add support for MPA header being split over two tcp packets") Signed-off-by: Michal Kalderon <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16net/iucv: Free memory obtained by kzallocArvind Yadav1-1/+3
Free memory by calling put_device(), if afiucv_iucv_init is not successful. Signed-off-by: Arvind Yadav <[email protected]> Reviewed-by: Cornelia Huck <[email protected]> Signed-off-by: Ursula Braun <[email protected]> Signed-off-by: Julian Wiedmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16net: systemport: Rewrite __bcm_sysport_tx_reclaim()Florian Fainelli2-19/+16
There is no need for complex checking between the last consumed index and current consumed index, a simple subtraction will do. This also eliminates the possibility of a permanent transmit queue stall under the following conditions: - one CPU bursts ring->size worth of traffic (up to 256 buffers), to the point where we run out of free descriptors, so we stop the transmit queue at the end of bcm_sysport_xmit() - because of our locking, we have the transmit process disable interrupts which means we can be blocking the TX reclamation process - when TX reclamation finally runs, we will be computing the difference between ring->c_index (last consumed index by SW) and what the HW reports through its register - this register is masked with (ring->size - 1) = 0xff, which will lead to stripping the upper bits of the index (register is 16-bits wide) - we will be computing last_tx_cn as 0, which means there is no work to be done, and we never wake-up the transmit queue, leaving it permanently disabled A practical example is e.g: ring->c_index aka last_c_index = 12, we pushed 256 entries, HW consumer index = 268, we mask it with 0xff = 12, so last_tx_cn == 0, nothing happens. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16kcm: lock lower socket in kcm_attachTom Herbert1-10/+23
Need to lock lower socket in order to provide mutual exclusion with kcm_unattach. v2: Add Reported-by for syzbot Fixes: ab7ac4eb9832e32a09f4e804 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+ea75c0ffcd353d32515f064aaebefc5279e6161e@syzkaller.appspotmail.com Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16Merge branch 'vlan-untag-and-insert-fixes'David S. Miller4-15/+63
Toshiaki Makita says: ==================== Fix vlan untag and insertion for bridge and vlan with reorder_hdr off As Brandon Carpenter reported[1], sending non-vlan-offloaded packets from bridge devices ends up with corrupted packets. He narrowed down this problem and found that the root cause is in skb_reorder_vlan_header(). While I was working on fixing this problem, I found that the function does not work properly for double tagged packets with reorder_hdr off as well. Patch 1 fixes these 2 problems in skb_reorder_vlan_header(). And it turned out that fixing skb_reorder_vlan_header() is not sufficient to receive double tagged packets with reorder_hdr off while I was testing the fix. Vlan tags got out of order when vlan devices with reorder_hdr disabled were stacked. Patch 2 fixes this problem. [1] https://www.spinics.net/lists/linux-ethernet-bridging/msg07039.html ==================== Signed-off-by: David S. Miller <[email protected]>
2018-03-16vlan: Fix out of order vlan headers with reorder header offToshiaki Makita2-13/+57
With reorder header off, received packets are untagged in skb_vlan_untag() called from within __netif_receive_skb_core(), and later the tag will be inserted back in vlan_do_receive(). This caused out of order vlan headers when we create a vlan device on top of another vlan device, because vlan_do_receive() inserts a tag as the outermost vlan tag. E.g. the outer tag is first removed in skb_vlan_untag() and inserted back in vlan_do_receive(), then the inner tag is next removed and inserted back as the outermost tag. This patch fixes the behaviour by inserting the inner tag at the right position. Signed-off-by: Toshiaki Makita <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-16net: Fix vlan untag for bridge and vlan_dev with reorder_hdr offToshiaki Makita2-2/+6
When we have a bridge with vlan_filtering on and a vlan device on top of it, packets would be corrupted in skb_vlan_untag() called from br_dev_xmit(). The problem sits in skb_reorder_vlan_header() used in skb_vlan_untag(), which makes use of skb->mac_len. In this function mac_len is meant for handling rx path with vlan devices with reorder_header disabled, but in tx path mac_len is typically 0 and cannot be used, which is the problem in this case. The current code even does not properly handle rx path (skb_vlan_untag() called from __netif_receive_skb_core()) with reorder_header off actually. In rx path single tag case, it works as follows: - Before skb_reorder_vlan_header() mac_header data v v +-------------------+-------------+------+---- | ETH | VLAN | ETH | | ADDRS | TPID | TCI | TYPE | +-------------------+-------------+------+---- <-------- mac_len ---------> <-------------> to be removed - After skb_reorder_vlan_header() mac_header data v v +-------------------+------+---- | ETH | ETH | | ADDRS | TYPE | +-------------------+------+---- <-------- mac_len ---------> This is ok, but in rx double tag case, it corrupts packets: - Before skb_reorder_vlan_header() mac_header data v v +-------------------+-------------+-------------+------+---- | ETH | VLAN | VLAN | ETH | | ADDRS | TPID | TCI | TPID | TCI | TYPE | +-------------------+-------------+-------------+------+---- <--------------- mac_len ----------------> <-------------> should be removed <---------------------------> actually will be removed - After skb_reorder_vlan_header() mac_header data v v +-------------------+------+---- | ETH | ETH | | ADDRS | TYPE | +-------------------+------+---- <--------------- mac_len ----------------> So, two of vlan tags are both removed while only inner one should be removed and mac_header (and mac_len) is broken. skb_vlan_untag() is meant for removing the vlan header at (skb->data - 2), so use skb->data and skb->mac_header to calculate the right offset. Reported-by: Brandon Carpenter <[email protected]> Fixes: a6e18ff11170 ("vlan: Fix untag operations of stacked vlans with REORDER_HEADER off") Signed-off-by: Toshiaki Makita <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-15net sched actions: return explicit error when tunnel_key mode is not specifiedRoman Mashak1-0/+1
If set/unset mode of the tunnel_key action is not provided, ->init() still returns 0, and the caller proceeds with bogus 'struct tc_action *' object, this results in crash: % tc actions add action tunnel_key src_ip 1.1.1.1 dst_ip 2.2.2.1 id 7 index 1 [ 35.805515] general protection fault: 0000 [#1] SMP PTI [ 35.806161] Modules linked in: act_tunnel_key kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd glue_helper cryptd serio_raw [ 35.808233] CPU: 1 PID: 428 Comm: tc Not tainted 4.16.0-rc4+ #286 [ 35.808929] RIP: 0010:tcf_action_init+0x90/0x190 [ 35.809457] RSP: 0018:ffffb8edc068b9a0 EFLAGS: 00010206 [ 35.810053] RAX: 1320c000000a0003 RBX: 0000000000000001 RCX: 0000000000000000 [ 35.810866] RDX: 0000000000000070 RSI: 0000000000007965 RDI: ffffb8edc068b910 [ 35.811660] RBP: ffffb8edc068b9d0 R08: 0000000000000000 R09: ffffb8edc068b808 [ 35.812463] R10: ffffffffc02bf040 R11: 0000000000000040 R12: ffffb8edc068bb38 [ 35.813235] R13: 0000000000000000 R14: 0000000000000000 R15: ffffb8edc068b910 [ 35.814006] FS: 00007f3d0d8556c0(0000) GS:ffff91d1dbc40000(0000) knlGS:0000000000000000 [ 35.814881] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.815540] CR2: 000000000043f720 CR3: 0000000019248001 CR4: 00000000001606a0 [ 35.816457] Call Trace: [ 35.817158] tc_ctl_action+0x11a/0x220 [ 35.817795] rtnetlink_rcv_msg+0x23d/0x2e0 [ 35.818457] ? __slab_alloc+0x1c/0x30 [ 35.819079] ? __kmalloc_node_track_caller+0xb1/0x2b0 [ 35.819544] ? rtnl_calcit.isra.30+0xe0/0xe0 [ 35.820231] netlink_rcv_skb+0xce/0x100 [ 35.820744] netlink_unicast+0x164/0x220 [ 35.821500] netlink_sendmsg+0x293/0x370 [ 35.822040] sock_sendmsg+0x30/0x40 [ 35.822508] ___sys_sendmsg+0x2c5/0x2e0 [ 35.823149] ? pagecache_get_page+0x27/0x220 [ 35.823714] ? filemap_fault+0xa2/0x640 [ 35.824423] ? page_add_file_rmap+0x108/0x200 [ 35.825065] ? alloc_set_pte+0x2aa/0x530 [ 35.825585] ? finish_fault+0x4e/0x70 [ 35.826140] ? __handle_mm_fault+0xbc1/0x10d0 [ 35.826723] ? __sys_sendmsg+0x41/0x70 [ 35.827230] __sys_sendmsg+0x41/0x70 [ 35.827710] do_syscall_64+0x68/0x120 [ 35.828195] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 [ 35.828859] RIP: 0033:0x7f3d0ca4da67 [ 35.829331] RSP: 002b:00007ffc9f284338 EFLAGS: 00000246 ORIG_RAX: 000000000000002e [ 35.830304] RAX: ffffffffffffffda RBX: 00007ffc9f284460 RCX: 00007f3d0ca4da67 [ 35.831247] RDX: 0000000000000000 RSI: 00007ffc9f2843b0 RDI: 0000000000000003 [ 35.832167] RBP: 000000005aa6a7a9 R08: 0000000000000001 R09: 0000000000000000 [ 35.833075] R10: 00000000000005f1 R11: 0000000000000246 R12: 0000000000000000 [ 35.833997] R13: 00007ffc9f2884c0 R14: 0000000000000001 R15: 0000000000674640 [ 35.834923] Code: 24 30 bb 01 00 00 00 45 31 f6 eb 5e 8b 50 08 83 c2 07 83 e2 fc 83 c2 70 49 8b 07 48 8b 40 70 48 85 c0 74 10 48 89 14 24 4c 89 ff <ff> d0 48 8b 14 24 48 01 c2 49 01 d6 45 85 ed 74 05 41 83 47 2c [ 35.837442] RIP: tcf_action_init+0x90/0x190 RSP: ffffb8edc068b9a0 [ 35.838291] ---[ end trace a095c06ee4b97a26 ]--- Fixes: d0f6dd8a914f ("net/sched: Introduce act_tunnel_key") Signed-off-by: Roman Mashak <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-15net/smc: simplify wait when closing listen socketUrsula Braun2-26/+3
Closing of a listen socket wakes up kernel_accept() of smc_tcp_listen_worker(), and then has to wait till smc_tcp_listen_worker() gives up the internal clcsock. The wait logic introduced with commit 127f49705823 ("net/smc: release clcsock from tcp_listen_worker") might wait longer than necessary. This patch implements the idea to implement the wait just with flush_work(), and gets rid of the extra smc_close_wait_listen_clcsock() function. Fixes: 127f49705823 ("net/smc: release clcsock from tcp_listen_worker") Reported-by: Hans Wippel <[email protected]> Signed-off-by: Ursula Braun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14sunvnet: does not support GSO for sctpCathy Zhou1-1/+1
The NETIF_F_GSO_SOFTWARE implies support for GSO on SCTP, but the sunvnet driver does not support GSO for sctp. Here we remove the NETIF_F_GSO_SOFTWARE feature flag and only report NETIF_F_ALL_TSO instead. Signed-off-by: Cathy Zhou <[email protected]> Signed-off-by: Shannon Nelson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14Merge tag 'linux-can-fixes-for-4.16-20180314' of ↵David S. Miller2-40/+65
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-03-14 this is a pull request of two patches for net/master. Both patches are by Andri Yngvason and fix problems in the cc770 driver, that show up quite fast on RT systems, but also on non RT setups. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-03-14tg3: prevent scheduling while atomic splatJonathan Toppins1-1/+1
The problem was introduced in commit 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes"). The bug occurs because tp->lock spinlock is held which is obtained in tg3_start by way of tg3_full_lock(), line 11571. The documentation for usleep_range() specifically states it cannot be used inside a spinlock. Fixes: 506b0a395f26 ("[netdrv] tg3: APE heartbeat changes") Signed-off-by: Jonathan Toppins <[email protected]> Acked-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14ipv4: lock mtu in fnhe when received PMTU < net.ipv4.route.min_pmtuSabrina Dubroca5-10/+32
Prior to the rework of PMTU information storage in commit 2c8cec5c10bc ("ipv4: Cache learned PMTU information in inetpeer."), when a PMTU event advertising a PMTU smaller than net.ipv4.route.min_pmtu was received, we would disable setting the DF flag on packets by locking the MTU metric, and set the PMTU to net.ipv4.route.min_pmtu. Since then, we don't disable DF, and set PMTU to net.ipv4.route.min_pmtu, so the intermediate router that has this link with a small MTU will have to drop the packets. This patch reestablishes pre-2.6.39 behavior by splitting rtable->rt_pmtu into a bitfield with rt_mtu_locked and rt_pmtu. rt_mtu_locked indicates that we shouldn't set the DF bit on that path, and is checked in ip_dont_fragment(). One possible workaround is to set net.ipv4.route.min_pmtu to a value low enough to accommodate the lowest MTU encountered. Fixes: 2c8cec5c10bc ("ipv4: Cache learned PMTU information in inetpeer.") Signed-off-by: Sabrina Dubroca <[email protected]> Reviewed-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14Merge branch 'DPAA-Ethernet-fixes'David S. Miller2-27/+9
Madalin Bucur says: ==================== DPAA Ethernet fixes This patch set is addressing several issues in the DPAA Ethernet driver suite: - module unload crash caused by wrong reference to device being left in the cleanup code after the DSA related changes - scheduling wile atomic bug in QMan code revealed during dpaa_eth module unload - a couple of error counter fixes, a duplicated init in dpaa_eth. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-03-14dpaa_eth: remove duplicate increment of the tx_errors counterCamelia Groza1-1/+0
The tx_errors counter is incremented by the dpaa_xmit caller. Signed-off-by: Camelia Groza <[email protected]> Signed-off-by: Madalin Bucur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14dpaa_eth: increment the RX dropped counter when neededCamelia Groza1-1/+3
Signed-off-by: Camelia Groza <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14dpaa_eth: remove duplicate initializationCamelia Groza1-1/+0
The fd_format has already been initialized at this point. Signed-off-by: Camelia Groza <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14dpaa_eth: fix error in dpaa_remove()Madalin Bucur1-1/+1
The recent changes that make the driver probing compatible with DSA were not propagated in the dpa_remove() function, breaking the module unload function. Using the proper device to address the issue. Signed-off-by: Madalin Bucur <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14soc/fsl/qbman: fix issue in qman_delete_cgr_safe()Madalin Bucur1-23/+5
The wait_for_completion() call in qman_delete_cgr_safe() was triggering a scheduling while atomic bug, replacing the kthread with a smp_call_function_single() call to fix it. Signed-off-by: Madalin Bucur <[email protected]> Signed-off-by: Roy Pledge <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14net: use skb_to_full_sk() in skb_update_prio()Eric Dumazet2-9/+17
Andrei Vagin reported a KASAN: slab-out-of-bounds error in skb_update_prio() Since SYNACK might be attached to a request socket, we need to get back to the listener socket. Since this listener is manipulated without locks, add const qualifiers to sock_cgroup_prioidx() so that the const can also be used in skb_update_prio() Also add the const qualifier to sock_cgroup_classid() for consistency. Fixes: ca6fb0651883 ("tcp: attach SYNACK messages to request sockets instead of listener") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Andrei Vagin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-03-14can: cc770: Fix queue stall & dropped RTR replyAndri Yngvason2-28/+68
While waiting for the TX object to send an RTR, an external message with a matching id can overwrite the TX data. In this case we must call the rx routine and then try transmitting the message that was overwritten again. The queue was being stalled because the RX event did not generate an interrupt to wake up the queue again and the TX event did not happen because the TXRQST flag is reset by the chip when new data is received. According to the CC770 datasheet the id of a message object should not be changed while the MSGVAL bit is set. This has been fixed by resetting the MSGVAL bit before modifying the object in the transmit function and setting it after. It is not enough to set & reset CPUUPD. It is important to keep the MSGVAL bit reset while the message object is being modified. Otherwise, during RTR transmission, a frame with matching id could trigger an rx-interrupt, which would cause a race condition between the interrupt routine and the transmit function. Signed-off-by: Andri Yngvason <[email protected]> Tested-by: Richard Weinberger <[email protected]> Cc: linux-stable <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>