aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-04-08netfilter: make it safer during the inet6_dev->addr_list traversalLiping Zhang2-1/+6
inet6_dev->addr_list is protected by inet6_dev->lock, so only using rcu_read_lock is not enough, we should acquire read_lock_bh(&idev->lock) before the inet6_dev->addr_list traversal. Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-04-08netfilter: ctnetlink: make it safer when checking the ct helper nameLiping Zhang1-5/+10
One CPU is doing ctnetlink_change_helper(), while another CPU is doing unhelp() at the same time. So even if help->helper is not NULL at first, the later statement strcmp(help->helper->name, ...) may still access the NULL pointer. So we must use rcu_read_lock and rcu_dereference to avoid such _bad_ thing happen. Fixes: f95d7a46bc57 ("netfilter: ctnetlink: Fix regression in CTA_HELP processing") Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-04-08netfilter: helper: Add the rcu lock when call __nf_conntrack_helper_findGao Feng2-7/+20
When invoke __nf_conntrack_helper_find, it needs the rcu lock to protect the helper module which would not be unloaded. Now there are two caller nf_conntrack_helper_try_module_get and ctnetlink_create_expect which don't hold rcu lock. And the other callers left like ctnetlink_change_helper, ctnetlink_create_conntrack, and ctnetlink_glue_attach_expect, they already hold the rcu lock or spin_lock_bh. Remove the rcu lock in functions nf_ct_helper_expectfn_find_by_name and nf_ct_helper_expectfn_find_by_symbol. Because they return one pointer which needs rcu lock, so their caller should hold the rcu lock, not in these two functions. Signed-off-by: Gao Feng <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-04-08netfilter: ctnetlink: using bit to represent the ct eventLiping Zhang1-2/+2
Otherwise, creating a new conntrack via nfnetlink: # conntrack -I -p udp -s 1.1.1.1 -d 2.2.2.2 -t 10 --sport 10 --dport 20 will emit the wrong ct events(where UPDATE should be NEW): # conntrack -E [UPDATE] udp 17 10 src=1.1.1.1 dst=2.2.2.2 sport=10 dport=20 [UNREPLIED] src=2.2.2.2 dst=1.1.1.1 sport=20 dport=10 mark=0 Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-04-08netfilter: xt_TCPMSS: add more sanity tests on tcph->doffEric Dumazet1-1/+5
Denys provided an awesome KASAN report pointing to an use after free in xt_TCPMSS I have provided three patches to fix this issue, either in xt_TCPMSS or in xt_tcpudp.c. It seems xt_TCPMSS patch has the smallest possible impact. Signed-off-by: Eric Dumazet <[email protected]> Reported-by: Denys Fedoryshchenko <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-04-03tcp: minimize false-positives on TCP/GRO checkMarcelo Ricardo Leitner1-5/+9
Markus Trippelsdorf reported that after commit dcb17d22e1c2 ("tcp: warn on bogus MSS and try to amend it") the kernel started logging the warning for a NIC driver that doesn't even support GRO. It was diagnosed that it was possibly caused on connections that were using TCP Timestamps but some packets lacked the Timestamps option. As we reduce rcv_mss when timestamps are used, the lack of them would cause the packets to be bigger than expected, although this is a valid case. As this warning is more as a hint, getting a clean-cut on the threshold is probably not worth the execution time spent on it. This patch thus alleviates the false-positives with 2 quick checks: by accounting for the entire TCP option space and also checking against the interface MTU if it's available. These changes, specially the MTU one, might mask some real positives, though if they are really happening, it's possible that sooner or later it will be triggered anyway. Reported-by: Markus Trippelsdorf <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-03sctp: check for dst and pathmtu update in sctp_packet_configXin Long2-34/+50
This patch is to move sctp_transport_dst_check into sctp_packet_config from sctp_packet_transmit and add pathmtu check in sctp_packet_config. With this fix, sctp can update dst or pathmtu before appending chunks, which can void dropping packets in sctp_packet_transmit when dst is obsolete or dst's mtu is changed. This patch is also to improve some other codes in sctp_packet_config. It updates packet max_size with gso_max_size, checks for dst and pathmtu, and appends ecne chunk only when packet is empty and asoc is not NULL. It makes sctp flush work better, as we only need to set up them once for one flush schedule. It's also safe, since asoc is NULL only when the packet is created by sctp_ootb_pkt_new in which it just gets the new dst, no need to do more things for it other than set packet with transport's pathmtu. Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-03flow dissector: correct size of storage for ARPSimon Horman1-1/+1
The last argument to __skb_header_pointer() should be a buffer large enough to store struct arphdr. This can be a pointer to a struct arphdr structure. The code was previously using a pointer to a pointer to struct arphdr. By my counting the storage available both before and after is 8 bytes on x86_64. Fixes: 55733350e5e8 ("flow disector: ARP support") Reported-by: Nicolas Iooss <[email protected]> Signed-off-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-02net: ethernet: ti: cpsw: wake tx queues on ndo_tx_timeoutGrygorii Strashko1-0/+2
In case, if TX watchdog is fired some or all netdev TX queues will be stopped and as part of recovery it is required not only to drain and reinitailize CPSW TX channeles, but also wake up stoppted TX queues what doesn't happen now and netdevice will stop transmiting data until reopenned. Hence, add netif_tx_wake_all_queues() call in .ndo_tx_timeout() to complete recovery and restore TX path. Signed-off-by: Grygorii Strashko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01Merge branch 'l2tp_session_find-fixes'David S. Miller7-101/+222
Guillaume Nault says: ==================== l2tp: fix usage of l2tp_session_find() l2tp_session_find() doesn't take a reference on the session returned to its caller. Virtually all l2tp_session_find() users are racy, either because the session can disappear from under them or because they take a reference too late. This leads to bugs like 'use after free' or failure to notice duplicate session creations. In some cases, taking a reference on the session is not enough. The special callbacks .ref() and .deref() also have to be called in cases where the PPP pseudo-wire uses the socket associated with the session. Therefore, when looking up a session, we also have to pass a flag indicating if the .ref() callback has to be called. In the future, we probably could drop the .ref() and .deref() callbacks entirely by protecting the .sock field of struct pppol2tp_session with RCU, thus allowing it to be freed and set to NULL even if the L2TP session is still alive. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-01l2tp: take a reference on sessions used in genetlink handlersGuillaume Nault3-16/+35
Callers of l2tp_nl_session_find() need to hold a reference on the returned session since there's no guarantee that it isn't going to disappear from under them. Relying on the fact that no l2tp netlink message may be processed concurrently isn't enough: sessions can be deleted by other means (e.g. by closing the PPPOL2TP socket of a ppp pseudowire). l2tp_nl_cmd_session_delete() is a bit special: it runs a callback function that may require a previous call to session->ref(). In particular, for ppp pseudowires, the callback is l2tp_session_delete(), which then calls pppol2tp_session_close() and dereferences the PPPOL2TP socket. The socket might already be gone at the moment l2tp_session_delete() calls session->ref(), so we need to take a reference during the session lookup. So we need to pass the do_ref variable down to l2tp_session_get() and l2tp_session_get_by_ifname(). Since all callers have to be updated, l2tp_session_find_by_ifname() and l2tp_nl_session_find() are renamed to reflect their new behaviour. Fixes: 309795f4bec2 ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01l2tp: hold session while sending creation notificationsGuillaume Nault1-2/+4
l2tp_session_find() doesn't take any reference on the returned session. Therefore, the session may disappear while sending the notification. Use l2tp_session_get() instead and decrement session's refcount once the notification is sent. Fixes: 33f72e6f0c67 ("l2tp : multicast notification to the registered listeners") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01l2tp: fix duplicate session creationGuillaume Nault3-56/+84
l2tp_session_create() relies on its caller for checking for duplicate sessions. This is racy since a session can be concurrently inserted after the caller's verification. Fix this by letting l2tp_session_create() verify sessions uniqueness upon insertion. Callers need to be adapted to check for l2tp_session_create()'s return code instead of calling l2tp_session_find(). pppol2tp_connect() is a bit special because it has to work on existing sessions (if they're not connected) or to create a new session if none is found. When acting on a preexisting session, a reference must be held or it could go away on us. So we have to use l2tp_session_get() instead of l2tp_session_find() and drop the reference before exiting. Fixes: d9e31d17ceba ("l2tp: Add L2TP ethernet pseudowire support") Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01l2tp: ensure session can't get removed during pppol2tp_session_ioctl()Guillaume Nault1-4/+11
Holding a reference on session is required before calling pppol2tp_session_ioctl(). The session could get freed while processing the ioctl otherwise. Since pppol2tp_session_ioctl() uses the session's socket, we also need to take a reference on it in l2tp_session_get(). Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01l2tp: fix race in l2tp_recv_common()Guillaume Nault4-23/+88
Taking a reference on sessions in l2tp_recv_common() is racy; this has to be done by the callers. To this end, a new function is required (l2tp_session_get()) to atomically lookup a session and take a reference on it. Callers then have to manually drop this reference. Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01sctp: use right in and out stream cntXin Long4-12/+11
Since sctp reconf was added in sctp, the real cnt of in/out stream have not been c.sinit_max_instreams and c.sinit_num_ostreams any more. This patch is to replace them with stream->in/outcnt. Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01Merge tag 'mac80211-for-davem-2017-03-31' of ↵David S. Miller2-7/+6
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Two fixes: * don't block netdev queues (indefinitely!) if mac80211 manages traffic queueing itself * check wiphy registration before checking for ops on resume, to avoid crash ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-01Merge branch 'bpf-map_value_adj-reg-types-fixes'David S. Miller4-26/+322
Daniel Borkmann says: ==================== BPF fixes on map_value_adj reg types This set adds two fixes for map_value_adj register type in the verifier and user space tests along with them for the BPF self test suite. For details, please see individual patches. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-04-01bpf: add various verifier test cases for self-testsDaniel Borkmann3-6/+283
Add a couple of test cases, for example, probing for xadd on a spilled pointer to packet and map_value_adj register, various other map_value_adj tests including the unaligned load/store, and trying out pointer arithmetic on map_value_adj register itself. For the unaligned load/store, we need to figure out whether the architecture has efficient unaligned access and need to mark affected tests accordingly. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01bpf, verifier: fix rejection of unaligned access checks for map_value_adjDaniel Borkmann1-20/+38
Currently, the verifier doesn't reject unaligned access for map_value_adj register types. Commit 484611357c19 ("bpf: allow access into map value arrays") added logic to check_ptr_alignment() extending it from PTR_TO_PACKET to also PTR_TO_MAP_VALUE_ADJ, but for PTR_TO_MAP_VALUE_ADJ no enforcement is in place, because reg->id for PTR_TO_MAP_VALUE_ADJ reg types is never non-zero, meaning, we can cause BPF_H/_W/_DW-based unaligned access for architectures not supporting efficient unaligned access, and thus worst case could raise exceptions on some archs that are unable to correct the unaligned access or perform a different memory access to the actual requested one and such. i) Unaligned load with !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS on r0 (map_value_adj): 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = 0x42533a00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+11 R0=map_value(ks=8,vs=48,id=0),min_value=0,max_value=0 R10=fp 7: (61) r1 = *(u32 *)(r0 +0) 8: (35) if r1 >= 0xb goto pc+9 R0=map_value(ks=8,vs=48,id=0),min_value=0,max_value=0 R1=inv,min_value=0,max_value=10 R10=fp 9: (07) r0 += 3 10: (79) r7 = *(u64 *)(r0 +0) R0=map_value_adj(ks=8,vs=48,id=0),min_value=3,max_value=3 R1=inv,min_value=0,max_value=10 R10=fp 11: (79) r7 = *(u64 *)(r0 +2) R0=map_value_adj(ks=8,vs=48,id=0),min_value=3,max_value=3 R1=inv,min_value=0,max_value=10 R7=inv R10=fp [...] ii) Unaligned store with !CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS on r0 (map_value_adj): 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = 0x4df16a00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+19 R0=map_value(ks=8,vs=48,id=0),min_value=0,max_value=0 R10=fp 7: (07) r0 += 3 8: (7a) *(u64 *)(r0 +0) = 42 R0=map_value_adj(ks=8,vs=48,id=0),min_value=3,max_value=3 R10=fp 9: (7a) *(u64 *)(r0 +2) = 43 R0=map_value_adj(ks=8,vs=48,id=0),min_value=3,max_value=3 R10=fp 10: (7a) *(u64 *)(r0 -2) = 44 R0=map_value_adj(ks=8,vs=48,id=0),min_value=3,max_value=3 R10=fp [...] For the PTR_TO_PACKET type, reg->id is initially zero when skb->data was fetched, it later receives a reg->id from env->id_gen generator once another register with UNKNOWN_VALUE type was added to it via check_packet_ptr_add(). The purpose of this reg->id is twofold: i) it is used in find_good_pkt_pointers() for setting the allowed access range for regs with PTR_TO_PACKET of same id once verifier matched on data/data_end tests, and ii) for check_ptr_alignment() to determine that when not having efficient unaligned access and register with UNKNOWN_VALUE was added to PTR_TO_PACKET, that we're only allowed to access the content bytewise due to unknown unalignment. reg->id was never intended for PTR_TO_MAP_VALUE{,_ADJ} types and thus is always zero, the only marking is in PTR_TO_MAP_VALUE_OR_NULL that was added after 484611357c19 via 57a09bf0a416 ("bpf: Detect identical PTR_TO_MAP_VALUE_OR_NULL registers"). Above tests will fail for non-root environment due to prohibited pointer arithmetic. The fix splits register-type specific checks into their own helper instead of keeping them combined, so we don't run into a similar issue in future once we extend check_ptr_alignment() further and forget to add reg->type checks for some of the checks. Fixes: 484611357c19 ("bpf: allow access into map value arrays") Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01bpf, verifier: fix alu ops against map_value{, _adj} register typesDaniel Borkmann1-0/+1
While looking into map_value_adj, I noticed that alu operations directly on the map_value() resp. map_value_adj() register (any alu operation on a map_value() register will turn it into a map_value_adj() typed register) are not sufficiently protected against some of the operations. Two non-exhaustive examples are provided that the verifier needs to reject: i) BPF_AND on r0 (map_value_adj): 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = 0xbf842a00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+2 R0=map_value(ks=8,vs=48,id=0),min_value=0,max_value=0 R10=fp 7: (57) r0 &= 8 8: (7a) *(u64 *)(r0 +0) = 22 R0=map_value_adj(ks=8,vs=48,id=0),min_value=0,max_value=8 R10=fp 9: (95) exit from 6 to 9: R0=inv,min_value=0,max_value=0 R10=fp 9: (95) exit processed 10 insns ii) BPF_ADD in 32 bit mode on r0 (map_value_adj): 0: (bf) r2 = r10 1: (07) r2 += -8 2: (7a) *(u64 *)(r2 +0) = 0 3: (18) r1 = 0xc24eee00 5: (85) call bpf_map_lookup_elem#1 6: (15) if r0 == 0x0 goto pc+2 R0=map_value(ks=8,vs=48,id=0),min_value=0,max_value=0 R10=fp 7: (04) (u32) r0 += (u32) 0 8: (7a) *(u64 *)(r0 +0) = 22 R0=map_value_adj(ks=8,vs=48,id=0),min_value=0,max_value=0 R10=fp 9: (95) exit from 6 to 9: R0=inv,min_value=0,max_value=0 R10=fp 9: (95) exit processed 10 insns Issue is, while min_value / max_value boundaries for the access are adjusted appropriately, we change the pointer value in a way that cannot be sufficiently tracked anymore from its origin. Operations like BPF_{AND,OR,DIV,MUL,etc} on a destination register that is PTR_TO_MAP_VALUE{,_ADJ} was probably unintended, in fact, all the test cases coming with 484611357c19 ("bpf: allow access into map value arrays") perform BPF_ADD only on the destination register that is PTR_TO_MAP_VALUE_ADJ. Only for UNKNOWN_VALUE register types such operations make sense, f.e. with unknown memory content fetched initially from a constant offset from the map value memory into a register. That register is then later tested against lower / upper bounds, so that the verifier can then do the tracking of min_value / max_value, and properly check once that UNKNOWN_VALUE register is added to the destination register with type PTR_TO_MAP_VALUE{,_ADJ}. This is also what the original use-case is solving. Note, tracking on what is being added is done through adjust_reg_min_max_vals() and later access to the map value enforced with these boundaries and the given offset from the insn through check_map_access_adj(). Tests will fail for non-root environment due to prohibited pointer arithmetic, in particular in check_alu_op(), we bail out on the is_pointer_value() check on the dst_reg (which is false in root case as we allow for pointer arithmetic via env->allow_ptr_leaks). Similarly to PTR_TO_PACKET, one way to fix it is to restrict the allowed operations on PTR_TO_MAP_VALUE{,_ADJ} registers to 64 bit mode BPF_ADD. The test_verifier suite runs fine after the patch and it also rejects mentioned test cases. Fixes: 484611357c19 ("bpf: allow access into map value arrays") Signed-off-by: Daniel Borkmann <[email protected]> Reviewed-by: Josef Bacik <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01r8152: The Microsoft Surface docks also use R8152 v2René Rebe2-0/+18
Without this the generic cdc_ether grabs the device, and does not really work. Signed-off-by: René Rebe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01openvswitch: Fix ovs_flow_key_update()Yi-Hung Wei1-2/+8
ovs_flow_key_update() is called when the flow key is invalid, and it is used to update and revalidate the flow key. Commit 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") introduces mac_proto field to flow key and use it to determine whether the flow key is valid. However, the commit does not update the code path in ovs_flow_key_update() to revalidate the flow key which may cause BUG_ON() on execute_recirc(). This patch addresses the aforementioned issue. Fixes: 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key") Signed-off-by: Yi-Hung Wei <[email protected]> Acked-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01net/faraday: Explicitly include linux/of.h and linux/property.hMark Brown1-0/+2
This driver uses interfaces from linux/of.h and linux/property.h but relies on implict inclusion of those headers which means that changes in other headers could break the build, as happened in -next for arm today. Add a explicit includes. Signed-off-by: Mark Brown <[email protected]> Acked-by: Joel Stanley <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-04-01net: hns: Add ACPI support to check SFP presentDaode Huang2-5/+34
The current code only supports DT to check SFP present. This patch adds ACPI support as well. Signed-off-by: Daode Huang <[email protected]> Reviewed-by: Yisen Zhuang <[email protected]> Signed-off-by: Salil Mehta <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-30be2net: Fix endian issue in logical link config commandSuresh Reddy1-3/+6
Use cpu_to_le32() for link_config variable in set_logical_link_config command as this variable is of type u32. Signed-off-by: Suresh Reddy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-30sctp: alloc stream info when initializing asocXin Long4-17/+45
When sending a msg without asoc established, sctp will send INIT packet first and then enqueue chunks. Before receiving INIT_ACK, stream info is not yet alloced. But enqueuing chunks needs to access stream info, like out stream state and out stream cnt. This patch is to fix it by allocing out stream info when initializing an asoc, allocing in stream and re-allocing out stream when processing init. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-30net/packet: fix overflow in check for tp_reserveAndrey Konovalov1-0/+2
When calculating po->tp_hdrlen + po->tp_reserve the result can overflow. Fix by checking that tp_reserve <= INT_MAX on assign. Signed-off-by: Andrey Konovalov <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-30net/packet: fix overflow in check for tp_frame_nrAndrey Konovalov1-0/+2
When calculating rb->frames_per_block * req->tp_block_nr the result can overflow. Add a check that tp_block_size * tp_block_nr <= UINT_MAX. Since frames_per_block <= tp_block_size, the expression would never overflow. Signed-off-by: Andrey Konovalov <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-30net/packet: fix overflow in check for priv area sizeAndrey Konovalov1-2/+2
Subtracting tp_sizeof_priv from tp_block_size and casting to int to check whether one is less then the other doesn't always work (both of them are unsigned ints). Compare them as is instead. Also cast tp_sizeof_priv to u64 before using BLK_PLUS_PRIV, as it can overflow inside BLK_PLUS_PRIV otherwise. Signed-off-by: Andrey Konovalov <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller8-130/+206
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains a rather large update with Netfilter fixes, specifically targeted to incorrect RCU usage in several spots and the userspace conntrack helper infrastructure (nfnetlink_cthelper), more specifically they are: 1) expect_class_max is incorrect set via cthelper, as in kernel semantics mandate that this represents the array of expectation classes minus 1. Patch from Liping Zhang. 2) Expectation policy updates via cthelper are currently broken for several reasons: This code allows illegal changes in the policy such as changing the number of expeciation classes, it is leaking the updated policy and such update occurs with no RCU protection at all. Fix this by adding a new nfnl_cthelper_update_policy() that describes what is really legal on the update path. 3) Fix several memory leaks in cthelper, from Jeffy Chen. 4) synchronize_rcu() is missing in the removal path of several modules, this may lead to races since CPU may still be running on code that has just gone. Also from Liping Zhang. 5) Don't use the helper hashtable from cthelper, it is not safe to walk over those bits without the helper mutex. Fix this by introducing a new independent list for userspace helpers. From Liping Zhang. 6) nf_ct_extend_unregister() needs synchronize_rcu() to make sure no packets are walking on any conntrack extension that is gone after module removal, again from Liping. 7) nf_nat_snmp may crash if we fail to unregister the helper due to accidental leftover code, from Gao Feng. 8) Fix leak in nfnetlink_queue with secctx support, from Liping Zhang. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-03-29ezchip: nps_enet: check if napi has been completedZakharov Vlad1-3/+1
After a new NAPI_STATE_MISSED state was added to NAPI we can get into this state and in such case we have to reschedule NAPI as some work is still pending and we have to process it. napi_complete_done() function returns false if we have to reschedule something (e.g. in case we were in MISSED state) as current polling have not been completed yet. nps_enet driver hasn't been verifying the return value of napi_complete_done() and has been forcibly enabling interrupts. That is not correct as we should not enable interrupts before we have processed all scheduled work. As a result we were getting trapped in interrupt hanlder chain as we had never been able to disabale ethernet interrupts again. So this patch makes nps_enet_poll() func verify return value of napi_complete_done() and enable interrupts only in case all scheduled work has been completed. Signed-off-by: Vlad Zakharov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29Merge branch 'bnxt_en-fixes'David S. Miller1-6/+24
Michael Chan says: ==================== bnxt_en: Small misc. fixes. Fix a NULL pointer crash in open failure path, wrong arguments when printing error messages, and a DMA unmap bug in XDP shutdown path. ==================== Signed-off-by: David S. Miller <[email protected]>
2017-03-29bnxt_en: Fix DMA unmapping of the RX buffers in XDP mode during shutdown.Michael Chan1-5/+10
In bnxt_free_rx_skbs(), which is called to free up all RX buffers during shutdown, we need to unmap the page if we are running in XDP mode. Fixes: c61fb99cae51 ("bnxt_en: Add RX page mode support.") Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29bnxt_en: Correct the order of arguments to netdev_err() in bnxt_set_tpa()Sankar Patchineelam1-1/+1
Signed-off-by: Sankar Patchineelam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29bnxt_en: Fix NULL pointer dereference in reopen failure pathSankar Patchineelam1-0/+13
Net device reset can fail when the h/w or f/w is in a bad state. Subsequent netdevice open fails in bnxt_hwrm_stat_ctx_alloc(). The cleanup invokes bnxt_hwrm_resource_free() which inturn calls bnxt_disable_int(). In this routine, the code segment if (ring->fw_ring_id != INVALID_HW_RING_ID) BNXT_CP_DB(cpr->cp_doorbell, cpr->cp_raw_cons); results in NULL pointer dereference as cpr->cp_doorbell is not yet initialized, and fw_ring_id is zero. The fix is to initialize cpr fw_ring_id to INVALID_HW_RING_ID before bnxt_init_chip() is invoked. Signed-off-by: Sankar Patchineelam <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29l2tp: purge socket queues in the .destruct() callbackGuillaume Nault1-3/+4
The Rx path may grab the socket right before pppol2tp_release(), but nothing guarantees that it will enqueue packets before skb_queue_purge(). Therefore, the socket can be destroyed without its queues fully purged. Fix this by purging queues in pppol2tp_session_destruct() where we're guaranteed nothing is still referencing the socket. Fixes: 9e9cb6221aa7 ("l2tp: fix userspace reception on plain L2TP sockets") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29l2tp: hold tunnel socket when handling control frames in l2tp_ip and l2tp_ip6Guillaume Nault2-4/+6
The code following l2tp_tunnel_find() expects that a new reference is held on sk. Either sk_receive_skb() or the discard_put error path will drop a reference from the tunnel's socket. This issue exists in both l2tp_ip and l2tp_ip6. Fixes: a3c18422a4b4 ("l2tp: hold socket before dropping lock in l2tp_ip{, 6}_recv()") Signed-off-by: Guillaume Nault <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-29mac80211: unconditionally start new netdev queues with iTXQ supportJohannes Berg1-1/+2
When internal mac80211 TXQs aren't supported, netdev queues must always started out started even when driver queues are stopped while the interface is added. This is necessary because with the internal TXQ support netdev queues are never stopped and packet scheduling/dropping is done in mac80211. Cc: [email protected] # 4.9+ Fixes: 80a83cfc434b1 ("mac80211: skip netdev queue control with software queuing") Reported-and-tested-by: Sven Eckelmann <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-03-29netfilter: nfnetlink_queue: fix secctx memory leakLiping Zhang1-2/+7
We must call security_release_secctx to free the memory returned by security_secid_to_secctx, otherwise memory may be leaked forever. Fixes: ef493bd930ae ("netfilter: nfnetlink_queue: add security context information") Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2017-03-29cfg80211: check rdev resume callback only for registered wiphyArend Van Spriel1-6/+4
We got the following use-after-free KASAN report: BUG: KASAN: use-after-free in wiphy_resume+0x591/0x5a0 [cfg80211] at addr ffff8803fc244090 Read of size 8 by task kworker/u16:24/2587 CPU: 6 PID: 2587 Comm: kworker/u16:24 Tainted: G B 4.9.13-debug+ Hardware name: Dell Inc. XPS 15 9550/0N7TVV, BIOS 1.2.19 12/22/2016 Workqueue: events_unbound async_run_entry_fn ffff880425d4f9d8 ffffffffaeedb541 ffff88042b80ef00 ffff8803fc244088 ffff880425d4fa00 ffffffffae84d7a1 ffff880425d4fa98 ffff8803fc244080 ffff88042b80ef00 ffff880425d4fa88 ffffffffae84da3a ffffffffc141f7d9 Call Trace: [<ffffffffaeedb541>] dump_stack+0x85/0xc4 [<ffffffffae84d7a1>] kasan_object_err+0x21/0x70 [<ffffffffae84da3a>] kasan_report_error+0x1fa/0x500 [<ffffffffc141f7d9>] ? cfg80211_bss_age+0x39/0xc0 [cfg80211] [<ffffffffc141f83a>] ? cfg80211_bss_age+0x9a/0xc0 [cfg80211] [<ffffffffae48d46d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211] [<ffffffffae84def1>] __asan_report_load8_noabort+0x61/0x70 [<ffffffffc13fb100>] ? wiphy_suspend+0xbb0/0xc70 [cfg80211] [<ffffffffc13fb751>] ? wiphy_resume+0x591/0x5a0 [cfg80211] [<ffffffffc13fb751>] wiphy_resume+0x591/0x5a0 [cfg80211] [<ffffffffc13fb1c0>] ? wiphy_suspend+0xc70/0xc70 [cfg80211] [<ffffffffaf3b206e>] dpm_run_callback+0x6e/0x4f0 [<ffffffffaf3b31b2>] device_resume+0x1c2/0x670 [<ffffffffaf3b367d>] async_resume+0x1d/0x50 [<ffffffffae3ee84e>] async_run_entry_fn+0xfe/0x610 [<ffffffffae3d0666>] process_one_work+0x716/0x1a50 [<ffffffffae3d05c9>] ? process_one_work+0x679/0x1a50 [<ffffffffafdd7b6d>] ? _raw_spin_unlock_irq+0x3d/0x60 [<ffffffffae3cff50>] ? pwq_dec_nr_in_flight+0x2b0/0x2b0 [<ffffffffae3d1a80>] worker_thread+0xe0/0x1460 [<ffffffffae3d19a0>] ? process_one_work+0x1a50/0x1a50 [<ffffffffae3e54c2>] kthread+0x222/0x2e0 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80 [<ffffffffae3e52a0>] ? kthread_park+0x80/0x80 [<ffffffffafdd86aa>] ret_from_fork+0x2a/0x40 Object at ffff8803fc244088, in cache kmalloc-1024 size: 1024 Allocated: PID = 71 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_kmalloc+0xad/0xe0 kasan_slab_alloc+0x12/0x20 __kmalloc_track_caller+0x134/0x360 kmemdup+0x20/0x50 brcmf_cfg80211_attach+0x10b/0x3a90 [brcmfmac] brcmf_bus_start+0x19a/0x9a0 [brcmfmac] brcmf_pcie_setup+0x1f1a/0x3680 [brcmfmac] brcmf_fw_request_nvram_done+0x44c/0x11b0 [brcmfmac] request_firmware_work_func+0x135/0x280 process_one_work+0x716/0x1a50 worker_thread+0xe0/0x1460 kthread+0x222/0x2e0 ret_from_fork+0x2a/0x40 Freed: PID = 2568 save_stack_trace+0x1b/0x20 save_stack+0x46/0xd0 kasan_slab_free+0x71/0xb0 kfree+0xe8/0x2e0 brcmf_cfg80211_detach+0x62/0xf0 [brcmfmac] brcmf_detach+0x14a/0x2b0 [brcmfmac] brcmf_pcie_remove+0x140/0x5d0 [brcmfmac] brcmf_pcie_pm_leave_D3+0x198/0x2e0 [brcmfmac] pci_pm_resume+0x186/0x220 dpm_run_callback+0x6e/0x4f0 device_resume+0x1c2/0x670 async_resume+0x1d/0x50 async_run_entry_fn+0xfe/0x610 process_one_work+0x716/0x1a50 worker_thread+0xe0/0x1460 kthread+0x222/0x2e0 ret_from_fork+0x2a/0x40 Memory state around the buggy address: ffff8803fc243f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8803fc244000: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8803fc244080: fc fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff8803fc244100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8803fc244180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb What is happening is that brcmf_pcie_resume() detects a device that is no longer responsive and it decides to unbind resulting in a wiphy_unregister() and wiphy_free() call. Now the wiphy instance remains allocated, because PM needs to call wiphy_resume() for it. However, brcmfmac already does a kfree() for the struct cfg80211_registered_device::ops field. Change the checks in wiphy_resume() to only access the struct cfg80211_registered_device::ops if the wiphy instance is still registered at this time. Cc: [email protected] # 4.10.x, 4.9.x Reported-by: Daniel J Blueman <[email protected]> Reviewed-by: Hante Meuleman <[email protected]> Reviewed-by: Pieter-Paul Giesberts <[email protected]> Reviewed-by: Franky Lin <[email protected]> Signed-off-by: Arend van Spriel <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2017-03-28openvswitch: Fix refcount leak on force commit.Jarno Rajahalme1-2/+2
The reference count held for skb needs to be released when the skb's nfct pointer is cleared regardless of if nf_ct_delete() is called or not. Failing to release the skb's reference cound led to deferred conntrack cleanup spinning forever within nf_conntrack_cleanup_net_list() when cleaning up a network namespace:    kworker/u16:0-19025 [004] 45981067.173642: sched_switch: kworker/u16:0:19025 [120] R ==> rcu_preempt:7 [120]    kworker/u16:0-19025 [004] 45981067.173651: kernel_stack: <stack trace> => ___preempt_schedule (ffffffffa001ed36) => _raw_spin_unlock_bh (ffffffffa0713290) => nf_ct_iterate_cleanup (ffffffffc00a4454) => nf_conntrack_cleanup_net_list (ffffffffc00a5e1e) => nf_conntrack_pernet_exit (ffffffffc00a63dd) => ops_exit_list.isra.1 (ffffffffa06075f3) => cleanup_net (ffffffffa0607df0) => process_one_work (ffffffffa0084c31) => worker_thread (ffffffffa008592b) => kthread (ffffffffa008bee2) => ret_from_fork (ffffffffa071b67c) Fixes: dd41d33f0b03 ("openvswitch: Add force commit.") Reported-by: Yang Song <[email protected]> Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-28rocker: fix Wmaybe-uninitialized false-positiveArnd Bergmann1-7/+4
gcc-7 reports a warning that earlier versions did not have: drivers/net/ethernet/rocker/rocker_ofdpa.c: In function 'ofdpa_port_stp_update': arch/x86/include/asm/string_32.h:79:22: error: '*((void *)&prev_ctrls+4)' may be used uninitialized in this function [-Werror=maybe-uninitialized] *((short *)to + 2) = *((short *)from + 2); ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~ drivers/net/ethernet/rocker/rocker_ofdpa.c:2218:7: note: '*((void *)&prev_ctrls+4)' was declared here This is clearly a variation of the warning about 'prev_state' that was shut up using uninitialized_var(). We can slightly simplify the code and get rid of the warning by unconditionally saving the prev_state and prev_ctrls variables. The inlined memcpy is not particularly expensive here, as it just has to read five bytes from one or two cache lines. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-28net/mlx5: Avoid dereferencing uninitialized pointerTalat Batheesh1-2/+3
In NETDEV_CHANGEUPPER event the upper_info field is valid only when linking is true. Otherwise it should be ignored. Fixes: 7907f23adc18 (net/mlx5: Implement RoCE LAG feature) Signed-off-by: Talat Batheesh <[email protected]> Reviewed-by: Aviv Heller <[email protected]> Reviewed-by: Moni Shoua <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-28net: moxa: fix TX overrun memory leakJonas Jensen2-2/+19
moxart_mac_start_xmit() doesn't care where tx_tail is, tx_head can catch and pass tx_tail, which is bad because moxart_tx_finished() isn't guaranteed to catch up on freeing resources from tx_tail. Add a check in moxart_mac_start_xmit() stopping the queue at the end of the circular buffer. Also add a check in moxart_tx_finished() waking the queue if the buffer has TX_WAKE_THRESHOLD or more free descriptors. While we're at it, move spin_lock_irq() to happen before our descriptor pointer is assigned in moxart_mac_start_xmit(). Addresses https://bugzilla.kernel.org/show_bug.cgi?id=99451 Signed-off-by: Jonas Jensen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-28isdn: kcapi: avoid uninitialized dataArnd Bergmann1-0/+1
gcc-7 points out that the AVMB1_ADDCARD ioctl results in an unintialized value ending up in the cardnr parameter: drivers/isdn/capi/kcapi.c: In function 'old_capi_manufacturer': drivers/isdn/capi/kcapi.c:1042:24: error: 'cdef.cardnr' may be used uninitialized in this function [-Werror=maybe-uninitialized] cparams.cardnr = cdef.cardnr; This has been broken since before the start of the git history, so either the value is not used for anything important, or the ioctl command doesn't get called in practice. Setting the cardnr to zero avoids the warning and makes sure we have consistent behavior. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-28sctp: change to save MSG_MORE flag into assocXin Long3-3/+3
David Laight noticed the support for MSG_MORE with datamsg->force_delay didn't really work as we expected, as the first msg with MSG_MORE set would always block the following chunks' dequeuing. This Patch is to rewrite it by saving the MSG_MORE flag into assoc as David Laight suggested. asoc->force_delay is used to save MSG_MORE flag before a msg is sent. All chunks in queue would not be sent out if asoc->force_delay is set by the msg with MSG_MORE flag, until a new msg without MSG_MORE flag clears asoc->force_delay. Note that this change would not affect the flush is generated by other triggers, like asoc->state != ESTABLISHED, queue size > pmtu etc. v1->v2: Not clear asoc->force_delay after sending the msg with MSG_MORE flag. Fixes: 4ea0c32f5f42 ("sctp: add support for MSG_MORE") Signed-off-by: Xin Long <[email protected]> Acked-by: David Laight <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-27net: ipconfig: fix ic_close_devs() use-after-freeMark Rutland1-1/+1
Our chosen ic_dev may be anywhere in our list of ic_devs, and we may free it before attempting to close others. When we compare d->dev and ic_dev->dev, we're potentially dereferencing memory returned to the allocator. This causes KASAN to scream for each subsequent ic_dev we check. As there's a 1-1 mapping between ic_devs and netdevs, we can instead compare d and ic_dev directly, which implicitly handles the !ic_dev case, and avoids the use-after-free. The ic_dev pointer may be stale, but we will not dereference it. Original splat: [ 6.487446] ================================================================== [ 6.494693] BUG: KASAN: use-after-free in ic_close_devs+0xc4/0x154 at addr ffff800367efa708 [ 6.503013] Read of size 8 by task swapper/0/1 [ 6.507452] CPU: 5 PID: 1 Comm: swapper/0 Not tainted 4.11.0-rc3-00002-gda42158 #8 [ 6.514993] Hardware name: AppliedMicro Mustang/Mustang, BIOS 3.05.05-beta_rc Jan 27 2016 [ 6.523138] Call trace: [ 6.525590] [<ffff200008094778>] dump_backtrace+0x0/0x570 [ 6.530976] [<ffff200008094d08>] show_stack+0x20/0x30 [ 6.536017] [<ffff200008bee928>] dump_stack+0x120/0x188 [ 6.541231] [<ffff20000856d5e4>] kasan_object_err+0x24/0xa0 [ 6.546790] [<ffff20000856d924>] kasan_report_error+0x244/0x738 [ 6.552695] [<ffff20000856dfec>] __asan_report_load8_noabort+0x54/0x80 [ 6.559204] [<ffff20000aae86ac>] ic_close_devs+0xc4/0x154 [ 6.564590] [<ffff20000aaedbac>] ip_auto_config+0x2ed4/0x2f1c [ 6.570321] [<ffff200008084b04>] do_one_initcall+0xcc/0x370 [ 6.575882] [<ffff20000aa31de8>] kernel_init_freeable+0x5f8/0x6c4 [ 6.581959] [<ffff20000a16df00>] kernel_init+0x18/0x190 [ 6.587171] [<ffff200008084710>] ret_from_fork+0x10/0x40 [ 6.592468] Object at ffff800367efa700, in cache kmalloc-128 size: 128 [ 6.598969] Allocated: [ 6.601324] PID = 1 [ 6.603427] save_stack_trace_tsk+0x0/0x418 [ 6.607603] save_stack_trace+0x20/0x30 [ 6.611430] kasan_kmalloc+0xd8/0x188 [ 6.615087] ip_auto_config+0x8c4/0x2f1c [ 6.619002] do_one_initcall+0xcc/0x370 [ 6.622832] kernel_init_freeable+0x5f8/0x6c4 [ 6.627178] kernel_init+0x18/0x190 [ 6.630660] ret_from_fork+0x10/0x40 [ 6.634223] Freed: [ 6.636233] PID = 1 [ 6.638334] save_stack_trace_tsk+0x0/0x418 [ 6.642510] save_stack_trace+0x20/0x30 [ 6.646337] kasan_slab_free+0x88/0x178 [ 6.650167] kfree+0xb8/0x478 [ 6.653131] ic_close_devs+0x130/0x154 [ 6.656875] ip_auto_config+0x2ed4/0x2f1c [ 6.660875] do_one_initcall+0xcc/0x370 [ 6.664705] kernel_init_freeable+0x5f8/0x6c4 [ 6.669051] kernel_init+0x18/0x190 [ 6.672534] ret_from_fork+0x10/0x40 [ 6.676098] Memory state around the buggy address: [ 6.680880] ffff800367efa600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 6.688078] ffff800367efa680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 6.695276] >ffff800367efa700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 6.702469] ^ [ 6.705952] ffff800367efa780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 6.713149] ffff800367efa800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 6.720343] ================================================================== [ 6.727536] Disabling lock debugging due to kernel taint Signed-off-by: Mark Rutland <[email protected]> Cc: Alexey Kuznetsov <[email protected]> Cc: David S. Miller <[email protected]> Cc: Hideaki YOSHIFUJI <[email protected]> Cc: James Morris <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2017-03-27MAINTAINERS: Add Andrew Lunn as co-maintainer of PHYLIBFlorian Fainelli1-0/+1
Andrew has been contributing a lot to PHYLIB over the past months and his feedback on patches is more than welcome. Signed-off-by: Florian Fainelli <[email protected]> Acked-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-03-27netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to registerGao Feng1-18/+1
In the commit 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp helper"), the snmp_helper is replaced by nf_nat_snmp_hook. So the snmp_helper is never registered. But it still tries to unregister the snmp_helper, it could cause the panic. Now remove the useless snmp_helper and the unregister call in the error handler. Fixes: 93557f53e1fb ("netfilter: nf_conntrack: nf_conntrack snmp helper") Signed-off-by: Gao Feng <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>