aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2019-02-26SUNRPC: Fix an Oops in udp_poll()Trond Myklebust1-2/+19
udp_poll() checks the struct file for the O_NONBLOCK flag, so we must not call it with a NULL file pointer. Fixes: 0ffe86f48026 ("SUNRPC: Use poll() to fix up the socket requeue races") Signed-off-by: Trond Myklebust <[email protected]>
2019-02-26Bluetooth: Add quirk for reading BD_ADDR from fwnode propertyMatthias Kaehlcke2-2/+47
Add HCI_QUIRK_USE_BDADDR_PROPERTY to allow controllers to retrieve the public Bluetooth address from the firmware node property 'local-bd-address'. If quirk is set and the property does not exist or is invalid the controller is marked as unconfigured. Signed-off-by: Matthias Kaehlcke <[email protected]> Reviewed-by: Balakrishna Godavarthi <[email protected]> Tested-by: Balakrishna Godavarthi <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2019-02-26Bluetooth: mgmt: Use struct_size() helperGustavo A. R. Silva1-5/+3
Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes, in particular in the context in which this code is being used. So, change the following form: sizeof(*rp) + (sizeof(rp->entry[0]) * count); to : struct_size(rp, entry, count) Notice that, in this case, variable rp_len is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2019-02-25net: avoid use IPCB in cipso_v4_errorNazarov Sergey2-7/+32
Extract IP options in cipso_v4_error and use __icmp_send. Signed-off-by: Sergey Nazarov <[email protected]> Acked-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: Add __icmp_send helper.Nazarov Sergey1-3/+4
Add __icmp_send function having ip_options struct parameter Signed-off-by: Sergey Nazarov <[email protected]> Reviewed-by: Paul Moore <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: update referencesMohit P. Tahiliani1-3/+1
RFC 8033 replaces the IETF draft for PIE Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: add derandomization mechanismMohit P. Tahiliani1-1/+27
Random dropping of packets to achieve latency control may introduce outlier situations where packets are dropped too close to each other or too far from each other. This can cause the real drop percentage to temporarily deviate from the intended drop probability. In certain scenarios, such as a small number of simultaneous TCP flows, these deviations can cause significant deviations in link utilization and queuing latency. RFC 8033 suggests using a derandomization mechanism to avoid these deviations. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: add more cases to auto-tune alpha and betaMohit P. Tahiliani1-33/+32
The current implementation scales the local alpha and beta variables in the calculate_probability function by the same amount for all values of drop probability below 1%. RFC 8033 suggests using additional cases for auto-tuning alpha and beta when the drop probability is less than 1%. In order to add more auto-tuning cases, MAX_PROB must be scaled by u64 instead of u32 to prevent underflow when scaling the local alpha and beta variables in the calculate_probability function. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: change initial value of pie_vars->burst_timeMohit P. Tahiliani1-2/+2
RFC 8033 suggests an initial value of 150 milliseconds for the maximum time allowed for a burst of packets. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: change default value of pie_params->tupdateMohit P. Tahiliani1-1/+1
RFC 8033 suggests a default value of 15 milliseconds for the update interval. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: change default value of pie_params->targetMohit P. Tahiliani1-1/+1
RFC 8033 suggests a default value of 15 milliseconds for the target queue delay. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: pie: change value of QUEUE_THRESHOLDMohit P. Tahiliani1-1/+1
RFC 8033 recommends a value of 16384 bytes for the queue threshold. Signed-off-by: Mohit P. Tahiliani <[email protected]> Signed-off-by: Dhaval Khandla <[email protected]> Signed-off-by: Hrishikesh Hiraskar <[email protected]> Signed-off-by: Manish Kumar B <[email protected]> Signed-off-by: Sachin D. Patil <[email protected]> Signed-off-by: Leslie Monis <[email protected]> Acked-by: Dave Taht <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25bpf/test_run: fix unkillable BPF_PROG_TEST_RUN for flow dissectorStanislav Fomichev1-6/+20
Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff makes process unkillable. The problem is that when CONFIG_PREEMPT is enabled, we never see need_resched() return true. This is due to the fact that preempt_enable() (which we do in bpf_test_run_one on each iteration) now handles resched if it's needed. Let's disable preemption for the whole run, not per test. In this case we can properly see whether resched is needed. Let's also properly return -EINTR to the userspace in case of a signal interrupt. This is a follow up for a recently fixed issue in bpf_test_run, see commit df1a2cb7c74b ("bpf/test_run: fix unkillable BPF_PROG_TEST_RUN"). Reported-by: syzbot <[email protected]> Signed-off-by: Stanislav Fomichev <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-02-25net: socket: set sock->sk to NULL after calling proto_ops::release()Eric Biggers1-0/+1
Commit 9060cb719e61 ("net: crypto set sk to NULL when af_alg_release.") fixed a use-after-free in sockfs_setattr() when an AF_ALG socket is closed concurrently with fchownat(). However, it ignored that many other proto_ops::release() methods don't set sock->sk to NULL and therefore allow the same use-after-free: - base_sock_release - bnep_sock_release - cmtp_sock_release - data_sock_release - dn_release - hci_sock_release - hidp_sock_release - iucv_sock_release - l2cap_sock_release - llcp_sock_release - llc_ui_release - rawsock_release - rfcomm_sock_release - sco_sock_release - svc_release - vcc_release - x25_release Rather than fixing all these and relying on every socket type to get this right forever, just make __sock_release() set sock->sk to NULL itself after calling proto_ops::release(). Reproducer that produces the KASAN splat when any of these socket types are configured into the kernel: #include <pthread.h> #include <stdlib.h> #include <sys/socket.h> #include <unistd.h> pthread_t t; volatile int fd; void *close_thread(void *arg) { for (;;) { usleep(rand() % 100); close(fd); } } int main() { pthread_create(&t, NULL, close_thread, NULL); for (;;) { fd = socket(rand() % 50, rand() % 11, 0); fchownat(fd, "", 1000, 1000, 0x1000); close(fd); } } Fixes: 86741ec25462 ("net: core: Add a UID field to struct sock.") Signed-off-by: Eric Biggers <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: don't release block->lock when dumping chainsVlad Buslov1-9/+7
Function tc_dump_chain() obtains and releases block->lock on each iteration of its inner loop that dumps all chains on block. Outputting chain template info is fast operation so locking/unlocking mutex multiple times is an overhead when lock is highly contested. Modify tc_dump_chain() to only obtain block->lock once and dump all chains without releasing it. Signed-off-by: Vlad Buslov <[email protected]> Suggested-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: set dedicated tcf_walker flag when tp is emptyVlad Buslov1-4/+9
Using tcf_walker->stop flag to determine when tcf_walker->fn() was called at least once is unreliable. Some classifiers set 'stop' flag on error before calling walker callback, other classifiers used to call it with NULL filter pointer when empty. In order to prevent further regressions, extend tcf_walker structure with dedicated 'nonempty' flag. Set this flag in tcf_walker->fn() implementation that is used to check if classifier has filters configured. Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution") Signed-off-by: Vlad Buslov <[email protected]> Suggested-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: sched: act_tunnel_key: fix NULL pointer dereference during initVlad Buslov1-1/+2
Metadata pointer is only initialized for action TCA_TUNNEL_KEY_ACT_SET, but it is unconditionally dereferenced in tunnel_key_init() error handler. Verify that metadata pointer is not NULL before dereferencing it in tunnel_key_init error handling code. Fixes: ee28bb56ac5b ("net/sched: fix memory leak in act_tunnel_key_init()") Signed-off-by: Vlad Buslov <[email protected]> Reviewed-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25tcp: clean up SOCK_DEBUG()Yafang Shao2-20/+1
Per discussion with Daniel[1] and Eric[2], these SOCK_DEBUG() calles in TCP are not needed now. We'd better clean up it. [1] https://patchwork.ozlabs.org/patch/1035573/ [2] https://patchwork.ozlabs.org/patch/1040533/ Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25tcp: remove unused parameter of tcp_sacktag_bsearch()Taehee Yoo1-10/+6
parameter state in the tcp_sacktag_bsearch() is not used. So, it can be removed. Signed-off-by: Taehee Yoo <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-25net: dsa: fix a leaked reference by adding missing of_node_putWen Yang2-6/+11
The call to of_parse_phandle returns a node pointer with refcount incremented thus it must be explicitly decremented after the last usage. Detected by coccinelle with the following warnings: ./net/dsa/port.c:294:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 284, but without a corresponding object release within this function. ./net/dsa/dsa2.c:627:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function. ./net/dsa/dsa2.c:630:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function. ./net/dsa/dsa2.c:636:3-9: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function. ./net/dsa/dsa2.c:639:1-7: ERROR: missing of_node_put; acquired a node pointer with refcount incremented on line 618, but without a corresponding object release within this function. Signed-off-by: Wen Yang <[email protected]> Reviewed-by: Vivien Didelot <[email protected]> Cc: Andrew Lunn <[email protected]> Cc: Vivien Didelot <[email protected]> Cc: Florian Fainelli <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Vivien Didelot <[email protected]> Cc: [email protected] Cc: [email protected] Signed-off-by: David S. Miller <[email protected]>
2019-02-25Merge tag 'nfs-rdma-for-5.1-1' of ↵Trond Myklebust27-847/+858
git://git.linux-nfs.org/projects/anna/linux-nfs NFSoRDMA client updates for 5.1 New features: - Convert rpc auth layer to use xdr_streams - Config option to disable insecure enctypes - Reduce size of RPC receive buffers Bugfixes and cleanups: - Fix sparse warnings - Check inline size before providing a write chunk - Reduce the receive doorbell rate - Various tracepoint improvements [Trond: Fix up merge conflicts] Signed-off-by: Trond Myklebust <[email protected]>
2019-02-24switchdev: Complete removal of switchdev_port_attr_get()Florian Fainelli1-42/+0
We have no more in tree users of switchdev_port_attr_get() after d0e698d57a94 ("Merge branch 'net-Get-rid-of-switchdev_port_attr_get'") so completely remove the function signature and body. Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24dsa: Remove phydev parameter from disable_port callAndrew Lunn3-4/+4
No current DSA driver makes use of the phydev parameter passed to the disable_port call. Remove it. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24Merge branch 'for-upstream' of ↵David S. Miller12-71/+107
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== Here's the main bluetooth-next pull request for the 5.1 kernel. - Fixes & improvements to mediatek, hci_qca, btrtl, and btmrvl HCI drivers - Fixes to parsing invalid L2CAP config option sizes - Locking fix to bt_accept_enqueue() - Add support for new Marvel sd8977 chipset - Various other smaller fixes & cleanups ==================== Signed-off-by: David S. Miller <[email protected]>
2019-02-24net: fix double-free in bpf_lwt_xmit_reroutePeter Oskolkov1-1/+1
dst_output() frees skb when it fails (see, for example, ip_finish_output2), so it must not be freed in this case. Fixes: 3bd0b15281af ("bpf: add handling of BPF_LWT_REROUTE to lwt_bpf.c") Signed-off-by: Peter Oskolkov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ip_tunnel: Add ip tunnel tun_info type dst_cache in ip_tunnel_xmitwenxu1-11/+27
ip l add dev tun type gretap key 1000 Non-tunnel-dst ip tunnel device can send packet through lwtunnel This patch provide the tun_inf dst cache support for this mode. Signed-off-by: wenxu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ip_tunnel: Add dst_cache support in lwtunnel_state of ip tunnelwenxu2-8/+26
The lwtunnel_state is not init the dst_cache Which make the ip_md_tunnel_xmit can't use the dst_cache. It will lookup route table every packets. Signed-off-by: wenxu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24tls: Return type of non-data records retrieved using MSG_PEEK in recvmsgVakul Garg1-11/+67
The patch enables returning 'type' in msghdr for records that are retrieved with MSG_PEEK in recvmsg. Further it prevents records peeked from socket from getting clubbed with any other record of different type when records are subsequently dequeued from strparser. For each record, we now retain its type in sk_buff's control buffer cb[]. Inside control buffer, record's full length and offset are already stored by strparser in 'struct strp_msg'. We store record type after 'struct strp_msg' inside 'struct tls_msg'. For tls1.2, the type is stored just after record dequeue. For tls1.3, the type is stored after record has been decrypted. Inside process_rx_list(), before processing a non-data record, we check that we must be able to return back the record type to the user application. If not, the decrypted records in tls context's rx_list is left there without consuming any data. Fixes: 692d7b5d1f912 ("tls: Fix recvmsg() to be able to peek across multiple records") Signed-off-by: Vakul Garg <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ipv6: icmp: use percpu allocationKefeng Wang1-6/+5
Use percpu allocation for the ipv6.icmp_sk. Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ipv6: icmp: use icmpv6_sk_exit()Kefeng Wang1-14/+11
Simply use icmpv6_sk_exit() when inet_ctl_sock_create() fail in icmpv6_sk_init(). Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ipv4: icmp: use icmp_sk_exit()Kefeng Wang1-3/+1
Simply use icmp_sk_exit() when inet_ctl_sock_create() fail in icmp_sk_init(). Signed-off-by: Kefeng Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24ila: Fix uninitialised return value in ila_xlat_nl_cmd_flushHerbert Xu1-1/+1
This patch fixes an uninitialised return value error in ila_xlat_nl_cmd_flush. Reported-by: Dan Carpenter <[email protected]> Fixes: 6c4128f65857 ("rhashtable: Remove obsolete...") Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net/sched: act_tunnel_key: Add dst_cache supportwenxu1-4/+21
The metadata_dst is not init the dst_cache which make the ip_md_tunnel_xmit can't use the dst_cache. It will lookup route table every packets. Signed-off-by: wenxu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net: Remove switchdev.h inclusion from team/bond/vlanFlorian Fainelli1-1/+0
This is no longer necessary after eca59f691566 ("net: Remove support for bridge bypass ndos from stacked devices") Suggested-by: Ido Schimmel <[email protected]> Signed-off-by: Florian Fainelli <[email protected]> Reviewed-by: Andy Gospodarek <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net/sched: act_skbedit: fix refcount leak when replace failsDavide Caratti1-2/+1
when act_skbedit was converted to use RCU in the data plane, we added an error path, but we forgot to drop the action refcount in case of failure during a 'replace' operation: # tc actions add action skbedit ptype otherhost pass index 100 # tc action show action skbedit total acts 1 action order 0: skbedit ptype otherhost pass index 100 ref 1 bind 0 # tc actions replace action skbedit ptype otherhost drop index 100 RTNETLINK answers: Cannot allocate memory We have an error talking to the kernel # tc action show action skbedit total acts 1 action order 0: skbedit ptype otherhost pass index 100 ref 2 bind 0 Ensure we call tcf_idr_release(), in case 'params_new' allocation failed, also when the action is being replaced. Fixes: c749cdda9089 ("net/sched: act_skbedit: don't use spinlock in the data path") Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net/sched: act_ipt: fix refcount leak when replace failsDavide Caratti1-2/+1
After commit 4e8ddd7f1758 ("net: sched: don't release reference on action overwrite"), the error path of all actions was converted to drop refcount also when the action was being overwritten. But we forgot act_ipt_init(), in case allocation of 'tname' was not successful: # tc action add action xt -j LOG --log-prefix hello index 100 tablename: mangle hook: NF_IP_POST_ROUTING target: LOG level warning prefix "hello" index 100 # tc action show action xt total acts 1 action order 0: tablename: mangle hook: NF_IP_POST_ROUTING target LOG level warning prefix "hello" index 100 ref 1 bind 0 # tc action replace action xt -j LOG --log-prefix world index 100 tablename: mangle hook: NF_IP_POST_ROUTING target: LOG level warning prefix "world" index 100 RTNETLINK answers: Cannot allocate memory We have an error talking to the kernel # tc action show action xt total acts 1 action order 0: tablename: mangle hook: NF_IP_POST_ROUTING target LOG level warning prefix "hello" index 100 ref 2 bind 0 Ensure we call tcf_idr_release(), in case 'tname' allocation failed, also when the action is being replaced. Fixes: 4e8ddd7f1758 ("net: sched: don't release reference on action overwrite") Signed-off-by: Davide Caratti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net: dev: add generic protodown handlerAndy Roulin1-0/+19
Introduce dev_change_proto_down_generic, a generic ndo_change_proto_down implementation, which sets the netdev carrier state according to proto_down. This adds the ability to set protodown on vxlan and macvlan devices in a generic way for use by control protocols like VRRPD. Signed-off-by: Andy Roulin <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net: Skip GSO length estimation if transport header is not setMaxim Mikityanskiy1-1/+1
qdisc_pkt_len_init expects transport_header to be set for GSO packets. Patch [1] skips transport_header validation for GSO packets that don't have network_header set at the moment of calling virtio_net_hdr_to_skb, and allows them to pass into the stack. After patch [2] no placeholder value is assigned to transport_header if dissection fails, so this patch adds a check to the place where the value of transport_header is used. [1] https://patchwork.ozlabs.org/patch/1044429/ [2] https://patchwork.ozlabs.org/patch/1046122/ Signed-off-by: Maxim Mikityanskiy <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24net: Use RCU_INIT_POINTER() to set sk_wqLi RongQing1-3/+3
This pointer is RCU protected, so proper primitives should be used. Signed-off-by: Zhang Yu <[email protected]> Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller30-192/+255
Three conflicts, one of which, for marvell10g.c is non-trivial and requires some follow-up from Heiner or someone else. The issue is that Heiner converted the marvell10g driver over to use the generic c45 code as much as possible. However, in 'net' a bug fix appeared which makes sure that a new local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0 is cleared. Signed-off-by: David S. Miller <[email protected]>
2019-02-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds29-186/+246
Pull networking fixes from David Miller: "Hopefully the last pull request for this release. Fingers crossed: 1) Only refcount ESP stats on full sockets, from Martin Willi. 2) Missing barriers in AF_UNIX, from Al Viro. 3) RCU protection fixes in ipv6 route code, from Paolo Abeni. 4) Avoid false positives in untrusted GSO validation, from Willem de Bruijn. 5) Forwarded mesh packets in mac80211 need more tailroom allocated, from Felix Fietkau. 6) Use operstate consistently for linkup in team driver, from George Wilkie. 7) ThunderX bug fixes from Vadim Lomovtsev. Mostly races between VF and PF code paths. 8) Purge ipv6 exceptions during netdevice removal, from Paolo Abeni. 9) nfp eBPF code gen fixes from Jiong Wang. 10) bnxt_en firmware timeout fix from Michael Chan. 11) Use after free in udp/udpv6 error handlers, from Paolo Abeni. 12) Fix a race in x25_bind triggerable by syzbot, from Eric Dumazet" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) net: phy: realtek: Dummy IRQ calls for RTL8366RB tcp: repaired skbs must init their tso_segs net/x25: fix a race in x25_bind() net: dsa: Remove documentation for port_fdb_prepare Revert "bridge: do not add port to router list when receives query with source 0.0.0.0" selftests: fib_tests: sleep after changing carrier. again. net: set static variable an initial value in atl2_probe() net: phy: marvell10g: Fix Multi-G advertisement to only advertise 10G bpf, doc: add bpf list as secondary entry to maintainers file udp: fix possible user after free in error handler udpv6: fix possible user after free in error handler fou6: fix proto error handler argument type udpv6: add the required annotation to mib type mdio_bus: Fix use-after-free on device_register fails net: Set rtm_table to RT_TABLE_COMPAT for ipv6 for tables > 255 bnxt_en: Wait longer for the firmware message response to complete. bnxt_en: Fix typo in firmware message timeout logic. nfp: bpf: fix ALU32 high bits clearance bug nfp: bpf: fix code-gen bug on BPF_ALU | BPF_XOR | BPF_K Documentation: networking: switchdev: Update port parent ID section ...
2019-02-23tcp: repaired skbs must init their tso_segsEric Dumazet1-0/+1
syzbot reported a WARN_ON(!tcp_skb_pcount(skb)) in tcp_send_loss_probe() [1] This was caused by TCP_REPAIR sent skbs that inadvertenly were missing a call to tcp_init_tso_segs() [1] WARNING: CPU: 1 PID: 0 at net/ipv4/tcp_output.c:2534 tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.0.0-rc7+ #77 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 panic+0x2cb/0x65c kernel/panic.c:214 __warn.cold+0x20/0x45 kernel/panic.c:571 report_bug+0x263/0x2b0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] fixup_bug arch/x86/kernel/traps.c:173 [inline] do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:271 do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:290 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:973 RIP: 0010:tcp_send_loss_probe+0x771/0x8a0 net/ipv4/tcp_output.c:2534 Code: 88 fc ff ff 4c 89 ef e8 ed 75 c8 fb e9 c8 fc ff ff e8 43 76 c8 fb e9 63 fd ff ff e8 d9 75 c8 fb e9 94 f9 ff ff e8 bf 03 91 fb <0f> 0b e9 7d fa ff ff e8 b3 03 91 fb 0f b6 1d 37 43 7a 03 31 ff 89 RSP: 0018:ffff8880ae907c60 EFLAGS: 00010206 RAX: ffff8880a989c340 RBX: 0000000000000000 RCX: ffffffff85dedbdb RDX: 0000000000000100 RSI: ffffffff85dee0b1 RDI: 0000000000000005 RBP: ffff8880ae907c90 R08: ffff8880a989c340 R09: ffffed10147d1ae1 R10: ffffed10147d1ae0 R11: ffff8880a3e8d703 R12: ffff888091b90040 R13: ffff8880a3e8d540 R14: 0000000000008000 R15: ffff888091b90860 tcp_write_timer_handler+0x5c0/0x8a0 net/ipv4/tcp_timer.c:583 tcp_write_timer+0x10e/0x1d0 net/ipv4/tcp_timer.c:607 call_timer_fn+0x190/0x720 kernel/time/timer.c:1325 expire_timers kernel/time/timer.c:1362 [inline] __run_timers kernel/time/timer.c:1681 [inline] __run_timers kernel/time/timer.c:1649 [inline] run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694 __do_softirq+0x266/0x95a kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0x180/0x1d0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] smp_apic_timer_interrupt+0x14a/0x570 arch/x86/kernel/apic/apic.c:1062 apic_timer_interrupt+0xf/0x20 arch/x86/entry/entry_64.S:807 </IRQ> RIP: 0010:native_safe_halt+0x2/0x10 arch/x86/include/asm/irqflags.h:58 Code: ff ff ff 48 89 c7 48 89 45 d8 e8 59 0c a1 fa 48 8b 45 d8 e9 ce fe ff ff 48 89 df e8 48 0c a1 fa eb 82 90 90 90 90 90 90 fb f4 <c3> 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f4 c3 90 90 90 90 90 90 RSP: 0018:ffff8880a98afd78 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffffffff1125061 RBX: ffff8880a989c340 RCX: 0000000000000000 RDX: dffffc0000000000 RSI: 0000000000000001 RDI: ffff8880a989cbbc RBP: ffff8880a98afda8 R08: ffff8880a989c340 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001 R13: ffffffff889282f8 R14: 0000000000000001 R15: 0000000000000000 arch_cpu_idle+0x10/0x20 arch/x86/kernel/process.c:555 default_idle_call+0x36/0x90 kernel/sched/idle.c:93 cpuidle_idle_call kernel/sched/idle.c:153 [inline] do_idle+0x386/0x570 kernel/sched/idle.c:262 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:353 start_secondary+0x404/0x5c0 arch/x86/kernel/smpboot.c:271 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:243 Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: 79861919b889 ("tcp: fix TCP_REPAIR xmit queue setup") Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Cc: Andrey Vagin <[email protected]> Cc: Soheil Hassas Yeganeh <[email protected]> Cc: Neal Cardwell <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-23net/x25: fix a race in x25_bind()Eric Dumazet1-5/+8
syzbot was able to trigger another soft lockup [1] I first thought it was the O(N^2) issue I mentioned in my prior fix (f657d22ee1f "net/x25: do not hold the cpu too long in x25_new_lci()"), but I eventually found that x25_bind() was not checking SOCK_ZAPPED state under socket lock protection. This means that multiple threads can end up calling x25_insert_socket() for the same socket, and corrupt x25_list [1] watchdog: BUG: soft lockup - CPU#0 stuck for 123s! [syz-executor.2:10492] Modules linked in: irq event stamp: 27515 hardirqs last enabled at (27514): [<ffffffff81006673>] trace_hardirqs_on_thunk+0x1a/0x1c hardirqs last disabled at (27515): [<ffffffff8100668f>] trace_hardirqs_off_thunk+0x1a/0x1c softirqs last enabled at (32): [<ffffffff8632ee73>] x25_get_neigh+0xa3/0xd0 net/x25/x25_link.c:336 softirqs last disabled at (34): [<ffffffff86324bc3>] x25_find_socket+0x23/0x140 net/x25/af_x25.c:341 CPU: 0 PID: 10492 Comm: syz-executor.2 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__sanitizer_cov_trace_pc+0x4/0x50 kernel/kcov.c:97 Code: f4 ff ff ff e8 11 9f ea ff 48 c7 05 12 fb e5 08 00 00 00 00 e9 c8 e9 ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 48 89 e5 <48> 8b 75 08 65 48 8b 04 25 40 ee 01 00 65 8b 15 38 0c 92 7e 81 e2 RSP: 0018:ffff88806e94fc48 EFLAGS: 00000286 ORIG_RAX: ffffffffffffff13 RAX: 1ffff1100d84dac5 RBX: 0000000000000001 RCX: ffffc90006197000 RDX: 0000000000040000 RSI: ffffffff86324bf3 RDI: ffff88806c26d628 RBP: ffff88806e94fc48 R08: ffff88806c1c6500 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: ffff88806c26d628 R13: ffff888090455200 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f3a107e4700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f3a107e3db8 CR3: 00000000a5544000 CR4: 00000000001406f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __x25_find_socket net/x25/af_x25.c:327 [inline] x25_find_socket+0x7d/0x140 net/x25/af_x25.c:342 x25_new_lci net/x25/af_x25.c:355 [inline] x25_connect+0x380/0xde0 net/x25/af_x25.c:784 __sys_connect+0x266/0x330 net/socket.c:1662 __do_sys_connect net/socket.c:1673 [inline] __se_sys_connect net/socket.c:1670 [inline] __x64_sys_connect+0x73/0xb0 net/socket.c:1670 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f3a107e3c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000457e29 RDX: 0000000000000012 RSI: 0000000020000200 RDI: 0000000000000005 RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00007f3a107e46d4 R13: 00000000004be362 R14: 00000000004ceb98 R15: 00000000ffffffff Sending NMI from CPU 0 to CPUs 1: NMI backtrace for cpu 1 CPU: 1 PID: 10493 Comm: syz-executor.3 Not tainted 5.0.0-rc7+ #88 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:__read_once_size include/linux/compiler.h:193 [inline] RIP: 0010:queued_write_lock_slowpath+0x143/0x290 kernel/locking/qrwlock.c:86 Code: 4c 8d 2c 01 41 83 c7 03 41 0f b6 45 00 41 38 c7 7c 08 84 c0 0f 85 0c 01 00 00 8b 03 3d 00 01 00 00 74 1a f3 90 41 0f b6 55 00 <41> 38 d7 7c eb 84 d2 74 e7 48 89 df e8 cc aa 4e 00 eb dd be 04 00 RSP: 0018:ffff888085c47bd8 EFLAGS: 00000206 RAX: 0000000000000300 RBX: ffffffff89412b00 RCX: 1ffffffff1282560 RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff89412b00 RBP: ffff888085c47c70 R08: 1ffffffff1282560 R09: fffffbfff1282561 R10: fffffbfff1282560 R11: ffffffff89412b03 R12: 00000000000000ff R13: fffffbfff1282560 R14: 1ffff11010b88f7d R15: 0000000000000003 FS: 00007fdd04086700(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdd04064db8 CR3: 0000000090be0000 CR4: 00000000001406e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: queued_write_lock include/asm-generic/qrwlock.h:104 [inline] do_raw_write_lock+0x1d6/0x290 kernel/locking/spinlock_debug.c:203 __raw_write_lock_bh include/linux/rwlock_api_smp.h:204 [inline] _raw_write_lock_bh+0x3b/0x50 kernel/locking/spinlock.c:312 x25_insert_socket+0x21/0xe0 net/x25/af_x25.c:267 x25_bind+0x273/0x340 net/x25/af_x25.c:703 __sys_bind+0x23f/0x290 net/socket.c:1481 __do_sys_bind net/socket.c:1492 [inline] __se_sys_bind net/socket.c:1490 [inline] __x64_sys_bind+0x73/0xb0 net/socket.c:1490 do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x457e29 Fixes: 90c27297a9bf ("X.25 remove bkl in bind") Signed-off-by: Eric Dumazet <[email protected]> Cc: andrew hendry <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-23Revert "bridge: do not add port to router list when receives query with ↵Hangbin Liu1-8/+1
source 0.0.0.0" This reverts commit 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0") and commit 0fe5119e267f ("net: bridge: remove ipv6 zero address check in mcast queries") The reason is RFC 4541 is not a standard but suggestive. Currently we will elect 0.0.0.0 as Querier if there is no ip address configured on bridge. If we do not add the port which recives query with source 0.0.0.0 to router list, the IGMP reports will not be about to forward to Querier, IGMP data will also not be able to forward to dest. As Nikolay suggested, revert this change first and add a boolopt api to disable none-zero election in future if needed. Reported-by: Linus Lüssing <[email protected]> Reported-by: Sebastian Gottschall <[email protected]> Fixes: 5a2de63fd1a5 ("bridge: do not add port to router list when receives query with source 0.0.0.0") Fixes: 0fe5119e267f ("net: bridge: remove ipv6 zero address check in mcast queries") Signed-off-by: Hangbin Liu <[email protected]> Acked-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-23kcm: Remove unnecessary SLAB_PANIC for kmem_cache_create() in kcm_initYueHaibing1-2/+2
There has check NULL on kmem_cache_create on failure in kcm_init, no need use SLAB_PANIC to panic the system. Signed-off-by: YueHaibing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-23bpfilter: re-add header search paths to tools include to fix build errorMasahiro Yamada1-0/+1
I thought header search paths to tools/include(/uapi) were unneeded, but it looks like a build error occurs depending on the compiler. Commit 303a339f30a9 ("bpfilter: remove extra header search paths for bpfilter_umh") reintroduced the build error fixed by commit ae40832e53c3 ("bpfilter: fix a build err"). Apology for the breakage, and thanks to Guenter for reporting this. Fixes: 303a339f30a9 ("bpfilter: remove extra header search paths for bpfilter_umh") Reported-by: Guenter Roeck <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Tested-by: Guenter Roeck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller2-22/+39
Daniel Borkmann says: ==================== pull-request: bpf 2019-02-23 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix a bug in BPF's LPM deletion logic to match correct prefix length, from Alban. 2) Fix AF_XDP teardown by not destroying umem prematurely as it is still needed till all outstanding skbs are freed, from Björn. 3) Fix unkillable BPF_PROG_TEST_RUN under preempt kernel by checking signal_pending() outside need_resched() condition which is never triggered there, from Stanislav. 4) Fix two nfp JIT bugs, one in code emission for K-based xor, and another one to explicitly clear upper bits in alu32, from Jiong. 5) Add bpf list address to maintainers file, from Daniel. ==================== Signed-off-by: David S. Miller <[email protected]>
2019-02-22udp: fix possible user after free in error handlerPaolo Abeni1-2/+4
Similar to the previous commit, this addresses the same issue for ipv4: use a single fetch operation and use the correct rcu annotation. Fixes: e7cc082455cb ("udp: Support for error handlers of tunnels with arbitrary destination port") Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-22udpv6: fix possible user after free in error handlerPaolo Abeni1-4/+6
Before derefencing the encap pointer, commit e7cc082455cb ("udp: Support for error handlers of tunnels with arbitrary destination port") checks for a NULL value, but the two fetch operation can race with removal. Fix the above using a single access. Also fix a couple of type annotations, to make sparse happy. Fixes: e7cc082455cb ("udp: Support for error handlers of tunnels with arbitrary destination port") Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-02-22fou6: fix proto error handler argument typePaolo Abeni1-1/+1
Last argument of gue6_err_proto_handler() has a wrong type annotation, fix it and make sparse happy again. Fixes: b8a51b38e4d4 ("fou, fou6: ICMP error handlers for FoU and GUE") Signed-off-by: Paolo Abeni <[email protected]> Acked-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>