path: root/net
Age | Commit message | Author | Files | Lines
2019-09-06 | net: sched: fix reordering issues | Eric Dumazet | 1 | -2/+7
Whenever MQ is not used on a multiqueue device, we experience serious reordering problems. Bisection found the cited commit.

The issue can be described this way:

- A single qdisc hierarchy is shared by all transmit queues. (eg : tc qdisc replace dev eth0 root fq_codel)
- When/if try_bulk_dequeue_skb_slow() dequeues a packet targeting a different transmit queue than the one used to build a packet train, we stop building the current list and save the 'bad' skb (P1) in a special queue. (bad_txq)
- When dequeue_skb() calls qdisc_dequeue_skb_bad_txq() and finds this skb (P1), it checks if the associated transmit queue is still in frozen state. If the queue is still blocked (by BQL or NIC tx ring full), we leave the skb in bad_txq and return NULL.
- dequeue_skb() calls q->dequeue() to get another packet (P2). The other packet can target the problematic queue (that we found in frozen state for the bad_txq packet), but another cpu just ran TX completion and made room in the txq that is now ready to accept new packets.
- Packet P2 is sent while P1 is still held in bad_txq, P1 might be sent at next round. In practice P2 is the lead of a big packet train (P2,P3,P4 ...) filling the BQL budget and delaying P1 by many packets :/

To solve this problem, we have to block the dequeue process as long as the first packet in bad_txq can not be sent. Reordering issues disappear and no side effects have been seen.

Fixes: a53851e2c321 ("net: sched: explicit locking in gso_cpu fallback")
Signed-off-by: Eric Dumazet <[email protected]>
Cc: John Fastabend <[email protected]>
Acked-by: John Fastabend <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
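The rule described in this commit message (do not fall through to the main queue while the parked packet's transmit queue is still frozen) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct pkt`, `struct sched` and `model_dequeue()` are hypothetical names invented for illustration, the model parks at most one packet, and none of this is the kernel's actual qdisc code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model; these are not the kernel's structures. */
struct pkt {
    int txq;   /* transmit queue this packet targets */
    int id;    /* sequence number, used to observe ordering */
};

struct sched {
    const struct pkt *bad_txq;   /* at most one parked packet in this model */
    const struct pkt *queue;     /* pending packets, in arrival order */
    size_t head, len;
    const bool *txq_frozen;      /* txq_frozen[i] is true while queue i is blocked */
};

/* Return the next packet to transmit, or NULL if nothing may be sent now. */
static const struct pkt *model_dequeue(struct sched *s)
{
    if (s->bad_txq) {
        const struct pkt *p = s->bad_txq;

        /* Parked packet still blocked: stop here instead of dequeuing a
         * later packet that could overtake it. */
        if (s->txq_frozen[p->txq])
            return NULL;

        s->bad_txq = NULL;
        return p;
    }
    if (s->head < s->len)
        return &s->queue[s->head++];
    return NULL;
}
```

With P1 parked and its queue frozen, `model_dequeue()` returns NULL and leaves P2 and P3 untouched; once the queue thaws, the packets come out as P1, P2, P3.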
2019-09-06 | Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec | David S. Miller | 2 | -33/+29
Steffen Klassert says:

====================
pull request (net): ipsec 2019-09-05

1) Several xfrm interface fixes from Nicolas Dichtel:
   - Avoid an interface ID corruption on changelink.
   - Fix wrong interface names in the logs.
   - Fix a list corruption when changing network namespaces.
   - Fix unregistration of the underlying phydev.

2) Fix a potential warning when merging xfrm_policy nodes. From Florian Westphal.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <[email protected]>
2019-09-06 | net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate | David Dai | 1 | -4/+23
A high speed adapter such as the Mellanox CX-5 card can reach up to 100 Gbits per second of bandwidth. Currently htb already supports 64bit rate in the tc utility. However, the police action rate and peakrate are still limited to a 32bit value (up to 32 Gbits per second). Add 2 new attributes, TCA_POLICE_RATE64 and TCA_POLICE_PEAKRATE64, in the kernel for 64bit support so that the tc utility can use them for 64bit rate and peakrate values to break the 32bit limit, and still keep the backward binary compatibility. Tested-by: David Dai <[email protected]> Signed-off-by: David Dai <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
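The 32bit limit mentioned above is easy to check by hand. The sketch below assumes the rate is carried in bytes per second (an assumption here, not something the commit message states) and uses the hypothetical helper names `bits_to_bytes_per_sec()` and `rate_fits_u32()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Convert a link speed in bits per second to bytes per second. */
static uint64_t bits_to_bytes_per_sec(uint64_t bits_per_sec)
{
    return bits_per_sec / 8;
}

/* True if the rate can be stored in a 32-bit field without truncation. */
static bool rate_fits_u32(uint64_t bytes_per_sec)
{
    return bytes_per_sec <= UINT32_MAX;
}
```

Under that assumption, 10 Gbit/s is 1,250,000,000 bytes/s and fits in 32 bits, while 100 Gbit/s is 12,500,000,000 bytes/s and does not, which is why a separate 64-bit attribute is needed for the faster link.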
2019-09-06 | net: openvswitch: Set OvS recirc_id from tc chain index | Paul Blakey | 6 | -5/+79
Offloaded OvS datapath rules are translated one to one to tc rules, for example the following simplified OvS rule:

    recirc_id(0),in_port(dev1),eth_type(0x0800),ct_state(-trk) actions:ct(),recirc(2)

Will be translated to the following tc rule:

    $ tc filter add dev dev1 ingress \
          prio 1 chain 0 proto ip \
          flower tcp ct_state -trk \
          action ct pipe \
          action goto chain 2

Received packets will first travel through tc, and if they aren't stolen by it, like in the above rule, they will continue to OvS datapath. Since we already did some actions (action ct in this case) which might modify the packets, and updated action stats, we would like to continue the processing with the correct recirc_id in OvS (here recirc_id(2)) where we left off.

To support this, introduce a new skb extension for tc, which will be used for translating tc chain to ovs recirc_id to handle these miss cases. Last tc chain index will be set by tc goto chain action and read by OvS datapath.

Signed-off-by: Paul Blakey <[email protected]>
Signed-off-by: Vlad Buslov <[email protected]>
Acked-by: Jiri Pirko <[email protected]>
Acked-by: Pravin B Shelar <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | new helper: get_tree_keyed() | Al Viro | 1 | -2/+1
For vfs_get_keyed_super users. Signed-off-by: Al Viro <[email protected]>
2019-09-05 | Bluetooth: mgmt: Use struct_size() helper | Gustavo A. R. Silva | 1 | -6/+2
One of the more common cases of allocation size calculations is finding the size of a structure that has a zero-sized array at the end, along with memory for some number of elements for that array. For example: struct mgmt_rp_get_connections { ... struct mgmt_addr_info addr[0]; } __packed; Make use of the struct_size() helper instead of an open-coded version in order to avoid any potential type mistakes. So, replace the following form: sizeof(*rp) + (i * sizeof(struct mgmt_addr_info)); with: struct_size(rp, addr, i) Also, notice that, in this case, variable rp_len is not necessary, hence it is removed. This code was detected with the help of Coccinelle. Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
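To illustrate what a helper of this kind buys over the open-coded `sizeof(*rp) + i * sizeof(...)` form, here is a minimal userspace sketch. `struct addr_info`, `struct reply` and `reply_size()` are hypothetical stand-ins, not the Bluetooth management structures, and the function only imitates the idea of an overflow-checked size computation that saturates at SIZE_MAX.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element type, standing in for an address record. */
struct addr_info {
    uint8_t bdaddr[6];
    uint8_t type;
};

/* Header followed by a variable number of trailing elements. */
struct reply {
    uint16_t count;
    struct addr_info addr[];   /* flexible array member */
};

/* Size of a reply holding n elements; SIZE_MAX if the sum would overflow. */
static size_t reply_size(size_t n)
{
    size_t elem = sizeof(struct addr_info);

    if (n > (SIZE_MAX - sizeof(struct reply)) / elem)
        return SIZE_MAX;
    return sizeof(struct reply) + n * elem;
}
```

A saturated result makes a subsequent allocation fail cleanly instead of silently returning a buffer that is too small, which is the class of mistake the open-coded arithmetic can hide.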
2019-09-05 | Bluetooth: 6lowpan: Make variable header_ops constant | Nishka Dasgupta | 1 | -1/+1
Static variable header_ops, of type header_ops, is used only once, when it is assigned to field header_ops of a variable having type net_device. This corresponding field is declared as const in the definition of net_device. Hence make header_ops constant as well to protect it from unnecessary modification. Issue found with Coccinelle. Signed-off-by: Nishka Dasgupta <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2019-09-05 | Bluetooth: Add support for utilizing Fast Advertising Interval | Spoorthi Ravishankar Koppad | 1 | -7/+22
Changes made to add support for fast advertising interval as per core 4.1 specification, section 9.3.11.2. A peripheral device entering any of the following GAP modes and sending either non-connectable advertising events or scannable undirected advertising events should use adv_fast_interval2 (100ms - 150ms) for adv_fast_period(30s). - Non-Discoverable Mode - Non-Connectable Mode - Limited Discoverable Mode - General Discoverable Mode Signed-off-by: Spoorthi Ravishankar Koppad <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2019-09-05 | xsk: lock the control mutex in sock_diag interface | Björn Töpel | 1 | -0/+3
When accessing the members of an XDP socket, the control mutex should be held. This commit fixes that. Acked-by: Jonathan Lemon <[email protected]> Fixes: a36b38aa2af6 ("xsk: add sock_diag interface for AF_XDP") Signed-off-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-09-05 | xsk: use state member for socket synchronization | Björn Töpel | 1 | -15/+39
Before the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above do not need READ_ONCE if used in conjunction with state check.

In all syscalls and the xsk_rcv path we check if state is XSK_BOUND. If that is the case we do a SMP read barrier, and this implies that the dev, umem and all rings are correctly setup. Note that no READ_ONCE is needed for these variables if used when state is XSK_BOUND (plus the read barrier).

To summarize: The struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE is only used when read outside the control mutex (e.g. mmap) or when not synchronized with the state member (XSK_BOUND plus smp_rmb()).

Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due to the introduced state synchronization (XSK_BOUND plus smp_rmb()).

Introducing the state check also fixes a race, found by syzkaller, in xsk_poll() where umem could be accessed when stale.

Suggested-by: Hillf Danton <[email protected]>
Reported-by: [email protected]
Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings")
Signed-off-by: Björn Töpel <[email protected]>
Acked-by: Jonathan Lemon <[email protected]>
Signed-off-by: Daniel Borkmann <[email protected]>
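The publish/observe pattern described here (write the plain members, then publish a state value; readers check the state first and only then touch the members) has a close userspace analogue in C11 atomics. The sketch below swaps the kernel's WRITE_ONCE/READ_ONCE plus SMP barriers for release/acquire operations, which is a different but related mechanism, and every name in it (`struct sock_model`, `SK_READY`, `SK_BOUND`, `model_bind()`, `model_dev_if_bound()`) is hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

enum { SK_READY = 0, SK_BOUND = 1 };   /* hypothetical state values */

struct sock_model {
    const char *dev;     /* plain members, published via the state store */
    int queue_id;
    atomic_int state;
};

/* Writer: fill in the members first, then publish the new state. */
static void model_bind(struct sock_model *s, const char *dev, int queue_id)
{
    s->dev = dev;
    s->queue_id = queue_id;
    /* Release: the member stores above become visible before the state. */
    atomic_store_explicit(&s->state, SK_BOUND, memory_order_release);
}

/* Reader: the members may only be used after observing the bound state. */
static const char *model_dev_if_bound(struct sock_model *s)
{
    /* Acquire: pairs with the release store in model_bind(). */
    if (atomic_load_explicit(&s->state, memory_order_acquire) != SK_BOUND)
        return NULL;
    return s->dev;
}
```

Because the state check orders the later member reads, the members themselves need no special access annotations on this path, which mirrors the reasoning in the commit message.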
2019-09-05 | xsk: avoid store-tearing when assigning umem | Björn Töpel | 1 | -2/+2
The umem member of struct xdp_sock is read outside of the control mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid potential store-tearing. Acked-by: Jonathan Lemon <[email protected]> Fixes: 423f38329d26 ("xsk: add umem fill queue support and mmap") Signed-off-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-09-05 | xsk: avoid store-tearing when assigning queues | Björn Töpel | 1 | -1/+1
Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid potential store-tearing. These members are read outside of the control mutex in the mmap implementation. Acked-by: Jonathan Lemon <[email protected]> Fixes: 37b076933a8e ("xsk: add missing write- and data-dependency barrier") Signed-off-by: Björn Töpel <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2019-09-05 | netfilter: nf_tables: fix possible null-pointer dereference in object update | Fernando Fernandez Mancera | 1 | -0/+3
Not all objects have an update operation. If the object type doesn't implement an update operation and the user tries to update it, the request will hit EOPNOTSUPP. Fixes: d62d0ba97b58 ("netfilter: nf_tables: Introduce stateful object update operation") Signed-off-by: Fernando Fernandez Mancera <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2019-09-05 | net: Properly update v4 routes with v6 nexthop | Donald Sharp | 2 | -12/+14
When creating a v4 route that uses a v6 nexthop from a nexthop group, allow the kernel to properly send the nexthop as v6 via the RTA_VIA attribute.

Broken behavior:

    $ ip nexthop add via fe80::9 dev eth0
    $ ip nexthop show
    id 1 via fe80::9 dev eth0 scope link
    $ ip route add 4.5.6.7/32 nhid 1
    $ ip route show
    default via 10.0.2.2 dev eth0
    4.5.6.7 nhid 1 via 254.128.0.0 dev eth0
    10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
    $

Fixed behavior:

    $ ip nexthop add via fe80::9 dev eth0
    $ ip nexthop show
    id 1 via fe80::9 dev eth0 scope link
    $ ip route add 4.5.6.7/32 nhid 1
    $ ip route show
    default via 10.0.2.2 dev eth0
    4.5.6.7 nhid 1 via inet6 fe80::9 dev eth0
    10.0.2.0/24 dev eth0 proto kernel scope link src 10.0.2.15
    $

v2, v3: Addresses code review comments from David Ahern

Fixes: dcb1ecb50edf ("ipv4: Prepare for fib6_nh from a nexthop object")
Signed-off-by: Donald Sharp <[email protected]>
Reviewed-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
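The bogus gateway 254.128.0.0 in the broken output is what one gets when the leading four bytes of fe80::9 are printed as an IPv4 address (0xfe = 254, 0x80 = 128). The following sketch reproduces only that arithmetic; `first4_as_ipv4()` is a hypothetical helper written for this illustration and says nothing about how the kernel or iproute2 actually arrived at the value.

```c
#include <stdint.h>
#include <stdio.h>

/* Format the first four bytes of a 16-byte IPv6 address as dotted-quad IPv4.
 * out must have room for at least 16 characters ("255.255.255.255" + NUL). */
static void first4_as_ipv4(const uint8_t v6[16], char out[16])
{
    snprintf(out, 16, "%u.%u.%u.%u",
             (unsigned)v6[0], (unsigned)v6[1],
             (unsigned)v6[2], (unsigned)v6[3]);
}
```

Feeding it the bytes of fe80::9 (fe 80 00 00 ... 00 09) yields the string "254.128.0.0", matching the address shown in the broken `ip route show` output.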
2019-09-05 | pppoatm: use %*ph to print small buffer | Andy Shevchenko | 1 | -3/+1
Use %*ph format to print small buffer as hex string. Signed-off-by: Andy Shevchenko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | Merge tag 'linux-can-next-for-5.4-20190904' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next | David S. Miller | 15 | -282/+4730
Marc Kleine-Budde says:

====================
pull-request: can-next 2019-09-04 j1939

this is a pull request for net-next/master consisting of 21 patches.

the first 12 patches are by me and target the CAN core infrastructure. They clean up the names of variables, structs and struct members, convert can_rx_register() to use max() instead of open coding it and remove unneeded code from the can_pernet_exit() callback.

The next three patches are also by me and they introduce and make use of the CAN midlayer private structure. It is used to hold protocol specific per device data structures.

The next patch is by Oleksij Rempel and switches the &net->can.rcvlists_lock from a spin_lock() to a spin_lock_bh(), so that it can be used from NAPI (soft IRQ) context.

The next 4 patches are by Kurt Van Dijck, he first updates his email address via mailmap and then extends sockaddr_can to include j1939 members.

The final patch is the collective effort of many entities (The j1939 authors: Oliver Hartkopp, Bastian Stender, Elenita Hinds, kbuild test robot, Kurt Van Dijck, Maxime Jayat, Robin van der Gracht, Oleksij Rempel, Marc Kleine-Budde). It adds support of SAE J1939 protocol to the CAN networking stack.

SAE J1939 is the vehicle bus recommended practice used for communication and diagnostics among vehicle components. Originating in the car and heavy-duty truck industry in the United States, it is now widely used in other parts of the world.

P.S.: This pull request doesn't invalidate my last pull request: "pull-request: can-next 2019-09-03".
====================

Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net: mpoa: Use kzfree rather than its implementation. | zhong jiang | 1 | -4/+2
Use kzfree instead of memset() + kfree(). Signed-off-by: zhong jiang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
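The memset() + kfree() pair being replaced is the usual "scrub, then free" idiom for buffers that held sensitive data. A userspace analogue might look like the sketch below. It is an illustration only: `secure_zero()` and `zero_and_free()` are hypothetical names, the caller must pass the length because free() does not know it, and the volatile write loop is one common way to keep a compiler from discarding the clearing stores.

```c
#include <stddef.h>
#include <stdlib.h>

/* Overwrite len bytes with zeros using volatile stores, so the writes are
 * not optimized away even though the buffer is about to be released. */
static void secure_zero(void *p, size_t len)
{
    volatile unsigned char *b = p;

    while (len--)
        *b++ = 0;
}

/* Scrub a heap buffer and release it; a NULL pointer is ignored. */
static void zero_and_free(void *p, size_t len)
{
    if (!p)
        return;
    secure_zero(p, len);
    free(p);
}
```

Folding both steps into one helper removes the chance of a call site freeing the buffer while forgetting to clear it first.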
2019-09-05 | sunrpc: Use kzfree rather than its implementation. | zhong jiang | 1 | -6/+3
Use kzfree instead of memset() + kfree(). Signed-off-by: zhong jiang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | ipv6: Fix RTA_MULTIPATH with nexthop objects | David Ahern | 1 | -1/+1
A change to the core nla helpers was missed during the push of the nexthop changes. rt6_fill_node_nexthop should be calling nla_nest_start_noflag not nla_nest_start. Currently, iproute2 does not print multipath data because of parsing issues with the attribute. Fixes: f88d8ea67fbd ("ipv6: Plumb support for nexthop object in a fib6_info") Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net: sock_map, fix missing ulp check in sock hash case | John Fastabend | 1 | -0/+3
sock_map and ULP only work together when ULP is loaded after the sock map is loaded. In the sock_map case we added a check for this to fail the load if ULP is already set. However, we missed the check on the sock_hash side. Add a ULP check to the sock_hash update path. Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Reported-by: [email protected] Signed-off-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | tipc: add NULL pointer check before calling kfree_rcu | Xin Long | 1 | -1/+2
Unlike kfree(p), kfree_rcu(p, rcu) won't do NULL pointer check. When tipc_nametbl_remove_publ returns NULL, the panic below happens: BUG: unable to handle kernel NULL pointer dereference at 0000000000000068 RIP: 0010:__call_rcu+0x1d/0x290 Call Trace: <IRQ> tipc_publ_notify+0xa9/0x170 [tipc] tipc_node_write_unlock+0x8d/0x100 [tipc] tipc_node_link_down+0xae/0x1d0 [tipc] tipc_node_check_dest+0x3ea/0x8f0 [tipc] ? tipc_disc_rcv+0x2c7/0x430 [tipc] tipc_disc_rcv+0x2c7/0x430 [tipc] ? tipc_rcv+0x6bb/0xf20 [tipc] tipc_rcv+0x6bb/0xf20 [tipc] ? ip_route_input_slow+0x9cf/0xb10 tipc_udp_recv+0x195/0x1e0 [tipc] ? tipc_udp_is_known_peer+0x80/0x80 [tipc] udp_queue_rcv_skb+0x180/0x460 udp_unicast_rcv_skb.isra.56+0x75/0x90 __udp4_lib_rcv+0x4ce/0xb90 ip_local_deliver_finish+0x11c/0x210 ip_local_deliver+0x6b/0xe0 ? ip_rcv_finish+0xa9/0x410 ip_rcv+0x273/0x362 Fixes: 97ede29e80ee ("tipc: convert name table read-write lock to RCU") Reported-by: Li Shuang <[email protected]> Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | vsock/virtio: a better comment on credit update | Michael S. Tsirkin | 1 | -2/+7
The comment we have is just repeating what the code does. Include the *reason* for the condition instead. Cc: Stefano Garzarella <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]> Reviewed-by: Stefano Garzarella <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net/tls: dedup the record cleanup | Jakub Kicinski | 1 | -5/+1
If the retransmit record hint falls into the cleanup window we will free it by just walking the list. No need to duplicate the code. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: John Hurley <[email protected]> Reviewed-by: Dirk van der Merwe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net/tls: clean up the number of #ifdefs for CONFIG_TLS_DEVICE | Jakub Kicinski | 2 | -22/+3
TLS code has a number of #ifdefs which make the code a little harder to follow. Recent fixes removed the ifdef around the TLS_HW define, so we can switch to the often used pattern of defining tls_device functions as empty static inlines in the header when CONFIG_TLS_DEVICE=n. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: John Hurley <[email protected]> Reviewed-by: Dirk van der Merwe <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net/tls: narrow down the critical area of device_offload_lock | Jakub Kicinski | 1 | -24/+22
On setsockopt path we need to hold device_offload_lock from the moment we check netdev is up until the context is fully ready to be added to the tls_device_list. No need to hold it around the get_netdev_for_sock(). Change the code and remove the confusing comment. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: John Hurley <[email protected]> Reviewed-by: Dirk van der Merwe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net/tls: don't jump to return | Jakub Kicinski | 1 | -13/+13
Reusing parts of error path for normal exit will make next commit harder to read, untangle the two. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: John Hurley <[email protected]> Reviewed-by: Dirk van der Merwe <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | net/tls: use the full sk_proto pointer | Jakub Kicinski | 1 | -17/+10
Since we already have the pointer to the full original sk_proto stored use that instead of storing all individual callback pointers as well. Signed-off-by: Jakub Kicinski <[email protected]> Reviewed-by: John Hurley <[email protected]> Reviewed-by: Dirk van der Merwe <[email protected]> Acked-by: John Fastabend <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | Convert usage of IN_MULTICAST to ipv4_is_multicast | Dave Taht | 3 | -6/+6
IN_MULTICAST's primary intent is as a uapi macro. Elsewhere in the kernel we use ipv4_is_multicast consistently. This patch unifies linux's multicast checks to use that function rather than this macro. Signed-off-by: Dave Taht <[email protected]> Reviewed-by: Toke Høiland-Jørgensen <[email protected]> Signed-off-by: David S. Miller <[email protected]>
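Both checks test whether an address lies in 224.0.0.0/4; a practical difference is which byte order the argument is expected in. The userspace sketch below tests an address held in network byte order, in the style of a helper like ipv4_is_multicast. `is_ipv4_multicast()` is a name made up for this example and is not the kernel function itself.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* addr is in network byte order. Multicast addresses are 224.0.0.0/4,
 * i.e. the top four bits of the first octet are 1110. */
static bool is_ipv4_multicast(uint32_t addr)
{
    return (addr & htonl(0xf0000000u)) == htonl(0xe0000000u);
}
```

A caller holding a host-order value would first convert it with htonl(), whereas the uapi macro IN_MULTICAST() is applied to a host-order value directly.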
2019-09-05 | net/sched: cbs: remove redundant assignment to variable port_rate | Colin Ian King | 1 | -1/+1
Variable port_rate is being initialized with a value that is never read and is being re-assigned a little later on. The assignment is redundant and hence can be removed. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth | David S. Miller | 2 | -13/+1
Johan Hedberg says:

====================
pull request: bluetooth 2019-09-05

Here are a few more Bluetooth fixes for 5.3. I hope they can still make it. There's one USB ID addition for btusb, two reverts due to discovered regressions, and two other important fixes.
====================

Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | Revert "Bluetooth: validate BLE connection interval updates" | Marcel Holtmann | 2 | -13/+1
This reverts commit c49a8682fc5d298d44e8d911f4fa14690ea9485e. There are devices which require low connection intervals for usable operation including keyboards and mice. Forcing a static connection interval for these types of devices has an impact in latency and causes a regression. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2019-09-05 | net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others) | Maciej Żenczykowski | 1 | -2/+6
There is a subtle change in behaviour introduced by: commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43 'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create' Before that patch /proc/net/ipv6_route includes: 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo Afterwards /proc/net/ipv6_route includes: 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo ie. the above commit causes the ::1/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000). AFAICT, this is incorrect since these routes are *not* coming from RA's. As such, this patch restores the old behaviour. Fixes: c7a1ce397ada ("ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create") Cc: David Ahern <[email protected]> Cc: Lorenzo Colitti <[email protected]> Signed-off-by: Maciej Żenczykowski <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
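The difference between the two /proc/net/ipv6_route flag words quoted above can be confirmed with a single XOR. The sketch uses the flag value given in the commit message (0x040000); `RTF_ADDRCONF_VALUE` and `flags_changed()` are hypothetical names chosen for this illustration.

```c
#include <stdint.h>

/* Flag value as quoted in the commit message. */
#define RTF_ADDRCONF_VALUE 0x00040000u

/* Return the set of flag bits that differ between two flag words. */
static uint32_t flags_changed(uint32_t before, uint32_t after)
{
    return before ^ after;
}
```

Here `flags_changed(0x80200001, 0x80240001)` evaluates to 0x00040000, so the flag bit that differs between the two dumps is exactly the RTF_ADDRCONF bit.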
2019-09-05 | sctp: use transport pf_retrans in sctp_do_8_2_transport_strike | Xin Long | 1 | -1/+1
Transport should use its own pf_retrans to do the error_count check, instead of asoc's. Otherwise, it's meaningless to make pf_retrans per transport. Fixes: 5aa93bcf66f4 ("sctp: Implement quick failover draft from tsvwg") Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-05 | rxrpc: Fix misplaced traceline | David Howells | 1 | -1/+1
There's a misplaced traceline in rxrpc_input_packet() which is looking at a packet that just got released rather than the replacement packet. Fix this by moving the traceline after the assignment that moves the new packet pointer to the actual packet pointer. Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()") Reported-by: Hillf Danton <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2019-09-04 | can: add support of SAE J1939 protocol | The j1939 authors | 10 | -0/+4520
SAE J1939 is the vehicle bus recommended practice used for communication and diagnostics among vehicle components. Originating in the car and heavy-duty truck industry in the United States, it is now widely used in other parts of the world. J1939, ISO 11783 and NMEA 2000 all share the same high level protocol. SAE J1939 can be considered the replacement for the older SAE J1708 and SAE J1587 specifications. Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Bastian Stender <[email protected]> Signed-off-by: Elenita Hinds <[email protected]> Signed-off-by: kbuild test robot <[email protected]> Signed-off-by: Kurt Van Dijck <[email protected]> Signed-off-by: Maxime Jayat <[email protected]> Signed-off-by: Robin van der Gracht <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: introduce CAN_REQUIRED_SIZE macro | Kurt Van Dijck | 2 | -4/+4
The size of this structure will be increased with J1939 support. To stay binary compatible, the CAN_REQUIRED_SIZE macro is introduced for existing CAN protocols. Signed-off-by: Kurt Van Dijck <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
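One way a "required size" check can stay binary compatible when a structure grows is to require only the bytes up to the end of the member a given protocol actually uses. The sketch below shows that idea with a hypothetical macro `REQUIRED_SIZE()` and a hypothetical `struct addr_model` loosely shaped like a socket address with a union of per-protocol members; it is an assumption about the approach, not a copy of the kernel macro.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to cover everything up to and including `member`. */
#define REQUIRED_SIZE(type, member) \
    (offsetof(type, member) + sizeof(((type *)0)->member))

/* Hypothetical address structure whose union gained a larger member. */
struct addr_model {
    uint16_t family;
    int32_t  ifindex;
    union {
        struct { uint32_t rx_id, tx_id; } tp;        /* older, smaller user */
        struct {
            uint64_t name;
            uint32_t pgn;
            uint8_t  addr;
        } j1939;                                     /* newer, larger user */
    } can_addr;
};
```

A protocol that only uses `can_addr.tp` can keep accepting addresses of `REQUIRED_SIZE(struct addr_model, can_addr.tp)` bytes from old binaries, even though `sizeof(struct addr_model)` has grown with the new member.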
2019-09-04 | can: af_can: use spin_lock_bh() for &net->can.rcvlists_lock | Oleksij Rempel | 1 | -4/+4
The can_rx_unregister() can be called from NAPI (soft IRQ) context, at least by j1939 stack. This leads to potential deadlock with &net->can.rcvlists_lock called from can_rx_register:

===============================================================================
WARNING: inconsistent lock state
4.19.0-20181029-1-g3e67f95ba0d3 #3 Not tainted
--------------------------------
inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
testj1939/224 [HC0[0]:SC1[1]:HE1:SE0] takes:
1ad0fda3 (&(&net->can.rcvlists_lock)->rlock){+.?.}, at: can_rx_unregister+0x4c/0x1ac
{SOFTIRQ-ON-W} state was registered at:
  lock_acquire+0xd0/0x1f4
  _raw_spin_lock+0x30/0x40
  can_rx_register+0x5c/0x14c
  j1939_netdev_start+0xdc/0x1f8
  j1939_sk_bind+0x18c/0x1c8
  __sys_bind+0x70/0xb0
  sys_bind+0x10/0x14
  ret_fast_syscall+0x0/0x28
  0xbedc9b64
irq event stamp: 2440
hardirqs last enabled at (2440): [<c01302c0>] __local_bh_enable_ip+0xac/0x184
hardirqs last disabled at (2439): [<c0130274>] __local_bh_enable_ip+0x60/0x184
softirqs last enabled at (2412): [<c08b0bf4>] release_sock+0x84/0xa4
softirqs last disabled at (2415): [<c013055c>] irq_exit+0x100/0x1b0

other info that might help us debug this:
Possible unsafe locking scenario:

      CPU0
      ----
 lock(&(&net->can.rcvlists_lock)->rlock);
 <Interrupt>
   lock(&(&net->can.rcvlists_lock)->rlock);

*** DEADLOCK ***

2 locks held by testj1939/224:
 #0: 168eb13b (rcu_read_lock){....}, at: netif_receive_skb_internal+0x3c/0x350
 #1: 168eb13b (rcu_read_lock){....}, at: can_receive+0x88/0x1c0
===============================================================================

To avoid this situation, we should use spin_lock_bh() instead of spin_lock().

Signed-off-by: Oleksij Rempel <[email protected]>
Acked-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: remove NULL-ptr checks from users of can_dev_rcv_lists_find() | Marc Kleine-Budde | 1 | -29/+16
Since using the "struct can_ml_priv" for the per device "struct dev_rcv_lists" the call can_dev_rcv_lists_find() cannot fail anymore. This patch simplifies af_can by removing the NULL pointer checks from the dev_rcv_lists returned by can_dev_rcv_lists_find(). Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: make use of preallocated can_ml_priv for per device struct can_dev_rcv_lists | Marc Kleine-Budde | 1 | -38/+7
This patch removes the old method of allocating the per device protocol specific memory via a netdevice_notifier. This had the drawback that the allocation can fail, leading to a lot of null pointer checks in the code. This also makes the life cycle management of this memory quite complicated. This patch switches from allocating the struct can_dev_rcv_lists in a NETDEV_REGISTER call to using the dev->ml_priv, which is allocated by the driver since the previous patch. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: introduce CAN midlayer private and allocate it automatically | Marc Kleine-Budde | 3 | -15/+2
This patch introduces the CAN midlayer private structure ("struct can_ml_priv") which should be used to hold protocol specific per device data structures. For now its only member is "struct can_dev_rcv_lists".

The CAN midlayer private is allocated via alloc_netdev()'s private and assigned to "struct net_device::ml_priv" during device creation. This is done transparently for CAN drivers using alloc_candev(). The slcan, vcan and vxcan drivers which are not using alloc_candev() have been adapted manually. The memory layout of the netdev_priv allocated via alloc_candev() will look like this:

   +-------------------------+
   | driver's priv           |
   +-------------------------+
   | struct can_ml_priv      |
   +-------------------------+
   | array of struct sk_buff |
   +-------------------------+

Signed-off-by: Oleksij Rempel <[email protected]>
Signed-off-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: can_pernet_exit(): no need to iterate over and cleanup registered CAN devices | Marc Kleine-Budde | 1 | -15/+0
The networking core takes care of unregistering every network device in a namespace before calling the can_pernet_exit() hook. This patch removes the unneeded cleanup. Acked-by: Oliver Hartkopp <[email protected]> Suggested-by: Kirill Tkhai <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: can_rx_register(): use max() instead of open coding it | Marc Kleine-Budde | 1 | -2/+2
This patch replaces an open coded max by the proper kernel define max(). Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Oleksij Rempel <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: give variable holding the CAN receiver and the receiver list a sensible name | Marc Kleine-Budde | 1 | -51/+50
This patch gives the variables holding the CAN receiver and the receiver list a better name by renaming them from "r" to "rcv" and "rl" to "recv_list". Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: rename find_dev_rcv_lists() to can_dev_rcv_lists_find() | Marc Kleine-Budde | 1 | -5/+5
This patch adds the commonly used prefix "can_" to the find_dev_rcv_lists() function and moves the "find" to the end, as the function returns a struct can_dev_rcv_lists. This improves the overall readability of the code. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: rename find_rcv_list() to can_rcv_list_find() | Marc Kleine-Budde | 1 | -5/+5
This patch adds the commonly used prefix "can_" to the find_rcv_list() function and moves the "find" to the end, as the function returns a struct rcv_list. This improves the overall readability of the code. Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: proc: give variable holding the CAN per device receive lists a sensible name | Marc Kleine-Budde | 1 | -18/+20
This patch gives the variables holding the CAN per device receive filter lists a better name by renaming them from "d" to "dev_rcv_lists". Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: give variable holding the CAN per device receive lists a sensible name | Marc Kleine-Budde | 1 | -45/+44
This patch gives the variables holding the CAN receive filter lists a better name by renaming them from "d" to "dev_rcv_lists". Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: netns: remove "can_" prefix from members struct netns_can | Marc Kleine-Budde | 2 | -24/+24
This patch improves the code readability by removing the redundant "can_" prefix from the members of struct netns_can (as the struct netns_can itself is the member "can" of the struct net.)

The conversion is done with:

    sed -i \
        -e "s/struct can_dev_rcv_lists \*can_rx_alldev_list;/struct can_dev_rcv_lists *rx_alldev_list;/" \
        -e "s/spinlock_t can_rcvlists_lock;/spinlock_t rcvlists_lock;/" \
        -e "s/struct timer_list can_stattimer;/struct timer_list stattimer; /" \
        -e "s/can\.can_rx_alldev_list/can.rx_alldev_list/g" \
        -e "s/can\.can_rcvlists_lock/can.rcvlists_lock/g" \
        -e "s/can\.can_stattimer/can.stattimer/g" \
        include/net/netns/can.h \
        net/can/*.[ch]

Signed-off-by: Oleksij Rempel <[email protected]>
Acked-by: Oliver Hartkopp <[email protected]>
Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: proc: give variables holding CAN statistics a sensible name | Marc Kleine-Budde | 1 | -58/+58
This patch renames the variables holding the CAN statistics (can_stats and can_pstats) to pkg_stats and rcv_lists_stats, which better reflect their meaning. The conversion is done with: sed -i \ -e "s/can_stats\([^_]\)/pkg_stats\1/g" \ -e "s/can_pstats/rcv_lists_stats/g" \ net/can/proc.c Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>
2019-09-04 | can: af_can: give variables holding CAN statistics a sensible name | Marc Kleine-Budde | 1 | -15/+15
This patch renames the variables holding the CAN statistics (can_stats and can_pstats) to pkg_stats and rcv_lists_stats, which better reflect their meaning. The conversion is done with: sed -i \ -e "s/can_stats\([^_]\)/pkg_stats\1/g" \ -e "s/can_pstats/rcv_lists_stats/g" \ net/can/af_can.c Signed-off-by: Oleksij Rempel <[email protected]> Acked-by: Oliver Hartkopp <[email protected]> Signed-off-by: Marc Kleine-Budde <[email protected]>