aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2014-03-28ipv6: move DAD and addrconf_verify processing to workqueueHannes Frederic Sowa1-51/+142
addrconf_join_solict and addrconf_join_anycast may cause actions which need rtnl locked, especially on first address creation. A new DAD state is introduced which defers processing of the initial DAD processing into a workqueue. To get rtnl lock we need to push the code paths which depend on those calls up to workqueues, specifically addrconf_verify and the DAD processing. (v2) addrconf_dad_failure needs to be queued up to the workqueue, too. This patch introduces a new DAD state and stop the DAD processing in the workqueue (this is because of the possible ipv6_del_addr processing which removes the solicited multicast address from the device). addrconf_verify_lock is removed, too. After the transition it is not needed any more. As we are not processing in bottom half anymore we need to be a bit more careful about disabling bottom half out when we lock spin_locks which are also used in bh. Relevant backtrace: [ 541.030090] RTNL: assertion failed at net/core/dev.c (4496) [ 541.031143] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 3.10.33-1-amd64-vyatta #1 [ 541.031145] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 [ 541.031146] ffffffff8148a9f0 000000000000002f ffffffff813c98c1 ffff88007c4451f8 [ 541.031148] 0000000000000000 0000000000000000 ffffffff813d3540 ffff88007fc03d18 [ 541.031150] 0000880000000006 ffff88007c445000 ffffffffa0194160 0000000000000000 [ 541.031152] Call Trace: [ 541.031153] <IRQ> [<ffffffff8148a9f0>] ? dump_stack+0xd/0x17 [ 541.031180] [<ffffffff813c98c1>] ? __dev_set_promiscuity+0x101/0x180 [ 541.031183] [<ffffffff813d3540>] ? __hw_addr_create_ex+0x60/0xc0 [ 541.031185] [<ffffffff813cfe1a>] ? __dev_set_rx_mode+0xaa/0xc0 [ 541.031189] [<ffffffff813d3a81>] ? __dev_mc_add+0x61/0x90 [ 541.031198] [<ffffffffa01dcf9c>] ? igmp6_group_added+0xfc/0x1a0 [ipv6] [ 541.031208] [<ffffffff8111237b>] ? kmem_cache_alloc+0xcb/0xd0 [ 541.031212] [<ffffffffa01ddcd7>] ? ipv6_dev_mc_inc+0x267/0x300 [ipv6] [ 541.031216] [<ffffffffa01c2fae>] ? addrconf_join_solict+0x2e/0x40 [ipv6] [ 541.031219] [<ffffffffa01ba2e9>] ? ipv6_dev_ac_inc+0x159/0x1f0 [ipv6] [ 541.031223] [<ffffffffa01c0772>] ? addrconf_join_anycast+0x92/0xa0 [ipv6] [ 541.031226] [<ffffffffa01c311e>] ? __ipv6_ifa_notify+0x11e/0x1e0 [ipv6] [ 541.031229] [<ffffffffa01c3213>] ? ipv6_ifa_notify+0x33/0x50 [ipv6] [ 541.031233] [<ffffffffa01c36c8>] ? addrconf_dad_completed+0x28/0x100 [ipv6] [ 541.031241] [<ffffffff81075c1d>] ? task_cputime+0x2d/0x50 [ 541.031244] [<ffffffffa01c38d6>] ? addrconf_dad_timer+0x136/0x150 [ipv6] [ 541.031247] [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6] [ 541.031255] [<ffffffff8105313a>] ? call_timer_fn.isra.22+0x2a/0x90 [ 541.031258] [<ffffffffa01c37a0>] ? addrconf_dad_completed+0x100/0x100 [ipv6] Hunks and backtrace stolen from a patch by Stephen Hemminger. Reported-by: Stephen Hemminger <[email protected]> Signed-off-by: Stephen Hemminger <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28net: net: add a core netdev->tx_dropped counterEric Dumazet1-0/+2
Dropping packets in __dev_queue_xmit() when transmit queue is stopped (NIC TX ring buffer full or BQL limit reached) currently outputs a syslog message. It would be better to get a precise count of such events available in netdevice stats so that monitoring tools can have a clue. This extends the work done in caf586e5f23ce ("net: add a core netdev->rx_dropped counter") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28packet: respect devices with LLTX flag in direct xmitDaniel Borkmann1-20/+20
Quite often it can be useful to test with dummy or similar devices as a blackhole sink for skbs. Such devices are only equipped with a single txq, but marked as NETIF_F_LLTX as they do not require locking their internal queues on xmit (or implement locking themselves). Therefore, rather use HARD_TX_{UN,}LOCK API, so that NETIF_F_LLTX will be respected. trafgen mmap/TX_RING example against dummy device with config foo: { fill(0xff, 64) } results in the following performance improvements for such scenarios on an ordinary Core i7/2.80GHz: Before: Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs): 160,975,944,159 instructions:k # 0.55 insns per cycle ( +- 0.09% ) 293,319,390,278 cycles:k # 0.000 GHz ( +- 0.35% ) 192,501,104 branch-misses:k ( +- 1.63% ) 831 context-switches:k ( +- 9.18% ) 7 cpu-migrations:k ( +- 7.40% ) 69,382 cache-misses:k # 0.010 % of all cache refs ( +- 2.18% ) 671,552,021 cache-references:k ( +- 1.29% ) 22.856401569 seconds time elapsed ( +- 0.33% ) After: Performance counter stats for 'trafgen -i foo -o du0 -n100000000' (10 runs): 133,788,739,692 instructions:k # 0.92 insns per cycle ( +- 0.06% ) 145,853,213,256 cycles:k # 0.000 GHz ( +- 0.17% ) 59,867,100 branch-misses:k ( +- 4.72% ) 384 context-switches:k ( +- 3.76% ) 6 cpu-migrations:k ( +- 6.28% ) 70,304 cache-misses:k # 0.077 % of all cache refs ( +- 1.73% ) 90,879,408 cache-references:k ( +- 1.35% ) 11.719372413 seconds time elapsed ( +- 0.24% ) Signed-off-by: Daniel Borkmann <[email protected]> Cc: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28tcp: fix get_timewait4_sock() delay computation on 64bitEric Dumazet1-1/+1
It seems I missed one change in get_timewait4_sock() to compute the remaining time before deletion of IPV4 timewait socket. This could result in wrong output in /proc/net/tcp for tm->when field. Fixes: 96f817fedec4 ("tcp: shrink tcp6_timewait_sock by one cache line") Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28openvswitch: fix a possible deadlock and lockdep warningFlavio Leitner1-20/+6
There are two problematic situations. A deadlock can happen when is_percpu is false because it can get interrupted while holding the spinlock. Then it executes ovs_flow_stats_update() in softirq context which tries to get the same lock. The second sitation is that when is_percpu is true, the code correctly disables BH but only for the local CPU, so the following can happen when locking the remote CPU without disabling BH: CPU#0 CPU#1 ovs_flow_stats_get() stats_read() +->spin_lock remote CPU#1 ovs_flow_stats_get() | <interrupted> stats_read() | ... +--> spin_lock remote CPU#0 | | <interrupted> | ovs_flow_stats_update() | ... | spin_lock local CPU#0 <--+ ovs_flow_stats_update() +---------------------------------- spin_lock local CPU#1 This patch disables BH for both cases fixing the deadlocks. Acked-by: Jesse Gross <[email protected]> ================================= [ INFO: inconsistent lock state ] 3.14.0-rc8-00007-g632b06a #1 Tainted: G I --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. swapper/0/0 [HC0[0]:SC1[5]:HE1:SE0] takes: (&(&cpu_stats->lock)->rlock){+.?...}, at: [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch] {SOFTIRQ-ON-W} state was registered at: [<ffffffff810f973f>] __lock_acquire+0x68f/0x1c40 [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0 [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80 [<ffffffffa05dd9e4>] ovs_flow_stats_get+0xc4/0x1e0 [openvswitch] [<ffffffffa05da855>] ovs_flow_cmd_fill_info+0x185/0x360 [openvswitch] [<ffffffffa05daf05>] ovs_flow_cmd_build_info.constprop.27+0x55/0x90 [openvswitch] [<ffffffffa05db41d>] ovs_flow_cmd_new_or_set+0x4dd/0x570 [openvswitch] [<ffffffff816c245d>] genl_family_rcv_msg+0x1cd/0x3f0 [<ffffffff816c270e>] genl_rcv_msg+0x8e/0xd0 [<ffffffff816c0239>] netlink_rcv_skb+0xa9/0xc0 [<ffffffff816c0798>] genl_rcv+0x28/0x40 [<ffffffff816bf830>] netlink_unicast+0x100/0x1e0 [<ffffffff816bfc57>] netlink_sendmsg+0x347/0x770 [<ffffffff81668e9c>] sock_sendmsg+0x9c/0xe0 [<ffffffff816692d9>] ___sys_sendmsg+0x3a9/0x3c0 [<ffffffff8166a911>] __sys_sendmsg+0x51/0x90 [<ffffffff8166a962>] SyS_sendmsg+0x12/0x20 [<ffffffff817e3ce9>] system_call_fastpath+0x16/0x1b irq event stamp: 1740726 hardirqs last enabled at (1740726): [<ffffffff8175d5e0>] ip6_finish_output2+0x4f0/0x840 hardirqs last disabled at (1740725): [<ffffffff8175d59b>] ip6_finish_output2+0x4ab/0x840 softirqs last enabled at (1740674): [<ffffffff8109be12>] _local_bh_enable+0x22/0x50 softirqs last disabled at (1740675): [<ffffffff8109db05>] irq_exit+0xc5/0xd0 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&cpu_stats->lock)->rlock); <Interrupt> lock(&(&cpu_stats->lock)->rlock); *** DEADLOCK *** 5 locks held by swapper/0/0: #0: (((&ifa->dad_timer))){+.-...}, at: [<ffffffff810a7155>] call_timer_fn+0x5/0x320 #1: (rcu_read_lock){.+.+..}, at: [<ffffffff81788a55>] mld_sendpack+0x5/0x4a0 #2: (rcu_read_lock_bh){.+....}, at: [<ffffffff8175d149>] ip6_finish_output2+0x59/0x840 #3: (rcu_read_lock_bh){.+....}, at: [<ffffffff8168ba75>] __dev_queue_xmit+0x5/0x9b0 #4: (rcu_read_lock){.+.+..}, at: [<ffffffffa05e41b5>] internal_dev_xmit+0x5/0x110 [openvswitch] stack backtrace: CPU: 0 PID: 0 Comm: swapper/0 Tainted: G I 3.14.0-rc8-00007-g632b06a #1 Hardware name: /DX58SO, BIOS SOX5810J.86A.5599.2012.0529.2218 05/29/2012 0000000000000000 0fcf20709903df0c ffff88042d603808 ffffffff817cfe3c ffffffff81c134c0 ffff88042d603858 ffffffff817cb6da 0000000000000005 ffffffff00000001 ffff880400000000 0000000000000006 ffffffff81c134c0 Call Trace: <IRQ> [<ffffffff817cfe3c>] dump_stack+0x4d/0x66 [<ffffffff817cb6da>] print_usage_bug+0x1f4/0x205 [<ffffffff810f7f10>] ? check_usage_backwards+0x180/0x180 [<ffffffff810f8963>] mark_lock+0x223/0x2b0 [<ffffffff810f96d3>] __lock_acquire+0x623/0x1c40 [<ffffffff810f5707>] ? __lock_is_held+0x57/0x80 [<ffffffffa05e26c6>] ? masked_flow_lookup+0x236/0x250 [openvswitch] [<ffffffff810fb4e2>] lock_acquire+0xa2/0x1d0 [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch] [<ffffffff817d8d9e>] _raw_spin_lock+0x3e/0x80 [<ffffffffa05dd8a1>] ? ovs_flow_stats_update+0x51/0xd0 [openvswitch] [<ffffffffa05dd8a1>] ovs_flow_stats_update+0x51/0xd0 [openvswitch] [<ffffffffa05dcc64>] ovs_dp_process_received_packet+0x84/0x120 [openvswitch] [<ffffffff810f93f7>] ? __lock_acquire+0x347/0x1c40 [<ffffffffa05e3bea>] ovs_vport_receive+0x2a/0x30 [openvswitch] [<ffffffffa05e4218>] internal_dev_xmit+0x68/0x110 [openvswitch] [<ffffffffa05e41b5>] ? internal_dev_xmit+0x5/0x110 [openvswitch] [<ffffffff8168b4a6>] dev_hard_start_xmit+0x2e6/0x8b0 [<ffffffff8168be87>] __dev_queue_xmit+0x417/0x9b0 [<ffffffff8168ba75>] ? __dev_queue_xmit+0x5/0x9b0 [<ffffffff8175d5e0>] ? ip6_finish_output2+0x4f0/0x840 [<ffffffff8168c430>] dev_queue_xmit+0x10/0x20 [<ffffffff8175d641>] ip6_finish_output2+0x551/0x840 [<ffffffff8176128a>] ? ip6_finish_output+0x9a/0x220 [<ffffffff8176128a>] ip6_finish_output+0x9a/0x220 [<ffffffff8176145f>] ip6_output+0x4f/0x1f0 [<ffffffff81788c29>] mld_sendpack+0x1d9/0x4a0 [<ffffffff817895b8>] mld_send_initial_cr.part.32+0x88/0xa0 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220 [<ffffffff8178e301>] ipv6_mc_dad_complete+0x31/0x50 [<ffffffff817690d7>] addrconf_dad_completed+0x147/0x220 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220 [<ffffffff8176934f>] addrconf_dad_timer+0x19f/0x1c0 [<ffffffff810a71e9>] call_timer_fn+0x99/0x320 [<ffffffff810a7155>] ? call_timer_fn+0x5/0x320 [<ffffffff817691b0>] ? addrconf_dad_completed+0x220/0x220 [<ffffffff810a76c4>] run_timer_softirq+0x254/0x3b0 [<ffffffff8109d47d>] __do_softirq+0x12d/0x480 Signed-off-by: Flavio Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28bridge: Fix handling stacked vlan tagsToshiaki Makita1-17/+1
If a bridge with vlan_filtering enabled receives frames with stacked vlan tags, i.e., they have two vlan tags, br_vlan_untag() strips not only the outer tag but also the inner tag. br_vlan_untag() is called only from br_handle_vlan(), and in this case, it is enough to set skb->vlan_tci to 0 here, because vlan_tci has already been set before calling br_handle_vlan(). Signed-off-by: Toshiaki Makita <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28bridge: Fix inabillity to retrieve vlan tags when tx offload is disabledToshiaki Makita2-3/+15
Bridge vlan code (br_vlan_get_tag()) assumes that all frames have vlan_tci if they are tagged, but if vlan tx offload is manually disabled on bridge device and frames are sent from vlan device on the bridge device, the tags are embedded in skb->data and they break this assumption. Extract embedded vlan tags and move them to vlan_tci at ingress. Signed-off-by: Toshiaki Makita <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28tipc: make discovery domain a bearer attributeErik Hugne3-16/+10
The node discovery domain is assigned when a bearer is enabled. In the previous commit we reflect this attribute directly in the bearer structure since it's needed to reinitialize the node discovery mechanism after a hardware address change. There's no need to replicate this attribute anywhere else, so we remove it from the tipc_link_req structure. Signed-off-by: Erik Hugne <[email protected]> Reviewed-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-28tipc: fix neighbor detection problem after hw address changeErik Hugne2-0/+9
If the hardware address of a underlying netdevice is changed, it is not enough to simply reset the bearer/links over this device. We also need to reflect this change in the TIPC bearer and node discovery structures aswell. This patch adds the necessary reinitialization of the node disovery mechanism following a hardware address change so that the correct originating media address is advertised in the discovery messages. Signed-off-by: Erik Hugne <[email protected]> Reported-by: Dong Liu <[email protected]> Reviewed-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27core, nfqueue, openvswitch: Orphan frags in skb_zerocopy and handle errorsZoltan Kiss3-10/+32
skb_zerocopy can copy elements of the frags array between skbs, but it doesn't orphan them. Also, it doesn't handle errors, so this patch takes care of that as well, and modify the callers accordingly. skb_tx_error() is also added to the callers so they will signal the failed delivery towards the creator of the skb. Signed-off-by: Zoltan Kiss <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27hsr: replace del_timer by del_timer_syncJulia Lawall1-1/+1
Use del_timer_sync to ensure that the timer is stopped on all CPUs before the driver exists. This change was suggested by Thomas Gleixner. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ declarer name module_exit; identifier ex; @@ module_exit(ex); @@ identifier r.ex; @@ ex(...) { <... - del_timer + del_timer_sync (...) ...> } // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27atm: replace del_timer by del_timer_syncJulia Lawall1-1/+1
Use del_timer_sync to ensure that the timer is stopped on all CPUs before the driver exists. This change was suggested by Thomas Gleixner. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @r@ declarer name module_exit; identifier ex; @@ module_exit(ex); @@ identifier r.ex; @@ ex(...) { <... - del_timer + del_timer_sync (...) ...> } // </smpl> Signed-off-by: Julia Lawall <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tcp: tcp_make_synack() minor changesEric Dumazet1-2/+2
There is no need to allocate 15 bytes in excess for a SYNACK packet, as it contains no data, only headers. SYNACK are always generated in softirq context, and contain a single segment, we can use TCP_INC_STATS_BH() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27ipv6: do not overwrite inetpeer metrics prematurelyMichal Kubeček2-36/+55
If an IPv6 host route with metrics exists, an attempt to add a new route for the same target with different metrics fails but rewrites the metrics anyway: 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 rto_min lock 1s 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1500 RTNETLINK answers: File exists 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 rto_min lock 1.5s This is caused by all IPv6 host routes using the metrics in their inetpeer (or the shared default). This also holds for the new route created in ip6_route_add() which shares the metrics with the already existing route and thus ip6_route_add() rewrites the metrics even if the new route ends up not being used at all. Another problem is that old metrics in inetpeer can reappear unexpectedly for a new route, e.g. 12sp0:~ # ip route add fec0::1 dev eth0 rto_min 1000 12sp0:~ # ip route del fec0::1 12sp0:~ # ip route add fec0::1 dev eth0 12sp0:~ # ip route change fec0::1 dev eth0 hoplimit 10 12sp0:~ # ip -6 route show fe80::/64 dev eth0 proto kernel metric 256 fec0::1 dev eth0 metric 1024 hoplimit 10 rto_min lock 1s Resolve the first problem by moving the setting of metrics down into fib6_add_rt2node() to the point we are sure we are inserting the new route into the tree. Second problem is addressed by introducing new flag DST_METRICS_FORCE_OVERWRITE which is set for a new host route in ip6_route_add() and makes ipv6_cow_metrics() always overwrite the metrics in inetpeer (even if they are not "new"); it is reset after that. v5: use a flag in _metrics member rather than one in flags v4: fix a typo making a condition always true (thanks to Hannes Frederic Sowa) v3: rewritten based on David Miller's idea to move setting the metrics (and allocation in non-host case) down to the point we already know the route is to be inserted. Also rebased to net-next as it is quite late in the cycle. Signed-off-by: Michal Kubecek <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27vlan: Set hard_header_len according to available accelerationVlad Yasevich2-2/+5
Currently, if the card supports CTAG acceleration we do not account for the vlan header even if we are configuring an 8021AD vlan. This may not be best since we'll do software tagging for 8021AD which will cause data copy on skb head expansion Configure the length based on available hw offload capabilities and vlan protocol. CC: Patrick McHardy <[email protected]> Signed-off-by: Vlad Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27Merge branch 'for-upstream' of ↵John W. Linville6-77/+106
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
2014-03-27tipc: use node list lock to protect tipc_num_links variableYing Xue1-10/+11
Without properly implicit or explicit read memory barrier, it's unsafe to read an atomic variable with atomic_read() from another thread which is different with the thread of changing the atomic variable with atomic_inc() or atomic_dec(). So a stale tipc_num_links may be got with atomic_read() in tipc_node_get_links(). If the tipc_num_links variable type is converted from atomic to unsigned integer and node list lock is used to protect it, the issue would be avoided. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: use node_list_lock to protect tipc_num_nodes variableYing Xue1-4/+3
As tipc_node_list is protected by rcu read lock on read side, it's unnecessary to hold node_list_lock to protect tipc_node_list in tipc_node_get_links(). Instead, node_list_lock should just protects tipc_num_nodes in the function. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: tipc: convert node list and node hlist to RCU listsYing Xue4-21/+31
Convert tipc_node_list list and node_htable hash list to RCU lists. On read side, the two lists are protected with RCU read lock, and on update side, node_list_lock is applied to them. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: rename node create lock to protect node list and hlistYing Xue3-35/+35
When a node is created, tipc_net_lock read lock is first held and then node_create_lock is grabbed in order to prevent the same node from being created and inserted into both node list and hlist twice. But when we query node from the two node lists, we only hold tipc_net_lock read lock without grabbing node_create_lock. Obviously this locking policy is unable to guarantee that the two node lists are always synchronized especially when the operation of changing and accessing them occurs in different contexts like currently doing. Therefore, rename node_create_lock to node_list_lock to protect the two node lists, that is, whenever node is inserted into them or node is queried from them, the node_list_lock should be always held. As a result, tipc_net_lock read lock becomes redundant and then can be removed from the node query functions. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: make broadcast bearer store in bearer_list arrayYing Xue2-4/+6
Now unicast bearer is dynamically allocated and placed into its identity specified slot of bearer_list array. When we search bearer_list array with a bearer identity, the corresponding bearer instance can be found. But broadcast bearer is statically allocated and it is not located in the bearer_list array yet. So we decide to enlarge bearer_list array into MAX_BEARERS + 1 slots, and its last slot stores the broadcast bearer so that the broadcast bearer can be found from bearer_list array with MAX_BEARERS as index. The change will help us reduce the complex relationship between bearer and link in the future. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: remove active flag from tipc_bearer structureYing Xue4-12/+5
After the allocation of tipc_bearer structure instance is converted from statical way to dynamical way, we identify whether a certain tipc_bearer structure pointer is valid by checking whether the pointer is NULL or not. So the active flag in tipc_bearer structure becomes redundant. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: convert tipc_bearers array to pointer listYing Xue3-15/+38
As part of the effort to introduce RCU protection for the bearer list, we first need to change it to a list of pointers. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: acquire necessary locks in named_cluster_distribute routineYing Xue1-3/+11
The 'tipc_node_list' is guarded by tipc_net_lock and 'links' array defined in 'tipc_node' structure is protected by node lock as well. Without acquiring the two locks in named_cluster_distribute() a fatal oops may happen in case that a destroyed link might be got and then accessed. Therefore, above mentioned two locks must be held in named_cluster_distribute() to prevent the issue from happening accidentally. As 'links' array in node struct must be protected by node lock, we have to move the code of selecting an active link from tipc_link_xmit() to named_cluster_distribute() and then call __tipc_link_xmit() with the selected link to deliver name messages. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: obsolete the remote management featureYing Xue5-121/+3
Due to the lacking of any credential, it's allowed to accept commands requested from remote nodes to query the local node status, which is prone to involve potential security risks. Instead, if we login to a remote node with ssh command, this approach is not only more safe than the remote management feature, but also it can give us more permissions like changing the remote node configuration. So it's reasonable for us to obsolete the remote management feature now. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27tipc: remove unnecessary checking for node objectYing Xue1-6/+0
tipc_node_create routine doesn't need to check whether a node object specified with a node address exists or not because its caller(ie, tipc_disc_recv_msg routine) has checked this before calling it. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27net/core: Use RCU_INIT_POINTER(x, NULL) in netpoll.cMonam Agarwal1-1/+1
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-27net/bridge: Use RCU_INIT_POINTER(x, NULL) in br_vlan.cMonam Agarwal1-4/+4
This patch replaces rcu_assign_pointer(x, NULL) with RCU_INIT_POINTER(x, NULL) The rcu_assign_pointer() ensures that the initialization of a structure is carried out before storing a pointer to that structure. And in the case of the NULL pointer, there is no structure to initialize. So, rcu_assign_pointer(p, NULL) can be safely converted to RCU_INIT_POINTER(p, NULL) Signed-off-by: Monam Agarwal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26net: unix: non blocking recvmsg() should not return -EINTREric Dumazet1-5/+12
Some applications didn't expect recvmsg() on a non blocking socket could return -EINTR. This possibility was added as a side effect of commit b3ca9b02b00704 ("net: fix multithreaded signal handling in unix recv routines"). To hit this bug, you need to be a bit unlucky, as the u->readlock mutex is usually held for very small periods. Fixes: b3ca9b02b00704 ("net: fix multithreaded signal handling in unix recv routines") Signed-off-by: Eric Dumazet <[email protected]> Cc: Rainer Weikusat <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26vlan: make a new function vlan_dev_vlan_proto() and exportdingtianhong1-0/+6
The vlan support 2 proto: 802.1q and 802.1ad, so make a new function called vlan_dev_vlan_proto() which could return the vlan proto for input dev. Signed-off-by: Ding Tianhong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26net: Rename skb->rxhash to skb->hashTom Herbert4-14/+14
The packet hash can be considered a property of the packet, not just on RX path. This patch changes name of rxhash and l4_rxhash skbuff fields to be hash and l4_hash respectively. This includes changing uses of the field in the code which don't call the access functions. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Mahesh Bandewar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26tcp: delete unused parameter in tcp_nagle_check()Peter Pan(潘卫平)1-3/+3
After commit d4589926d7a9 (tcp: refine TSO splits), tcp_nagle_check() does not use parameter mss_now anymore. Signed-off-by: Weiping Pan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26ip_tunnel: Fix dst ref-count.Pravin B Shelar3-3/+9
Commit 10ddceb22ba (ip_tunnel:multicast process cause panic due to skb->_skb_refdst NULL pointer) removed dst-drop call from ip-tunnel-recv. Following commit reintroduce dst-drop and fix the original bug by checking loopback packet before releasing dst. Original bug: https://bugzilla.kernel.org/show_bug.cgi?id=70681 CC: Xin Long <[email protected]> Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-26Bluetooth: Fix returning peer address in pending connect stateJohan Hedberg2-2/+4
We should let user space request the peer address also in the pending connect states, i.e. BT_CONNECT and BT_CONNECT2. There is existing user space code that tries to do this and will fail without extending the set of allowed states for the peer address information. This patch adds the two states to the allowed ones in the L2CAP and RFCOMM sock_getname functions, thereby preventing ENOTCONN from being returned. Reported-by: Andrzej Kaczmarek <[email protected]> Signed-off-by: Johan Hedberg <[email protected]> Tested-by: Andrzej Kaczmarek <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller9-53/+63
Conflicts: Documentation/devicetree/bindings/net/micrel-ks8851.txt net/core/netpoll.c The net/core/netpoll.c conflict is a bug fix in 'net' happening to code which is completely removed in 'net-next'. In micrel-ks8851.txt we simply have overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2014-03-25Merge branch 'for-davem' of ↵David S. Miller46-392/+989
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next John W. Linville says: ==================== Please pull this batch of wireless updates intended for 3.15! For the mac80211 bits, Johannes says: "This has a whole bunch of bugfixes for things that went into -next previously as well as some other bugfixes I didn't want to rush into 3.14 at this point. The rest of it is some cleanups and a few small features, the biggest of which is probably Janusz's regulatory DFS CAC time code." For the Bluetooth bits, Gustavo says: "One more pull request to 3.15. This is mostly and bug fix pull request, it contains several fixes and clean up all over the tree, plus some small new features." For the NFC bits, Samuel says: "This is the NFC pull request for 3.15. With this one we have: - Support for ISO 15693 a.k.a. NFC vicinity a.k.a. Type 5 tags. ISO 15693 are long range (1 - 2 meters) vicinity tags/cards. The kernel now supports those through the NFC netlink and digital APIs. - Support for TI's trf7970a chipset. This chipset relies on the NFC digital layer and the driver currently supports type 2, 4A and 5 tags. - Support for NXP's pn544 secure firmare download. The pn544 C3 chipsets relies on a different firmware download protocal than the C2 one. We now support both and use the right one depending on the version we detect at runtime. - Support for 4A tags from the NFC digital layer. - A bunch of cleanups and minor fixes from Axel Lin and Thierry Escande." For the iwlwifi bits, Emmanuel says: "We were sending a host command while the mutex wasn't held. This led to hard-to-catch races." And... "I have a fix for a "merge damage" which is not really a merge damage: it enables scheduled scan which has been disabled in wireless.git. Since you merged wireless.git into wireless-next.git, this can now be fixed in wireless-next.git. Besides this, Alex made a workaround for a hardware bug. This fix allows us to consume less power in S3. Arik and Eliad continue to work on D0i3 which is a run-time power saving feature. Eliad also contributes a few bits to the rate scaling logic to which Eyal adds his own contribution. Avri dives deep in the power code - newer firmware will allow to enable power save in newer scenarios. Johannes made a few clean-ups. I have the regular amount of BT Coex boring stuff. I disable uAPSD since we identified firmware bugs that cause packet loss. One thing that do stand out is the udev event that we now send when the FW asserts. I hope it will allow us to debug the FW more easily." Also included is one last iwlwifi pull for a build breakage fix... For the Atheros bits, Kalle says: "Michal now did some optimisations and was able to improve throughput by 100 Mbps on our MIPS based AP135 platform. Chun-Yeow added some workarounds to be able to better use ad-hoc mode. Ben improved log messages and added support for MSDU chaining. And, as usual, also some smaller fixes." Beyond that... Andrea Merello continues his rtl8180 refactoring, in preparation for a long-awaited rtl8187 driver. We get a new driver (rsi) for the RS9113 chip, from Fariya Fatima. And, of course, we get the usual round of updates for ath9k, brcmfmac, mwifiex, wil6210, etc. as well. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-03-24tipc: fix spinlock recursion bug for failed subscriptionsErik Hugne1-14/+15
If a topology event subscription fails for any reason, such as out of memory, max number reached or because we received an invalid request the correct behavior is to terminate the subscribers connection to the topology server. This is currently broken and produces the following oops: [27.953662] tipc: Subscription rejected, illegal request [27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6 [27.957066] lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1 [27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ #5 [27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc] [27.961430] ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050 [27.962292] ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a [27.963152] ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520 [27.964023] Call Trace: [27.964292] [<ffffffff815c0207>] dump_stack+0x45/0x56 [27.964874] [<ffffffff815beec5>] spin_dump+0x8c/0x91 [27.965420] [<ffffffff815beeeb>] spin_bug+0x21/0x26 [27.965995] [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140 [27.966631] [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20 [27.967256] [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc] [27.968051] [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc] [27.968722] [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc] [27.969436] [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc] [27.970209] [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc] [27.970972] [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc] [27.971633] [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0 [27.972267] [<ffffffff8105e869>] worker_thread+0x119/0x3a0 [27.972896] [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0 [27.973622] [<ffffffff810648af>] kthread+0xdf/0x100 [27.974168] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0 [27.974893] [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0 [27.975466] [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0 The recursion occurs when subscr_terminate tries to grab the subscriber lock, which is already taken by subscr_conn_msg_event. We fix this by checking if the request to establish a new subscription was successful, and if not we initiate termination of the subscriber after we have released the subscriber lock. Signed-off-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-24netpoll: fix the skb check in pkt_is_nsLi RongQing1-1/+1
Neighbor Solicitation is ipv6 protocol, so we should check skb->protocol with ETH_P_IPV6 Signed-off-by: Li RongQing <[email protected]> Cc: WANG Cong <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-24ipv4: remove ip_rt_dump from route.cLi RongQing2-6/+1
ip_rt_dump do nothing after IPv4 route caches removal, so we can remove it. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-24Bluetooth: Remove unnecessary assignment in SMPJohan Hedberg1-2/+0
The smp variable in smp_conn_security is not used anywhere before the smp = smp_chan_create() call in the smp_conn_security function so it makes no sense to assign any other value to it before that. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Fix potential NULL pointer dereference in smp_conn_securityJohan Hedberg1-1/+1
The smp pointer might not be initialized for jumps to the "done" label in the smp_conn_security function. Furthermore doing the set_bit after done might "overwrite" a previous value of the flag in case pairing was already in progress. This patch moves the call to set_bit before the label so that it is only done for a newly created smp context (as returned by smp_chan_create). Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Remove LTK re-encryption procedureJohan Hedberg2-49/+7
Due to several devices being unable to handle this procedure reliably (resulting in forced disconnections before pairing completes) it's better to remove it altogether. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Don't try to confirm locally initiated SMP pairingJohan Hedberg1-0/+5
In the case that the just-works model would be triggered we only want to confirm remotely initiated pairings (i.e. those triggered by a Security Request or Pairing Request). This patch adds the necessary check to the tk_request function to fall back to the JUST_WORKS method in the case of a locally initiated pairing. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Add SMP flag to track which side is the initiatorJohan Hedberg2-0/+7
For remotely initiated just-works pairings we want to show the user a confirmation dialog for the pairing. However, we can only know which side was the initiator by tracking which side sends the first Security Request or Pairing Request PDU. This patch adds a new SMP flag to indicate whether our side was the initiator for the pairing. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Fix SMP confirmation callback handlingJohan Hedberg1-0/+4
In the case that a local pairing confirmation (JUST_CFM) has been selected as the method we need to use the user confirm request mgmt event for it with the confirm_hint set to 1 (to indicate confirmation without any specific passkey value). Without this (if passkey_notify was used) the pairing would never proceed. This patch adds the necessary call to mgmt_user_confirm_request in this scenario. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Add missing cmd_status handler for LE_Start_EncryptionJohan Hedberg1-0/+34
It is possible that the HCI_LE_Start_Encryption command fails in an early stage and triggers a command status event with the failure code. In such a case we need to properly notify the hci_conn object and cleanly bring the connection down. This patch adds the missing command status handler for this HCI command. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24Bluetooth: Fix potential NULL pointer dereference in SMPJohan Hedberg1-1/+7
If a sudden disconnection happens the l2cap_conn pointer may already have been cleaned up by the time hci_conn_security gets called, resulting in the following oops if we don't have a proper NULL check: BUG: unable to handle kernel NULL pointer dereference at 000000c8 IP: [<c132e2ed>] smp_conn_security+0x26/0x151 *pde = 00000000 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC CPU: 1 PID: 673 Comm: memcheck-x86-li Not tainted 3.14.0-rc2+ #437 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: f0ef0520 ti: f0d6a000 task.ti: f0d6a000 EIP: 0060:[<c132e2ed>] EFLAGS: 00010246 CPU: 1 EIP is at smp_conn_security+0x26/0x151 EAX: f0ec1770 EBX: f0ec1770 ECX: 00000002 EDX: 00000002 ESI: 00000002 EDI: 00000000 EBP: f0d6bdc0 ESP: f0d6bda0 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 80050033 CR2: 000000c8 CR3: 30f0f000 CR4: 00000690 Stack: f4f55000 00000002 f0d6bdcc c1097a2b c1319f40 f0ec1770 00000002 f0d6bdd0 f0d6bde8 c1312a82 f0d6bdfc c1312a82 c1319f84 00000008 f4d81c20 f0e5fd86 f0ec1770 f0d6bdfc f0d6be28 c131be3b c131bdc1 f0d25270 c131be3b 00000008 Call Trace: [<c1097a2b>] ? __kmalloc+0x118/0x128 [<c1319f40>] ? mgmt_pending_add+0x49/0x9b [<c1312a82>] hci_conn_security+0x4a/0x1dd [<c1312a82>] ? hci_conn_security+0x4a/0x1dd [<c1319f84>] ? mgmt_pending_add+0x8d/0x9b [<c131be3b>] pair_device+0x1e1/0x206 [<c131bdc1>] ? pair_device+0x167/0x206 [<c131be3b>] ? pair_device+0x1e1/0x206 [<c131ed44>] mgmt_control+0x275/0x2d6 Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2014-03-24ipv4: remove ipv4_ifdown_dst from route.cLi RongQing1-6/+0
ipv4_ifdown_dst does nothing after IPv4 route caches removal, so we can remove it. Signed-off-by: Li RongQing <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-03-22batman-adv: Start new development cycleSimon Wunderlich1-1/+1
Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2014-03-22batman-adv: improve DAT documentationAntonio Quartulli2-2/+6
Add missing documentation for BATADV_DAT_ADDR_MAX and convert an existing documentation to kerneldoc Signed-off-by: Antonio Quartulli <[email protected]> Signed-off-by: Marek Lindner <[email protected]>