aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2014-02-06net: sctp: fix initialization of local source address on accepted ipv6 socketsMatija Glavinic Pecotic1-0/+2
commit efe4208f47f907b86f528788da711e8ab9dea44d: 'ipv6: make lookups simpler and faster' broke initialization of local source address on accepted ipv6 sockets. Before the mentioned commit receive address was copied along with the contents of ipv6_pinfo in sctp_v6_create_accept_sk. Now when it is moved, it has to be copied separately. This also fixes lksctp's ipv6 regression in a sense that test_getname_v6, TC5 - 'getsockname on a connected server socket' now passes. Signed-off-by: Matija Glavinic Pecotic <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06r8152: fix the submission of the interrupt transferhayeswang1-9/+8
The submission of the interrupt transfer should be done after setting the bit of WORK_ENABLE, otherwise the callback function would have the opportunity to be returned directly. Clear the bit of WORK_ENABLE before killing the interrupt transfer. Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06bnx2x: Allow VF rss on higher PFsYuval Mintz1-3/+3
bnx2x driver uses incorrect PF identifier to configure (in HW) the VF interrupt scheme; As a result, in multi-function mode the configuration for PFs with a high index (4+) will overflow and the PF will erroneously configure a single ISR scheme for its VFs. As a result, if such a VF uses multiple queues, interrupt generation will stop after VF receives an Rx packet or sends a Tx packet on a queue other than queue[0]. Signed-off-by: Yuval Mintz <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06tg3: Fix deadlock in tg3_change_mtu()Nithin Sujir1-2/+2
Quoting David Vrabel - "5780 cards cannot have jumbo frames and TSO enabled together. When jumbo frames are enabled by setting the MTU, the TSO feature must be cleared. This is done indirectly by calling netdev_update_features() which will call tg3_fix_features() to actually clear the flags. netdev_update_features() will also trigger a new netlink message for the feature change event which will result in a call to tg3_get_stats64() which deadlocks on the tg3 lock." tg3_set_mtu() does not need to be under the tg3 lock since converting the flags to use set_bit(). Move it out to after tg3_netif_stop(). Reported-by: David Vrabel <[email protected]> Tested-by: David Vrabel <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: Nithin Nayak Sujir <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06tg3: cleanup an error path in tg3_phy_reset_5703_4_5()Dan Carpenter1-6/+7
In the original code, if tg3_readphy() fails then it does an unnecessary check to verify "err" is still zero and then returns -EBUSY. My static checker complains about the unnecessary "if (!err)" check and anyway it is better to propagate the -EBUSY error code from tg3_readphy() instead of hard coding it here. And really the original code is confusing to look at. Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Nithin Nayak Sujir <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06ipv4: Fix runtime WARNING in rtmsg_ifa()Geert Uytterhoeven1-1/+2
On m68k/ARAnyM: WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99() Modules linked in: CPU: 0 PID: 407 Comm: ifconfig Not tainted 3.13.0-atari-09263-g0c71d68014d1 #1378 Stack from 10c4fdf0: 10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f 00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080 c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001 00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002 008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90 eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020 Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c [<00024416>] warn_slowpath_null+0x14/0x1a [<002501b4>] rtmsg_ifa+0xdc/0xf0 [<00250a5a>] __inet_insert_ifa+0xd6/0x1c2 [<0024f9b4>] inet_abc_len+0x0/0x42 [<00250b52>] inet_insert_ifa+0xc/0x12 [<00252390>] devinet_ioctl+0x2ae/0x5d6 Adding some debugging code reveals that net_fill_ifaddr() fails in put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp, preferred, valid)) nla_put complains: lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20 Apparently commit 5c766d642bcaffd0c2a5b354db2068515b3846cf ("ipv4: introduce address lifetime") forgot to take into account the addition of struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already done for ipv6. Suggested-by: Cong Wang <[email protected]> Signed-off-by: Geert Uytterhoeven <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06bnx2[x]: Make module parameters readableJames M Leddy2-7/+7
Occasionally users want to know what parameters their Broadcom drivers are running with. For example, a user may want to know if MSI is disabled. This patch has been compile tested. Signed-off-by: James M Leddy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06net: irda: ep7211-sir: Remove driverAlexander Shiyan3-78/+0
This patch removes old and unsupported CLPS711X IrDA driver. Support for IrDA for CLPS711X serial port now provided by commit 4a33f1f59abd (serial: clps711x: Add support for N_IRDA line discipline), so IrDA-mode can be turned ON with "irattach" tool through "irtty" driver. Signed-off-by: Alexander Shiyan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06ARM: sunxi: dt: Convert to the new net compatiblesMaxime Ripard3-6/+6
Switch the device tree to the new compatibles introduced in the ethernet and mdio drivers to have a common pattern accross all Allwinner SoCs. Signed-off-by: Maxime Ripard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06net: phy: sunxi: Add new compatiblesMaxime Ripard2-2/+6
The Allwinner A10 compatibles were following a slightly different compatible patterns than the rest of the SoCs for historical reasons. Add compatibles matching the other pattern to the mdio driver for consistency, and keep the older one for backward compatibility. Signed-off-by: Maxime Ripard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06net: ethernet: sunxi: Add new compatiblesMaxime Ripard2-2/+6
The Allwinner A10 compatibles were following a slightly different compatible patterns than the rest of the SoCs for historical reasons. Add compatibles matching the other pattern to the ethernet driver for consistency, and keep the older one for backward compatibility. Signed-off-by: Maxime Ripard <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06rtl8180: Add error check for pci_map_single return value in TX pathandrea.merello1-0/+7
Orignal code will not detect a DMA mapping failure, causing the HW to attempt a DMA from an invalid address. This patch add the error check and eventually simply drops the TX packet if we can't map it for DMA. Signed-off-by: andrea merello <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2014-02-06rtl8180: Add error check for pci_map_single return value in RX pathandrea.merello1-3/+13
In original code the old RX DMA buffer is unmapped and processed and at the end of the isr a new buffer is mapped with pci_map_single and attached to the RX descriptor. If pci_map_single fails then the RX descriptor remains with no valid DMA buffer attached. In this condition the DMA will target where it shouldn't with obvious evil consequences. Simply avoiding re-arming the descriptor will prevent buggy DMA but it will result soon in RX stuck. This patch move the DMA mapping of the new buffer at the beginning of the ISR (and it adds error check for pci_map_single success/fail). If the DMA mapping fails then we do not unmap the old buffer and we re-arm the descriptor without processing it, with the old DMA buffer still attached. In this way we lose the currently RX-ed packet, but whenever next calls to pci_map_single will succeed again,then the RX process will go on without stuck. Signed-off-by: andrea merello <[email protected]> Signed-off-by: John W. Linville <[email protected]>
2014-02-06Merge branch 'for-john' of ↵John W. Linville11-83/+102
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
2014-02-06netfilter: nf_tables: fix racy rule deletionPablo Neira Ayuso2-21/+23
We may lost race if we flush the rule-set (which happens asynchronously via call_rcu) and we try to remove the table (that userspace assumes to be empty). Fix this by recovering synchronous rule and chain deletion. This was introduced time ago before we had no batch support, and synchronous rule deletion performance was not good. Now that we have the batch support, we can just postpone the purge of old rule in a second step in the commit phase. All object deletions are synchronous after this patch. As a side effect, we save memory as we don't need rcu_head per rule anymore. Cc: Patrick McHardy <[email protected]> Reported-by: Arturo Borrero Gonzalez <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-06netfilter: nf_tables: fix log/queue expressions for NFPROTO_INETPatrick McHardy2-7/+2
The log and queue expressions both store the family during ->init() and use it to deliver packets. This is wrong when used in NFPROTO_INET since they should both deliver to the actual AF of the packet, not the dummy NFPROTO_INET. Use the family from the hook ops to fix this. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-06mac80211: fix virtual monitor interface iterationJohannes Berg1-4/+8
During channel context assignment, the interface should be found by interface iteration, so we need to assign the pointer before the channel context. Reported-by: Emmanuel Grumbach <[email protected]> Tested-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06mac80211: fix fragmentation code, particularly for encryptionJohannes Berg1-1/+1
The "new" fragmentation code (since my rewrite almost 5 years ago) erroneously sets skb->len rather than using skb_trim() to adjust the length of the first fragment after copying out all the others. This leaves the skb tail pointer pointing to after where the data originally ended, and thus causes the encryption MIC to be written at that point, rather than where it belongs: immediately after the data. The impact of this is that if software encryption is done, then a) encryption doesn't work for the first fragment, the connection becomes unusable as the first fragment will never be properly verified at the receiver, the MIC is practically guaranteed to be wrong b) we leak up to 8 bytes of plaintext (!) of the packet out into the air This is only mitigated by the fact that many devices are capable of doing encryption in hardware, in which case this can't happen as the tail pointer is irrelevant in that case. Additionally, fragmentation is not used very frequently and would normally have to be configured manually. Fix this by using skb_trim() properly. Cc: [email protected] Fixes: 2de8e0d999b8 ("mac80211: rewrite fragmentation") Reported-by: Jouni Malinen <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06mac80211: Fix IBSS disconnectSujith Manoharan1-4/+1
Currently, when a station leaves an IBSS network, the corresponding BSS is not dropped from cfg80211 if there are other active stations in the network. But, the small window that is present when trying to determine a station's status based on IEEE80211_IBSS_MERGE_INTERVAL introduces a race. Instead of trying to keep the BSS, always remove it when leaving an IBSS network. There is not much benefit to retain the BSS entry since it will be added with a subsequent join operation. This fixes an issue where a dangling BSS entry causes ath9k to wait for a beacon indefinitely. Cc: <[email protected]> Reported-by: Simon Wunderlich <[email protected]> Signed-off-by: Sujith Manoharan <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06mac80211: release the channel in error path in start_apEmmanuel Grumbach1-1/+4
When the driver cannot start the AP or when the assignement of the beacon goes wrong, we need to unassign the vif. Cc: [email protected] Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06cfg80211: send scan results from work queueJohannes Berg6-42/+45
Due to the previous commit, when a scan finishes, it is in theory possible to hit the following sequence: 1. interface starts being removed 2. scan is cancelled by driver and cfg80211 is notified 3. scan done work is scheduled 4. interface is removed completely, rdev->scan_req is freed, event sent to userspace but scan done work remains pending 5. new scan is requested on another virtual interface 6. scan done work runs, freeing the still-running scan To fix this situation, hang on to the scan done message and block new scans while that is the case, and only send the message from the work function, regardless of whether the scan_req is already freed from interface removal. This makes step 5 above impossible and changes step 6 to be 5. scan done work runs, sending the scan done message As this can't work for wext, so we send the message immediately, but this shouldn't be an issue since we still return -EBUSY. Signed-off-by: Johannes Berg <[email protected]>
2014-02-06cfg80211: fix scan done raceJohannes Berg1-4/+10
When an interface/wdev is removed, any ongoing scan should be cancelled by the driver. This will make it call cfg80211, which only queues a work struct. If interface/wdev removal is quick enough, this can leave the scan request pending and processed only after the interface is gone, causing a use-after-free. Fix this by making sure the scan request is not pending after the interface is destroyed. We can't flush or cancel the work item due to locking concerns, but when it'll run it shouldn't find anything to do. This leaves a potential issue, if a new scan gets requested before the work runs, it prematurely stops the running scan, potentially causing another crash. I'll fix that in the next patch. This was particularly observed with P2P_DEVICE wdevs, likely because freeing them is quicker than freeing netdevs. Reported-by: Andrei Otcheretianski <[email protected]> Fixes: 4a58e7c38443 ("cfg80211: don't "leak" uncompleted scans") Signed-off-by: Johannes Berg <[email protected]>
2014-02-06mac80211: avoid deadlock revealed by lockdepEmmanuel Grumbach3-7/+15
sdata->u.ap.request_smps_work can’t be flushed synchronously under wdev_lock(wdev) since ieee80211_request_smps_ap_work itself locks the same lock. While at it, reset the driver_smps_mode when the ap is stopped to its default: OFF. This solves: ====================================================== [ INFO: possible circular locking dependency detected ] 3.12.0-ipeer+ #2 Tainted: G O ------------------------------------------------------- rmmod/2867 is trying to acquire lock: ((&sdata->u.ap.request_smps_work)){+.+...}, at: [<c105b8d0>] flush_work+0x0/0x90 but task is already holding lock: (&wdev->mtx){+.+.+.}, at: [<f9b32626>] cfg80211_stop_ap+0x26/0x230 [cfg80211] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&wdev->mtx){+.+.+.}: [<c10aefa9>] lock_acquire+0x79/0xe0 [<c1607a1a>] mutex_lock_nested+0x4a/0x360 [<fb06288b>] ieee80211_request_smps_ap_work+0x2b/0x50 [mac80211] [<c105cdd8>] process_one_work+0x198/0x450 [<c105d469>] worker_thread+0xf9/0x320 [<c10669ff>] kthread+0x9f/0xb0 [<c1613397>] ret_from_kernel_thread+0x1b/0x28 -> #0 ((&sdata->u.ap.request_smps_work)){+.+...}: [<c10ae9df>] __lock_acquire+0x183f/0x1910 [<c10aefa9>] lock_acquire+0x79/0xe0 [<c105b917>] flush_work+0x47/0x90 [<c105d867>] __cancel_work_timer+0x67/0xe0 [<c105d90f>] cancel_work_sync+0xf/0x20 [<fb0765cc>] ieee80211_stop_ap+0x8c/0x340 [mac80211] [<f9b3268c>] cfg80211_stop_ap+0x8c/0x230 [cfg80211] [<f9b0d8f9>] cfg80211_leave+0x79/0x100 [cfg80211] [<f9b0da72>] cfg80211_netdev_notifier_call+0xf2/0x4f0 [cfg80211] [<c160f2c9>] notifier_call_chain+0x59/0x130 [<c106c6de>] __raw_notifier_call_chain+0x1e/0x30 [<c106c70f>] raw_notifier_call_chain+0x1f/0x30 [<c14f8213>] call_netdevice_notifiers_info+0x33/0x70 [<c14f8263>] call_netdevice_notifiers+0x13/0x20 [<c14f82a4>] __dev_close_many+0x34/0xb0 [<c14f83fe>] dev_close_many+0x6e/0xc0 [<c14f9c77>] rollback_registered_many+0xa7/0x1f0 [<c14f9dd4>] unregister_netdevice_many+0x14/0x60 [<fb06f4d9>] ieee80211_remove_interfaces+0xe9/0x170 [mac80211] [<fb055116>] ieee80211_unregister_hw+0x56/0x110 [mac80211] [<fa3e9396>] iwl_op_mode_mvm_stop+0x26/0xe0 [iwlmvm] [<f9b9d8ca>] _iwl_op_mode_stop+0x3a/0x70 [iwlwifi] [<f9b9d96f>] iwl_opmode_deregister+0x6f/0x90 [iwlwifi] [<fa405179>] __exit_compat+0xd/0x19 [iwlmvm] [<c10b8bf9>] SyS_delete_module+0x179/0x2b0 [<c1613421>] sysenter_do_call+0x12/0x32 Fixes: 687da132234f ("mac80211: implement SMPS for AP") Cc: <[email protected]> [3.13] Reported-by: Ilan Peer <[email protected]> Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06cfg80211: re-enable 5/10 MHz supportJohannes Berg1-3/+0
Unfortunately I forgot this during the merge window, but the patch seems small enough to go in as a fix. The userspace API bug that was the reason for disabling it has long been fixed. Signed-off-by: Johannes Berg <[email protected]>
2014-02-06nl80211: Reset split_start when netlink skb is exhaustedPontus Fuchs1-1/+2
When the netlink skb is exhausted split_start is left set. In the subsequent retry, with a larger buffer, the dump is continued from the failing point instead of from the beginning. This was causing my rt28xx based USB dongle to now show up when running "iw list" with an old iw version without split dump support. Cc: [email protected] Fixes: 3713b4e364ef ("nl80211: allow splitting wiphy information in dumps") Signed-off-by: Pontus Fuchs <[email protected]> [avoid the entire workaround when state->split is set] Signed-off-by: Johannes Berg <[email protected]>
2014-02-06mac80211: move roc cookie assignment earlierEliad Peller1-18/+18
ieee80211_start_roc_work() might add a new roc to existing roc, and tell cfg80211 it has already started. However, this might happen before the roc cookie was set, resulting in REMAIN_ON_CHANNEL (started) event with null cookie. Consequently, it can make wpa_supplicant go out of sync. Fix it by setting the roc cookie earlier. Cc: [email protected] Signed-off-by: Eliad Peller <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2014-02-06netfilter: nf_tables: add reject module for NFPROTO_INETPatrick McHardy6-6/+85
Add a reject module for NFPROTO_INET. It does nothing but dispatch to the AF-specific modules based on the hook family. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-06netfilter: nft_reject: split up reject module into IPv4 and IPv6 specifc partsPatrick McHardy9-81/+187
Currently the nft_reject module depends on symbols from ipv6. This is wrong since no generic module should force IPv6 support to be loaded. Split up the module into AF-specific and a generic part. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05Merge branch 'fixes' of ↵David S. Miller3-53/+60
git://git.kernel.org/pub/scm/linux/kernel/git/jesse/openvswitch Jesse Gross says: ==================== Open vSwitch A handful of bug fixes for net/3.14. High level fixes are: * Regressions introduced by the zerocopy changes, particularly with old userspaces. * A few bugs lingering from the introduction of megaflows. * Overly zealous error checking that is now being triggered frequently in common cases. ==================== Signed-off-by: David S. Miller <[email protected]>
2014-02-05xen-netback: Fix Rx stall due to race conditionZoltan Kiss3-16/+7
The recent patch to fix receive side flow control (11b57f90257c1d6a91cee720151b69e0c2020cf6: xen-netback: stop vif thread spinning if frontend is unresponsive) solved the spinning thread problem, however caused an another one. The receive side can stall, if: - [THREAD] xenvif_rx_action sets rx_queue_stopped to true - [INTERRUPT] interrupt happens, and sets rx_event to true - [THREAD] then xenvif_kthread sets rx_event to false - [THREAD] rx_work_todo doesn't return true anymore Also, if interrupt sent but there is still no room in the ring, it take quite a long time until xenvif_rx_action realize it. This patch ditch that two variable, and rework rx_work_todo. If the thread finds it can't fit more skb's into the ring, it saves the last slot estimation into rx_last_skb_slots, otherwise it's kept as 0. Then rx_work_todo will check if: - there is something to send to the ring (like before) - there is space for the topmost packet in the queue I think that's more natural and optimal thing to test than two bool which are set somewhere else. Signed-off-by: Zoltan Kiss <[email protected]> Reviewed-by: Paul Durrant <[email protected]> Acked-by: Wei Liu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-06netfilter: nf_tables: add AF specific expression supportPatrick McHardy2-6/+21
For the reject module, we need to add AF-specific implementations to get rid of incorrect module dependencies. Try to load an AF-specific module first and fall back to generic modules. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-06netfilter: nft_ct: fix missing NFT_CT_L3PROTOCOL key in validity checksPatrick McHardy1-0/+1
The key was missing in the list of valid keys, add it. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-06netfilter: nf_tables: fix potential oops when dumping setsPatrick McHardy1-3/+5
Commit c9c8e48597 (netfilter: nf_tables: dump sets in all existing families) changed nft_ctx_init_from_setattr() to only look up the address family if it is not NFPROTO_UNSPEC. However if it is NFPROTO_UNSPEC and a table attribute is given, nftables_afinfo_lookup() will dereference the NULL afi pointer. Fix by checking for non-NULL afi and also move a check added by that commit to the proper position. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05netfilter: nf_tables: fix overrun in nf_tables_set_alloc_name()Patrick McHardy1-2/+2
The map that is used to allocate anonymous sets is indeed BITS_PER_BYTE * PAGE_SIZE long. Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05netfilter: nf_conntrack: don't release a conntrack with non-zero refcntPablo Neira Ayuso4-14/+34
With this patch, the conntrack refcount is initially set to zero and it is bumped once it is added to any of the list, so we fulfill Eric's golden rule which is that all released objects always have a refcount that equals zero. Andrey Vagin reports that nf_conntrack_free can't be called for a conntrack with non-zero ref-counter, because it can race with nf_conntrack_find_get(). A conntrack slab is created with SLAB_DESTROY_BY_RCU. Non-zero ref-counter says that this conntrack is used. So when we release a conntrack with non-zero counter, we break this assumption. CPU1 CPU2 ____nf_conntrack_find() nf_ct_put() destroy_conntrack() ... init_conntrack __nf_conntrack_alloc (set use = 1) atomic_inc_not_zero(&ct->use) (use = 2) if (!l4proto->new(ct, skb, dataoff, timeouts)) nf_conntrack_free(ct); (use = 2 !!!) ... __nf_conntrack_alloc (set use = 1) if (!nf_ct_key_equal(h, tuple, zone)) nf_ct_put(ct); (use = 0) destroy_conntrack() /* continue to work with CT */ After applying the path "[PATCH] netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_get" another bug was triggered in destroy_conntrack(): <4>[67096.759334] ------------[ cut here ]------------ <2>[67096.759353] kernel BUG at net/netfilter/nf_conntrack_core.c:211! ... <4>[67096.759837] Pid: 498649, comm: atdd veid: 666 Tainted: G C --------------- 2.6.32-042stab084.18 #1 042stab084_18 /DQ45CB <4>[67096.759932] RIP: 0010:[<ffffffffa03d99ac>] [<ffffffffa03d99ac>] destroy_conntrack+0x15c/0x190 [nf_conntrack] <4>[67096.760255] Call Trace: <4>[67096.760255] [<ffffffff814844a7>] nf_conntrack_destroy+0x17/0x30 <4>[67096.760255] [<ffffffffa03d9bb5>] nf_conntrack_find_get+0x85/0x130 [nf_conntrack] <4>[67096.760255] [<ffffffffa03d9fb2>] nf_conntrack_in+0x352/0xb60 [nf_conntrack] <4>[67096.760255] [<ffffffffa048c771>] ipv4_conntrack_local+0x51/0x60 [nf_conntrack_ipv4] <4>[67096.760255] [<ffffffff81484419>] nf_iterate+0x69/0xb0 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814845d4>] nf_hook_slow+0x74/0x110 <4>[67096.760255] [<ffffffff814b5b00>] ? dst_output+0x0/0x20 <4>[67096.760255] [<ffffffff814b66d5>] raw_sendmsg+0x775/0x910 <4>[67096.760255] [<ffffffff8104c5a8>] ? flush_tlb_others_ipi+0x128/0x130 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814c136a>] inet_sendmsg+0x4a/0xb0 <4>[67096.760255] [<ffffffff81444e93>] ? sock_sendmsg+0x13/0x140 <4>[67096.760255] [<ffffffff81444f97>] sock_sendmsg+0x117/0x140 <4>[67096.760255] [<ffffffff8102e299>] ? native_smp_send_reschedule+0x49/0x60 <4>[67096.760255] [<ffffffff81519beb>] ? _spin_unlock_bh+0x1b/0x20 <4>[67096.760255] [<ffffffff8109d930>] ? autoremove_wake_function+0x0/0x40 <4>[67096.760255] [<ffffffff814960f0>] ? do_ip_setsockopt+0x90/0xd80 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff8100bc4e>] ? apic_timer_interrupt+0xe/0x20 <4>[67096.760255] [<ffffffff814457c9>] sys_sendto+0x139/0x190 <4>[67096.760255] [<ffffffff810efa77>] ? audit_syscall_entry+0x1d7/0x200 <4>[67096.760255] [<ffffffff810ef7c5>] ? __audit_syscall_exit+0x265/0x290 <4>[67096.760255] [<ffffffff81474daf>] compat_sys_socketcall+0x13f/0x210 <4>[67096.760255] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5 I have reused the original title for the RFC patch that Andrey posted and most of the original patch description. Cc: Eric Dumazet <[email protected]> Cc: Andrew Vagin <[email protected]> Cc: Florian Westphal <[email protected]> Reported-by: Andrew Vagin <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Andrew Vagin <[email protected]>
2014-02-05netfilter: nf_nat_h323: fix crash in nf_ct_unlink_expect_report()Alexey Dobriyan1-1/+4
Similar bug fixed in SIP module in 3f509c6 ("netfilter: nf_nat_sip: fix incorrect handling of EBUSY for RTCP expectation"). BUG: unable to handle kernel paging request at 00100104 IP: [<f8214f07>] nf_ct_unlink_expect_report+0x57/0xf0 [nf_conntrack] ... Call Trace: [<c0244bd8>] ? del_timer+0x48/0x70 [<f8215687>] nf_ct_remove_expectations+0x47/0x60 [nf_conntrack] [<f8211c99>] nf_ct_delete_from_lists+0x59/0x90 [nf_conntrack] [<f8212e5e>] death_by_timeout+0x14e/0x1c0 [nf_conntrack] [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack] [<c024442d>] call_timer_fn+0x1d/0x80 [<c024461e>] run_timer_softirq+0x18e/0x1a0 [<f8212d10>] ? nf_conntrack_set_hashsize+0x190/0x190 [nf_conntrack] [<c023e6f3>] __do_softirq+0xa3/0x170 [<c023e650>] ? __local_bh_enable+0x70/0x70 <IRQ> [<c023e587>] ? irq_exit+0x67/0xa0 [<c0202af6>] ? do_IRQ+0x46/0xb0 [<c027ad05>] ? clockevents_notify+0x35/0x110 [<c066ac6c>] ? common_interrupt+0x2c/0x40 [<c056e3c1>] ? cpuidle_enter_state+0x41/0xf0 [<c056e6fb>] ? cpuidle_idle_call+0x8b/0x100 [<c02085f8>] ? arch_cpu_idle+0x8/0x30 [<c027314b>] ? cpu_idle_loop+0x4b/0x140 [<c0273258>] ? cpu_startup_entry+0x18/0x20 [<c066056d>] ? rest_init+0x5d/0x70 [<c0813ac8>] ? start_kernel+0x2ec/0x2f2 [<c081364f>] ? repair_env_string+0x5b/0x5b [<c0813269>] ? i386_start_kernel+0x33/0x35 Signed-off-by: Alexey Dobriyan <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05netfilter: nf_conntrack: fix RCU race in nf_conntrack_find_getAndrey Vagin1-4/+17
Lets look at destroy_conntrack: hlist_nulls_del_rcu(&ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode); ... nf_conntrack_free(ct) kmem_cache_free(net->ct.nf_conntrack_cachep, ct); net->ct.nf_conntrack_cachep is created with SLAB_DESTROY_BY_RCU. The hash is protected by rcu, so readers look up conntracks without locks. A conntrack is removed from the hash, but in this moment a few readers still can use the conntrack. Then this conntrack is released and another thread creates conntrack with the same address and the equal tuple. After this a reader starts to validate the conntrack: * It's not dying, because a new conntrack was created * nf_ct_tuple_equal() returns true. But this conntrack is not initialized yet, so it can not be used by two threads concurrently. In this case BUG_ON may be triggered from nf_nat_setup_info(). Florian Westphal suggested to check the confirm bit too. I think it's right. task 1 task 2 task 3 nf_conntrack_find_get ____nf_conntrack_find destroy_conntrack hlist_nulls_del_rcu nf_conntrack_free kmem_cache_free __nf_conntrack_alloc kmem_cache_alloc memset(&ct->tuplehash[IP_CT_DIR_MAX], if (nf_ct_is_dying(ct)) if (!nf_ct_tuple_equal() I'm not sure, that I have ever seen this race condition in a real life. Currently we are investigating a bug, which is reproduced on a few nodes. In our case one conntrack is initialized from a few tasks concurrently, we don't have any other explanation for this. <2>[46267.083061] kernel BUG at net/ipv4/netfilter/nf_nat_core.c:322! ... <4>[46267.083951] RIP: 0010:[<ffffffffa01e00a4>] [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 [nf_nat] ... <4>[46267.085549] Call Trace: <4>[46267.085622] [<ffffffffa023421b>] alloc_null_binding+0x5b/0xa0 [iptable_nat] <4>[46267.085697] [<ffffffffa02342bc>] nf_nat_rule_find+0x5c/0x80 [iptable_nat] <4>[46267.085770] [<ffffffffa0234521>] nf_nat_fn+0x111/0x260 [iptable_nat] <4>[46267.085843] [<ffffffffa0234798>] nf_nat_out+0x48/0xd0 [iptable_nat] <4>[46267.085919] [<ffffffff814841b9>] nf_iterate+0x69/0xb0 <4>[46267.085991] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0 <4>[46267.086063] [<ffffffff81484374>] nf_hook_slow+0x74/0x110 <4>[46267.086133] [<ffffffff81494e70>] ? ip_finish_output+0x0/0x2f0 <4>[46267.086207] [<ffffffff814b5890>] ? dst_output+0x0/0x20 <4>[46267.086277] [<ffffffff81495204>] ip_output+0xa4/0xc0 <4>[46267.086346] [<ffffffff814b65a4>] raw_sendmsg+0x8b4/0x910 <4>[46267.086419] [<ffffffff814c10fa>] inet_sendmsg+0x4a/0xb0 <4>[46267.086491] [<ffffffff814459aa>] ? sock_update_classid+0x3a/0x50 <4>[46267.086562] [<ffffffff81444d67>] sock_sendmsg+0x117/0x140 <4>[46267.086638] [<ffffffff8151997b>] ? _spin_unlock_bh+0x1b/0x20 <4>[46267.086712] [<ffffffff8109d370>] ? autoremove_wake_function+0x0/0x40 <4>[46267.086785] [<ffffffff81495e80>] ? do_ip_setsockopt+0x90/0xd80 <4>[46267.086858] [<ffffffff8100be0e>] ? call_function_interrupt+0xe/0x20 <4>[46267.086936] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90 <4>[46267.087006] [<ffffffff8118cb10>] ? ub_slab_ptr+0x20/0x90 <4>[46267.087081] [<ffffffff8118f2e8>] ? kmem_cache_alloc+0xd8/0x1e0 <4>[46267.087151] [<ffffffff81445599>] sys_sendto+0x139/0x190 <4>[46267.087229] [<ffffffff81448c0d>] ? sock_setsockopt+0x16d/0x6f0 <4>[46267.087303] [<ffffffff810efa47>] ? audit_syscall_entry+0x1d7/0x200 <4>[46267.087378] [<ffffffff810ef795>] ? __audit_syscall_exit+0x265/0x290 <4>[46267.087454] [<ffffffff81474885>] ? compat_sys_setsockopt+0x75/0x210 <4>[46267.087531] [<ffffffff81474b5f>] compat_sys_socketcall+0x13f/0x210 <4>[46267.087607] [<ffffffff8104dea3>] ia32_sysret+0x0/0x5 <4>[46267.087676] Code: 91 20 e2 01 75 29 48 89 de 4c 89 f7 e8 56 fa ff ff 85 c0 0f 84 68 fc ff ff 0f b6 4d c6 41 8b 45 00 e9 4d fb ff ff e8 7c 19 e9 e0 <0f> 0b eb fe f6 05 17 91 20 e2 80 74 ce 80 3d 5f 2e 00 00 00 74 <1>[46267.088023] RIP [<ffffffffa01e00a4>] nf_nat_setup_info+0x564/0x590 Cc: Eric Dumazet <[email protected]> Cc: Florian Westphal <[email protected]> Cc: Pablo Neira Ayuso <[email protected]> Cc: Patrick McHardy <[email protected]> Cc: Jozsef Kadlecsik <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Cyrill Gorcunov <[email protected]> Signed-off-by: Andrey Vagin <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05netfilter: nf_tables: fix oops when deleting a chain with referencesPatrick McHardy1-1/+1
The following commands trigger an oops: # nft -i nft> add table filter nft> add chain filter input { type filter hook input priority 0; } nft> add chain filter test nft> add rule filter input jump test nft> delete chain filter test We need to check the chain use counter before allowing destruction since we might have references from sets or jump rules. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=69341 Reported-by: Matthew Ife <[email protected]> Tested-by: Matthew Ife <[email protected]> Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-05netfilter: nft_ct: fix unconditional dump of 'dir' attrArturo Borrero1-2/+13
We want to make sure that the information that we get from the kernel can be reinjected without troubles. The kernel shouldn't return an attribute that is not required, or even prohibited. Dumping unconditionally NFTA_CT_DIRECTION could lead an application in userspace to interpret that the attribute was originally set, while it was not. Signed-off-by: Arturo Borrero Gonzalez <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2014-02-04openvswitch: Suppress error messages on megaflow updatesAndy Zhou1-4/+1
With subfacets, we'd expect megaflow updates message to carry the original micro flow. If not, EINVAL is returned and kernel logs an error message. Now that the user space subfacet layer is removed, it is expected that flow updates can arrive with a micro flow other than the original. Change the return code to EEXIST and remove the kernel error log message. Reported-by: Ben Pfaff <[email protected]> Signed-off-by: Andy Zhou <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2014-02-04openvswitch: Fix ovs_flow_free() ovs-lock assert.Pravin B Shelar1-2/+4
ovs_flow_free() is not called under ovs-lock during packet execute path (ovs_packet_cmd_execute()). Since packet execute does not touch flow->mask, there is no need to take that lock either. So move assert in case where flow->mask is checked. Found by code inspection. Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2014-02-04openvswitch: Fix ovs_dp_cmd_msg_size()Daniele Di Proietto1-0/+1
commit 43d4be9cb55f3bac5253e9289996fd9d735531db (openvswitch: Allow user space to announce ability to accept unaligned Netlink messages) introduced OVS_DP_ATTR_USER_FEATURES netlink attribute in datapath responses, but the attribute size was not taken into account in ovs_dp_cmd_msg_size(). Signed-off-by: Daniele Di Proietto <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2014-02-04openvswitch: Fix kernel panic on ovs_flow_freeAndy Zhou3-48/+47
Both mega flow mask's reference counter and per flow table mask list should only be accessed when holding ovs_mutex() lock. However this is not true with ovs_flow_table_flush(). The patch fixes this bug. Reported-by: Joe Stringer <[email protected]> Signed-off-by: Andy Zhou <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2014-02-04openvswitch: Pad OVS_PACKET_ATTR_PACKET if linear copy was performedThomas Graf1-0/+8
While the zerocopy method is correctly omitted if user space does not support unaligned Netlink messages. The attribute is still not padded correctly as skb_zerocopy() will not ensure padding and the attribute size is no longer pre calculated though nla_reserve() which ensured padding previously. This patch applies appropriate padding if a linear data copy was performed in skb_zerocopy(). Signed-off-by: Thomas Graf <[email protected]> Acked-by: Zoltan Kiss <[email protected]> Signed-off-by: Jesse Gross <[email protected]>
2014-02-04xen-netfront: handle backend CLOSED without CLOSINGDavid Vrabel1-1/+4
Backend drivers shouldn't transistion to CLOSED unless the frontend is CLOSED. If a backend does transition to CLOSED too soon then the frontend may not see the CLOSING state and will not properly shutdown. So, treat an unexpected backend CLOSED state the same as CLOSING. Signed-off-by: David Vrabel <[email protected]> Acked-by: Konrad Rzeszutek Wilk <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-04bnx2x: fix L2-GRE TCP issuesDmitry Kravkov1-1/+1
When configuring GRE tunnel using OVS, tcp stream is distributed over all RSS queues which may cause TCP reordering. It happens since OVS uses L2GRE protocol when kernel gre uses IPGRE. Patch defaults gre tunnel to L2GRE which allows proper RSS for L2GRE packets and (implicitly) disables RSS for IPGRE traffic. Signed-off-by: Dmitry Kravkov <[email protected]> Signed-off-by: Ariel Elior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-04net: qmi_wwan: add Netgear Aircard 340UBjørn Mork1-0/+1
This device was mentioned in an OpenWRT forum. Seems to have a "standard" Sierra Wireless ifnumber to function layout: 0: qcdm 2: nmea 3: modem 8: qmi 9: storage Signed-off-by: Bjørn Mork <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-04rtnetlink: fix oops in rtnl_link_get_slave_info_data_sizeFernando Luis Vazquez Cao1-1/+1
We should check whether rtnetlink link operations are defined before calling get_slave_size(). Without this, the following oops can occur when adding a tap device to OVS. [ 87.839553] BUG: unable to handle kernel NULL pointer dereference at 00000000000000a8 [ 87.839595] IP: [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220 [...] [ 87.840651] Call Trace: [ 87.840664] [<ffffffff813d694b>] ? rtmsg_ifinfo+0x2b/0x100 [ 87.840688] [<ffffffff813c8340>] ? __netdev_adjacent_dev_insert+0x150/0x1a0 [ 87.840718] [<ffffffff813d6a50>] ? rtnetlink_event+0x30/0x40 [ 87.840742] [<ffffffff814b4144>] ? notifier_call_chain+0x44/0x70 [ 87.840768] [<ffffffff813c8946>] ? __netdev_upper_dev_link+0x3c6/0x3f0 [ 87.840798] [<ffffffffa0678d6c>] ? netdev_create+0xcc/0x160 [openvswitch] [ 87.840828] [<ffffffffa06781ea>] ? ovs_vport_add+0x4a/0xd0 [openvswitch] [ 87.840857] [<ffffffffa0670139>] ? new_vport+0x9/0x50 [openvswitch] [ 87.840884] [<ffffffffa067279e>] ? ovs_vport_cmd_new+0x11e/0x210 [openvswitch] [ 87.840915] [<ffffffff813f3efa>] ? genl_family_rcv_msg+0x19a/0x360 [ 87.840941] [<ffffffff813f40c0>] ? genl_family_rcv_msg+0x360/0x360 [ 87.840967] [<ffffffff813f4139>] ? genl_rcv_msg+0x79/0xc0 [ 87.840991] [<ffffffff813b6cf9>] ? __kmalloc_reserve.isra.25+0x29/0x80 [ 87.841018] [<ffffffff813f2389>] ? netlink_rcv_skb+0xa9/0xc0 [ 87.841042] [<ffffffff813f27cf>] ? genl_rcv+0x1f/0x30 [ 87.841064] [<ffffffff813f1988>] ? netlink_unicast+0xe8/0x1e0 [ 87.841088] [<ffffffff813f1d9a>] ? netlink_sendmsg+0x31a/0x750 [ 87.841113] [<ffffffff813aee96>] ? sock_sendmsg+0x86/0xc0 [ 87.841136] [<ffffffff813c960d>] ? __netdev_update_features+0x4d/0x200 [ 87.841163] [<ffffffff813ca94e>] ? ethtool_get_value+0x2e/0x50 [ 87.841188] [<ffffffff813af269>] ? ___sys_sendmsg+0x359/0x370 [ 87.841212] [<ffffffff813da686>] ? dev_ioctl+0x1a6/0x5c0 [ 87.841236] [<ffffffff8109c210>] ? autoremove_wake_function+0x30/0x30 [ 87.841264] [<ffffffff813ac59d>] ? sock_do_ioctl+0x3d/0x50 [ 87.841288] [<ffffffff813aca68>] ? sock_ioctl+0x1e8/0x2c0 [ 87.841312] [<ffffffff811934bf>] ? do_vfs_ioctl+0x2cf/0x4b0 [ 87.841335] [<ffffffff813afeb9>] ? __sys_sendmsg+0x39/0x70 [ 87.841362] [<ffffffff814b86f9>] ? system_call_fastpath+0x16/0x1b [ 87.841386] Code: c0 74 10 48 89 ef ff d0 83 c0 07 83 e0 fc 48 98 49 01 c7 48 89 ef e8 d0 d6 fe ff 48 85 c0 0f 84 df 00 00 00 48 8b 90 08 07 00 00 <48> 8b 8a a8 00 00 00 31 d2 48 85 c9 74 0c 48 89 ee 48 89 c7 ff [ 87.841529] RIP [<ffffffff813d47c0>] if_nlmsg_size+0xf0/0x220 [ 87.841555] RSP <ffff880221aa5950> [ 87.841569] CR2: 00000000000000a8 [ 87.851442] ---[ end trace e42ab217691b4fc2 ]--- Signed-off-by: Fernando Luis Vazquez Cao <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-04ptp: Allow selecting trigger/event index in testptpStefan Sørensen1-3/+8
Currently the trigger/event is hardcoded to 0, this patch adds a new command line argument -i to select an arbitrary trigger/ event. Signed-off-by: Stefan Sørensen <[email protected]> Acked-by: Richard Cochran <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2014-02-04net: ethoc: set up MII management bus clockMax Filippov2-2/+31
MII management bus clock is derived from the MAC clock by dividing it by MIIMODER register CLKDIV field value. This value may need to be set up in case it is undefined or its default value is too high (and communication with PHY is too slow) or too low (and communication with PHY is impossible). The value of CLKDIV is not specified directly, but is derived from the MAC clock for the default MII management bus frequency of 2.5MHz. The MAC clock may be specified in the platform data, or in the 'clocks' device tree attribute. Signed-off-by: Max Filippov <[email protected]> Signed-off-by: David S. Miller <[email protected]>