aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-04-13net: force inlining of netif_tx_start/stop_queue, sock_hold, __sock_putDenys Vlasenko2-4/+4
Sometimes gcc mysteriously doesn't inline very small functions we expect to be inlined. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122 Arguably, gcc should do better, but gcc people aren't willing to invest time into it, asking to use __always_inline instead. With this .config: http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os, the following functions get deinlined many times. netif_tx_stop_queue: 207 copies, 590 calls: 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 80 8f e0 01 00 00 01 lock orb $0x1,0x1e0(%rdi) 5d pop %rbp c3 retq netif_tx_start_queue: 47 copies, 111 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 80 a7 e0 01 00 00 fe lock andb $0xfe,0x1e0(%rdi) 5d pop %rbp c3 retq sock_hold: 39 copies, 124 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 ff 87 80 00 00 00 lock incl 0x80(%rdi) 5d pop %rbp c3 retq __sock_put: 6 copies, 13 calls 55 push %rbp 48 89 e5 mov %rsp,%rbp f0 ff 8f 80 00 00 00 lock decl 0x80(%rdi) 5d pop %rbp c3 retq This patch fixes this via s/inline/__always_inline/. Code size decrease after the patch is ~2.5k: text data bss dec hex filename 56719876 56364551 36196352 149280779 8e5d80b vmlinux_before 56717440 56364551 36196352 149278343 8e5ce87 vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: David S. Miller <davem@davemloft.net> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org CC: netfilter-devel@vger.kernel.org Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13Doc: networking: Fix typo in dsaMasanari Iida2-2/+2
This patch fix typos in Documentation/networking/dsa. Signed-off-by: Masanari Iida <standby24x7@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13ipv6, token: allow for clearing the current device tokenDaniel Borkmann1-4/+6
The original tokenized iid support implemented via f53adae4eae5 ("net: ipv6: add tokenized interface identifier support") didn't allow for clearing a device token as it was intended that this addressing mode was the only one active for globally scoped IPv6 addresses. Later we relaxed that restriction via 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses"), and we should also allow for clearing tokens as there's no good reason why it shouldn't be allowed. Fixes: 617fe29d45bd ("net: ipv6: only invalidate previously tokenized addresses") Reported-by: Robin H. Johnson <robbat2@gentoo.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13sock: tigthen lockdep checks for sock_owned_by_userHannes Frederic Sowa6-23/+35
sock_owned_by_user should not be used without socket lock held. It seems to be a common practice to check .owned before lock reclassification, so provide a little help to abstract this check away. Cc: linux-cifs@vger.kernel.org Cc: linux-bluetooth@vger.kernel.org Cc: linux-nfs@vger.kernel.org Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: ethernet: stmmac: GMAC4.xx: Fix TX descriptor preparationAlexandre TORGUE1-8/+1
On GMAC4.xx each descriptor contains 2 buffers of 16KB (each). Initially, those 2 buffers was filled in dwmac4_rd_prepare_tx_desc but it is actually not needed. Indeed, stmmac driver supports frame up to 9000 bytes (jumbo). So only one buffer is needed. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alexandre TORGUE <alexandre.torgue@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13Merge branch 'udp-hdrs-fixes'David S. Miller2-4/+3
Willem de Bruijn says: ==================== fix two more udp pull header issues Follow up patches to the fixes to RxRPC and SunRPC. A scan of the code showed two other interfaces that expect UDP packets to have a udphdr when queued: read packet length with ioctl SIOCINQ and receive payload checksum with socket option IP_CHECKSUM. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13udp: do not expect udp headers in recv cmsg IP_CMSG_CHECKSUMWillem de Bruijn2-2/+3
On udp sockets, recv cmsg IP_CMSG_CHECKSUM returns a checksum over the packet payload. Since commit e6afc8ace6dd pulled the headers, taking skb->data as the start of transport header is incorrect. Use the transport header pointer. Also, when peeking at an offset from the start of the packet, only return a checksum from the start of the peeked data. Note that the cmsg does not subtract a tail checkum when reading truncated data. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13udp: do not expect udp headers on ioctl SIOCINQWillem de Bruijn1-2/+0
On udp sockets, ioctl SIOCINQ returns the payload size of the first packet. Since commit e6afc8ace6dd pulled the headers, the result is incorrect when subtracting header length. Remove that operation. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13Merge branch 'dsa-refactoring-set-1'David S. Miller11-83/+171
Andrew Lunn says: ==================== DSA refactoring: set 1 There has been a long running effort to refractor DSA probing to make the switches true linux devices. Here are a small collection of patches moving in this direction. Most have been seen before. We take a little step forward by passing the dsa device point to the driver, thus allowing it to perform resource allocations using the normal mechanisms. This device structure will later be replaced by the devices own device structure. Future patches will add a true driver probe function, so we rename the current probe function, cleaning up the namespace. phys_port_mask continually confuses me, thinking it is about PHYs. But it is actually about ports enabled to the outside world. So rename it to enabled_port_mask. Lots more patches yet to follow, this is just doing some ground work. v2: enabled_port_mask instread of user_port_masks Added Tested-by's and Reviewed-by. ==================== Tested-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13dsa: mv88e6xxx: Use bus in mv88e6xxx_lookup_name()Andrew Lunn1-4/+8
mv88e6xxx_lookup_name() returns the model name of a switch at a given address on an MII bus. Using mii_bus to identify the bus rather than the host device is more logical, so change the parameter. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13dsa: Rename phys_port_mask to enabled_port_maskAndrew Lunn4-16/+17
The phys in phys_port_mask suggests this mask is about PHYs. In fact, it means physical ports. Rename to enabled_port_mask, indicating external enabled ports of the switch, which is hopefully less confusing. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: dsa: Rename DSA probe function.Andrew Lunn6-18/+24
Rename the function called from the DSA to perform a probe for the switch. This makes the normal _probe() name available for a standard Linux device driver probe function. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: dsa: Keep the mii bus and address in the private structureAndrew Lunn8-83/+89
Rather than looking up the mii bus and address every time, do it once at probe, and keep it in the private structure. Centralise this probe code in mv88e6xxx. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: dsa: Remove allocation of driver private memoryAndrew Lunn2-2/+1
The drivers now allocate their own memory for private usage. Remove the allocation from the core code. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: dsa: Have the switch driver allocate there own private memoryAndrew Lunn10-22/+86
Now the switch devices have a dev pointer, make use of it for allocating the drivers private data structures using a devm_kzalloc(). Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13net: dsa: Pass the dsa device to the switch driversAndrew Lunn8-10/+18
By passing a device structure to the switch devices, it allows them to use devm_* methods for resource management. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13Merge tag 'mac80211-next-for-davem-2016-04-13' of ↵David S. Miller230-1443/+1456
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== To synchronize with Kalle, here's just a big change that affects all drivers - removing the duplicated enum ieee80211_band and replacing it by enum nl80211_band. On top of that, just a small documentation update. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-13tipc: remove remnants of old broadcast codeJon Paul Maloy1-15/+0
We remove a couple of leftover fields in struct tipc_bearer. Those were used by the old broadcast implementation, and are not needed any longer. There is no functional changes in this commit. Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12Merge branch 'mediatek-stress-test-fixes'David S. Miller2-44/+66
John Crispin says: ==================== net: mediatek: make the driver pass stress tests While testing the driver we managed to get the TX path to stall and fail to recover. When dual MAC support was added to the driver, the whole queue stop/wake code was not properly adapted. There was also a regression in the locking of the xmit function. The fact that watchdog_timeo was not set and that the tx_timeout code failed to properly reset the dma, irq and queue just made the mess complete. This series make the driver pass stress testing. With this series applied the testbed has been running for several days and still has not locked up. We have a second setup that has a small hack patch applied to randomly stop irqs and/or one of the queues and successfully manages to recover from these simulated tx stalls. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: do not set the QID field in the TX DMA descriptorsJohn Crispin1-2/+1
The QID field gets set to the mac id. This made the DMA linked list queue the traffic of each MAC on a different internal queue. However during long term testing we found that this will cause traffic stalls as the multi queue setup requires a more complete initialisation which is not part of the upstream driver yet. This patch removes the code setting the QID field, resulting in all traffic ending up in queue 0 which works without any special setup. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: move the pending_work struct to the device generic structJohn Crispin2-10/+7
The worker always touches both netdevs. It is ethernet core and not MAC specific. We only need one worker, which belongs into the ethernets core struct. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: fix mtk_pending_workJohn Crispin1-8/+20
The driver supports 2 MACs. Both run on the same DMA ring. If we hit a TX timeout we need to stop both netdevs before restarting them again. If we don't do this, mtk_stop() wont shutdown DMA and the consecutive call to mtk_open() wont restart DMA and enable IRQs. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: fix TX lockingJohn Crispin1-10/+10
Inside the TX path there is a lock inside the tx_map function. This is however too late. The patch moves the lock to the start of the xmit function right before the free count check of the DMA ring happens. If we do not do this, the code becomes racy leading to TX stalls and dropped packets. This happens as there are 2 netdevs running on the same physical DMA ring. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: fix stop and wakeup of queueJohn Crispin1-10/+27
The driver supports 2 MACs. Both run on the same DMA ring. If we go above/below the TX rings threshold value, we always need to wake/stop the queue of both devices. Not doing to can cause TX stalls and packet drops on one of the devices. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: remove superfluous reset callJohn Crispin1-4/+0
HW reset is triggered in the mtk_hw_init() function. There is no need to also reset the core during probe. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: mtk_cal_txd_req() returns bad valueJohn Crispin1-1/+1
The code used to also support the PDMA engine, which had 2 packet pointers per descriptor. Because of this we had to divide the result by 2 and round it up. This is no longer needed as the code only supports QDMA. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12net: mediatek: watchdog_timeo was not setJohn Crispin1-0/+1
The original commit failed to set watchdog_timeo. This patch sets watchdog_timeo to HZ. Signed-off-by: John Crispin <blogic@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller8-129/+298
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for your net-next tree. 1) Define pr_fmt() in nf_conntrack, from Weongyo Jeong. 2) Define and register netfilter's afinfo for the bridge family, this comes in preparation for native nfqueue's bridge for nft, from Stephane Bryant. 3) Add new attributes to store layer 2 and VLAN headers to nfqueue, also from Stephane Bryant. 4) Parse new NFQA_VLAN and NFQA_L2HDR nfqueue netlink attributes coming from userspace, from Stephane Bryant. 5) Use net->ipv6.devconf_all->hop_limit instead of hardcoded hop_limit in IPv6 SYNPROXY, from Liping Zhang. 6) Remove unnecessary check for dst == NULL in nf_reject_ipv6, from Haishuang Yan. 7) Deinline ctnetlink event report functions, from Florian Westphal. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12netfilter: conntrack: move expectation event helper to ecache.cFlorian Westphal2-39/+33
Not performance critical, it is only invoked when an expectation is added/destroyed. While at it, kill unused nf_ct_expect_event() wrapper. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-12netfilter: conntrack: de-inline nf_conntrack_eventmask_reportFlorian Westphal2-54/+66
Way too large; move it to nf_conntrack_ecache.c. Reduces total object size by 1216 byte on my machine. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2016-04-12Merge branch 'for-upstream' of ↵David S. Miller10-39/+74
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2016-04-12 Here's a set of Bluetooth & 802.15.4 patches intended for the 4.7 kernel: - Fix for race condition in vhci driver - Memory leak fix for ieee802154/adf7242 driver - Improvements to deal with single-mode (LE-only) Bluetooth controllers - Fix for allowing the BT_SECURITY_FIPS security level - New BCM2E71 ACPI ID - NULL pointer dereference fix fox hci_ldisc driver Let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-12cfg80211: remove enum ieee80211_bandJohannes Berg230-1437/+1420
This enum is already perfectly aliased to enum nl80211_band, and the only reason for it is that we get IEEE80211_NUM_BANDS out of it. There's no really good reason to not declare the number of bands in nl80211 though, so do that and remove the cfg80211 one. Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-12cfg80211: Improve Connect/Associate command documentationJouni Malinen2-6/+36
The roaming cases for the Connect command were not fully covered and neither Connect nor Associate command uses of the prev_bssid parameter were very clear. Add details to describe how the prev_bssid argument is supposed to be used and when the driver should use association or reassociation. Signed-off-by: Jouni Malinen <jouni@qca.qualcomm.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2016-04-11net: mdio: Fix lockdep falls positive splatAndrew Lunn3-10/+15
MDIO devices can be stacked upon each other. The current code supports two levels, which until recently has been enough for a DSA mdio bus on top of another bus. Now we have hardware which has an MDIO mux in the middle. Define an MDIO MUTEX class with three levels. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11Merge branch 'rprpc-2nd-rewrite-part-1'David S. Miller20-326/+374
David Howells says: ==================== RxRPC: 2nd rewrite part 1 Okay, I'm in the process of rewriting the RxRPC rewrite. The primary aim of this second rewrite is to strictly control the number of active connections we know about and to get rid of connections we don't need much more quickly. On top of this, there are fixes to the protocol handling which will all occur in later parts. Here's the first set of patches from the second go, aimed at net-next. These are all fixes and cleanups preparatory to the main event. Notable parts of this set include: (1) A fix for the AFS filesystem to wait for outstanding calls to complete before closing the RxRPC socket. (2) Differentiation of local and remote abort codes. At a future point userspace will get to see this via control message data on recvmsg(). (3) Absorb the rxkad module into the af_rxrpc module to prevent a dependency loop. (4) Create a null security module and unconditionalise calls into the security module that's in force (there will always be a security module applied to a connection, even if it's just the null one). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Create a null security type and get rid of conditional callsDavid Howells9-61/+105
Create a null security type for security index 0 and get rid of all conditional calls to the security operations. We expect normally to be using security, so this should be of little negative impact. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Absorb the rxkad security moduleDavid Howells6-134/+85
Absorb the rxkad security module into the af_rxrpc module so that there's only one module file. This avoids a circular dependency whereby rxkad pins af_rxrpc and cached connections pin rxkad but can't be manually evicted (they will expire eventually and cease pinning). With this change, af_rxrpc can just be unloaded, despite having cached connections. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Don't assume transport address family and size when using itDavid Howells2-4/+4
Don't assume transport address family and size when using the peer address to send a packet. Instead, use the start of the transport address rather than any particular element of the union and use the transport address length noted inside the sockaddr_rxrpc struct. This will be necessary when IPv6 support is introduced. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Don't pass gfp around in incoming call handling functionsDavid Howells4-12/+9
Don't pass gfp around in incoming call handling functions, but rather hard code it at the points where we actually need it since the value comes from within the rxrpc driver and is always the same. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Differentiate local and remote abort codes in structsDavid Howells10-25/+50
In the rxrpc_connection and rxrpc_call structs, there's one field to hold the abort code, no matter whether that value was generated locally to be sent or was received from the peer via an abort packet. Split the abort code fields in two for cleanliness sake and add an error field to hold the Linux error number to the rxrpc_call struct too (sometimes this is generated in a context where we can't return it to userspace directly). Furthermore, add a skb mark to indicate a packet that caused a local abort to be generated so that recvmsg() can pick up the correct abort code. A future addition will need to be to indicate to userspace the difference between aborts via a control message. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Static arrays of strings should be const char *const[]David Howells3-4/+2
Static arrays of strings should be const char *const[]. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Move some miscellaneous bits out into their own fileDavid Howells6-84/+107
Move some miscellaneous bits out into their own file to make it easier to split the call handling. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: Disable a debugging statement that has been left enabled.David Howells1-1/+1
Disable a debugging statement that has been left enabled Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11afs: Wait for outstanding async calls before closing rxrpc socketDavid Howells1-3/+13
The afs filesystem needs to wait for any outstanding asynchronous calls (such as FS.GiveUpCallBacks cleaning up the callbacks lodged with a server) to complete before closing the AF_RXRPC socket when unloading the module. This may occur if the module is removed too quickly after unmounting all filesystems. This will produce an error report that looks like: AFS: Assertion failed 1 == 0 is false 0x1 == 0x0 is false ------------[ cut here ]------------ kernel BUG at ../fs/afs/rxrpc.c:135! ... RIP: 0010:[<ffffffffa004111c>] afs_close_socket+0xec/0x107 [kafs] ... Call Trace: [<ffffffffa004a160>] afs_exit+0x1f/0x57 [kafs] [<ffffffff810c30a0>] SyS_delete_module+0xec/0x17d [<ffffffff81610417>] entry_SYSCALL_64_fastpath+0x12/0x6b Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11Merge branch 'udp-pull'David S. Miller4-9/+7
Willem de Bruijn says: ==================== net: fix udp pull header breakage Commit e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") modified udp receive processing to pull headers before enqueue and to not expect them on dequeue. The patch missed protocols on top of udp with in-kernel implementations that have their own skb_recv_datagram calls and dequeue logic. Modify these datapaths to also no longer expect a udp header at skb->data. Sunrpc and rxrpc are the only two protocols that call this function and contain references to udphr (some others, like tipc, are based on encap_rcv, which acts before enqueue, before the the header pull). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11rxrpc: do not pull udp headers on receiveWillem de Bruijn1-2/+2
Commit e6afc8ace6dd modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Rxrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11sunrpc: do not pull udp headers on receiveWillem de Bruijn3-7/+5
Commit e6afc8ace6dd modified the udp receive path by pulling the udp header before queuing an skbuff onto the receive queue. Sunrpc also calls skb_recv_datagram to dequeue an skb from a udp socket. Modify this receive path to also no longer expect udp headers. Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Reported-by: Franklin S Cooper Jr. <fcooper@ti.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Tested-by: Thierry Reding <treding@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11net: ipv4: Consider failed nexthops in multipath routesDavid Ahern4-5/+53
Multipath route lookups should consider knowledge about next hops and not select a hop that is known to be failed. Example: [h2] [h3] 15.0.0.5 | | 3| 3| [SP1] [SP2]--+ 1 2 1 2 | | /-------------+ | | \ / | | X | | / \ | | / \---------------\ | 1 2 1 2 12.0.0.2 [TOR1] 3-----------------3 [TOR2] 12.0.0.3 4 4 \ / \ / \ / -------| |-----/ 1 2 [TOR3] 3| | [h1] 12.0.0.1 host h1 with IP 12.0.0.1 has 2 paths to host h3 at 15.0.0.5: root@h1:~# ip ro ls ... 12.0.0.0/24 dev swp1 proto kernel scope link src 12.0.0.1 15.0.0.0/16 nexthop via 12.0.0.2 dev swp1 weight 1 nexthop via 12.0.0.3 dev swp1 weight 1 ... If the link between tor3 and tor1 is down and the link between tor1 and tor2 then tor1 is effectively cut-off from h1. Yet the route lookups in h1 are alternating between the 2 routes: ping 15.0.0.5 gets one and ssh 15.0.0.5 gets the other. Connections that attempt to use the 12.0.0.2 nexthop fail since that neighbor is not reachable: root@h1:~# ip neigh show ... 12.0.0.3 dev swp1 lladdr 00:02:00:00:00:1b REACHABLE 12.0.0.2 dev swp1 FAILED ... The failed path can be avoided by considering known neighbor information when selecting next hops. If the neighbor lookup fails we have no knowledge about the nexthop, so give it a shot. If there is an entry then only select the nexthop if the state is sane. This is similar to what fib_detect_death does. To maintain backward compatibility use of the neighbor information is based on a new sysctl, fib_multipath_use_neigh. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Reviewed-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11Merge branch 'cpsw-host_port'David S. Miller1-31/+21
Grygorii Strashko says: ==================== drivers: net: cpsw: fix ale calls and drop host_port field from cpsw_priv This clean up series intended to: - fix port_mask parameters in ale calls and drop unnecessary shifts - drop host_port field from struct cpsw_priv ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-11drivers: net: cpsw: drop host_port field from struct cpsw_privGrygorii Strashko1-18/+12
The host_port field is constantly assigned to 0 and this value has never changed (since time when cpsw driver was introduced. More over, if this field will be assigned to non 0 value it will break current driver functionality. Hence, there are no reasons to continue maintaining this host_port field and it can be removed, and the HOST_PORT_NUM and ALE_PORT_HOST defines can be used instead. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>