aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2016-03-14openvswitch: Handle NF_REPEAT in conntrack action.Jarno Rajahalme1-2/+8
Repeat the nf_conntrack_in() call when it returns NF_REPEAT. This avoids dropping a SYN packet re-opening an existing TCP connection. Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14openvswitch: Find existing conntrack entry after upcall.Jarno Rajahalme1-13/+90
Add a new function ovs_ct_find_existing() to find an existing conntrack entry for which this packet was already applied to. This is only to be called when there is evidence that the packet was already tracked and committed, but we lost the ct reference due to an userspace upcall. ovs_ct_find_existing() is called from skb_nfct_cached(), which can now hide the fact that the ct reference may have been lost due to an upcall. This allows ovs_ct_commit() to be simplified. This patch is needed by later "openvswitch: Interface with NAT" patch, as we need to be able to pass the packet through NAT using the original ct reference also after the reference is lost after an upcall. Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14openvswitch: Update the CT state key only after nf_conntrack_in().Jarno Rajahalme1-3/+4
Only a successful nf_conntrack_in() call can effect a connection state change, so it suffices to update the key only after the nf_conntrack_in() returns. This change is needed for the later NAT patches. Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14openvswitch: Add commentary to conntrack.cJarno Rajahalme1-1/+20
This makes the code easier to understand and the following patches more focused. Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14netfilter: Allow calling into nat helper without skb_dst.Jarno Rajahalme2-44/+16
NAT checksum recalculation code assumes existence of skb_dst, which becomes a problem for a later patch in the series ("openvswitch: Interface with NAT."). Simplify this by removing the check on skb_dst, as the checksum will be dealt with later in the stack. Suggested-by: Pravin Shelar <[email protected]> Signed-off-by: Jarno Rajahalme <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14netfilter: Remove IP_CT_NEW_REPLY definition.Jarno Rajahalme1-2/+0
Remove the definition of IP_CT_NEW_REPLY from the kernel as it does not make sense. This allows the definition of IP_CT_NUMBER to be simplified as well. Signed-off-by: Jarno Rajahalme <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-14net: dsa: refine netdev event notifierVivien Didelot1-24/+30
Rework the netdev event handler, similar to what the Mellanox Spectrum driver does, to easily welcome more events later (for example NETDEV_PRECHANGEUPPER) and use netdev helpers (such as netif_is_bridge_master). Signed-off-by: Vivien Didelot <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14net: dsa: make port_bridge_leave return voidVivien Didelot1-6/+3
netdev_upper_dev_unlink() which notifies NETDEV_CHANGEUPPER, returns void, as well as del_nbp(). So there's no advantage to catch an eventual error from the port_bridge_leave routine at the DSA level. Make this routine void for the DSA layer and its existing drivers. Signed-off-by: Vivien Didelot <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14net: dsa: rename port_*_bridge routinesVivien Didelot1-4/+4
Rename DSA port_join_bridge and port_leave_bridge routines to respectively port_bridge_join and port_bridge_leave in order to respect an implicit Port::Bridge namespace. Signed-off-by: Vivien Didelot <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14bridge: fix potential use-after-free when hook returns QUEUE or STOLEN verdictFlorian Westphal1-9/+7
Zefir Kurtisi reported kernel panic with an openwrt specific patch. However, it turns out that mainline has a similar bug waiting to happen. Once NF_HOOK() returns the skb is in undefined state and must not be used. Moreover, the okfn must consume the skb to support async processing (NF_QUEUE). Current okfn in this spot doesn't consume it and caller assumes that NF_HOOK return value tells us if skb was freed or not, but thats wrong. It "works" because no in-tree user registers a NFPROTO_BRIDGE hook at LOCAL_IN that returns STOLEN or NF_QUEUE verdicts. Once we add NF_QUEUE support for nftables bridge this will break -- NF_QUEUE holds the skb for async processing, caller will erronoulsy return RX_HANDLER_PASS and on reinject netfilter will access free'd skb. Fix this by pushing skb up the stack in the okfn instead. NB: It also seems dubious to use LOCAL_IN while bypassing PRE_ROUTING completely in this case but this is how its been forever so it seems preferable to not change this. Cc: Felix Fietkau <[email protected]> Cc: Zefir Kurtisi <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Tested-by: Zefir Kurtisi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14Merge branch 'for-upstream' of ↵David S. Miller6-18/+91
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2016-03-12 Here's the last bluetooth-next pull request for the 4.6 kernel. - New USB ID for AR3012 in btusb - New BCM2E55 ACPI ID - Buffer overflow fix for the Add Advertising command - Support for a new Bluetooth LE limited privacy mode - Fix for firmware activation in btmrvl_sdio - Cleanups to mac802154 & 6lowpan code Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-03-14phy: fixed: Fix removal of phys.Andrew Lunn1-3/+1
The fixed phys delete function simply removed the fixed phy from the internal linked list and freed the memory. It however did not unregister the associated phy device. This meant it was still possible to find the phy device on the mdio bus. Make fixed_phy_del() an internal function and add a fixed_phy_unregister() to unregisters the phy device and then uses fixed_phy_del() to free resources. Modify DSA to use this new API function, so we don't leak phys. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14dsa: dsa: Fix freeing of fixed-phys from user ports.Andrew Lunn1-3/+0
All ports types can have a fixed PHY associated with it. Remove the check which limits removal to only CPU and DSA ports. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14dsa: Destroy fixed link phys after the phy has been disconnectedAndrew Lunn1-12/+12
The phy is disconnected from the slave in dsa_slave_destroy(). Don't destroy fixed link phys until after this, since there can be fixed linked phys connected to ports. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14dsa: slave: Don't reference NULL pointer during phy_disconnectAndrew Lunn1-4/+8
When the phy is disconnected, the parent pointer to the netdev it was attached to is set to NULL. The code then tries to suspend the phy, but dsa_slave_fixed_link_update needs the parent pointer to determine which switch the phy is connected to. So it dereferenced a NULL pointer. Check for this condition. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14xprtrdma: Use new CQ API for RPC-over-RDMA client send CQsChuck Lever3-125/+91
Calling ib_poll_cq() to sort through WCs during a completion is a common pattern amongst RDMA consumers. Since commit 14d3a3b2498e ("IB: add a proper completion queue abstraction"), WC sorting can be handled by the IB core. By converting to this new API, xprtrdma is made a better neighbor to other RDMA consumers, as it allows the core to schedule the delivery of completions more fairly amongst all active consumers. Because each ib_cqe carries a pointer to a completion method, the core can now post its own operations on a consumer's QP, and handle the completions itself, without changes to the consumer. Send completions were previously handled entirely in the completion upcall handler (ie, deferring to a process context is unneeded). Thus IB_POLL_SOFTIRQ is a direct replacement for the current xprtrdma send code path. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Use an anonymous union in struct rpcrdma_mwChuck Lever3-36/+36
Clean up: Make code more readable. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Use new CQ API for RPC-over-RDMA client receive CQsChuck Lever2-58/+21
Calling ib_poll_cq() to sort through WCs during a completion is a common pattern amongst RDMA consumers. Since commit 14d3a3b2498e ("IB: add a proper completion queue abstraction"), WC sorting can be handled by the IB core. By converting to this new API, xprtrdma is made a better neighbor to other RDMA consumers, as it allows the core to schedule the delivery of completions more fairly amongst all active consumers. Because each ib_cqe carries a pointer to a completion method, the core can now post its own operations on a consumer's QP, and handle the completions itself, without changes to the consumer. xprtrdma's reply processing is already handled in a work queue, but there is some initial order-dependent processing that is done in the soft IRQ context before a work item is scheduled. IB_POLL_SOFTIRQ is a direct replacement for the current xprtrdma receive code path. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Serialize credit accounting againChuck Lever3-9/+28
Commit fe97b47cd623 ("xprtrdma: Use workqueue to process RPC/RDMA replies") replaced the reply tasklet with a workqueue that allows RPC replies to be processed in parallel. Thus the credit values in RPC-over-RDMA replies can be applied in a different order than in which the server sent them. To fix this, revert commit eba8ff660b2d ("xprtrdma: Move credit update to RPC reply handler"). Reverting is done by hand to accommodate code changes that have occurred since then. Fixes: fe97b47cd623 ("xprtrdma: Use workqueue to process . . .") Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Properly handle RDMA_ERROR repliesChuck Lever1-8/+43
These are shorter than RPCRDMA_HDRLEN_MIN, and they need to complete the waiting RPC. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Do not wait if ib_post_send() failsChuck Lever1-1/+5
If ib_post_send() in ro_unmap_sync() fails, the WRs have not been posted, no completions will fire, and wait_for_completion() will wait forever. Skip the wait in that case. To ensure the MRs are invalid, disconnect. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Segment head and tail XDR buffers on page boundariesChuck Lever1-10/+32
A single memory allocation is used for the pair of buffers wherein the RPC client builds an RPC call message and decodes its matching reply. These buffers are sized based on the maximum possible size of the RPC call and reply messages for the operation in progress. This means that as the call buffer increases in size, the start of the reply buffer is pushed farther into the memory allocation. RPC requests are growing in size. It used to be that both the call and reply buffers fit inside a single page. But these days, thanks to NFSv4 (and especially security labels in NFSv4.2) the maximum call and reply sizes are large. NFSv4.0 OPEN, for example, now requires a 6KB allocation for a pair of call and reply buffers, and NFSv4 LOOKUP is not far behind. As the maximum size of a call increases, the reply buffer is pushed far enough into the buffer's memory allocation that a page boundary can appear in the middle of it. When the maximum possible reply size is larger than the client's RDMA receive buffers (currently 1KB), the client has to register a Reply chunk for the server to RDMA Write the reply into. The logic in rpcrdma_convert_iovs() assumes that xdr_buf head and tail buffers would always be contained on a single page. It supplies just one segment for the head and one for the tail. FMR, for example, registers up to a page boundary (only a portion of the reply buffer in the OPEN case above). But without additional segments, it doesn't register the rest of the buffer. When the server tries to write the OPEN reply, the RDMA Write fails with a remote access error since the client registered only part of the Reply chunk. rpcrdma_convert_iovs() must split the XDR buffer into multiple segments, each of which are guaranteed not to contain a page boundary. That way fmr_op_map is given the proper number of segments to register the whole reply buffer. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Devesh Sharma <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Clean up dprintk format string containing a newlineChuck Lever1-4/+2
Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14xprtrdma: Clean up physical_op_map()Chuck Lever1-1/+0
physical_op_unmap{_sync} don't use mr_nsegs, so don't bother to set it in physical_op_map. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2016-03-14tcp: Add RFC4898 tcpEStatsPerfDataSegsOut/InMartin KaFai Lau6-4/+16
Per RFC4898, they count segments sent/received containing a positive length data segment (that includes retransmission segments carrying data). Unlike tcpi_segs_out/in, tcpi_data_segs_out/in excludes segments carrying no data (e.g. pure ack). The patch also updates the segs_in in tcp_fastopen_add_skb() so that segs_in >= data_segs_in property is kept. Together with retransmission data, tcpi_data_segs_out gives a better signal on the rxmit rate. v6: Rebase on the latest net-next v5: Eric pointed out that checking skb->len is still needed in tcp_fastopen_add_skb() because skb can carry a FIN without data. Hence, instead of open coding segs_in and data_segs_in, tcp_segs_in() helper is used. Comment is added to the fastopen case to explain why segs_in has to be reset and tcp_segs_in() has to be called before __skb_pull(). v4: Add comment to the changes in tcp_fastopen_add_skb() and also add remark on this case in the commit message. v3: Add const modifier to the skb parameter in tcp_segs_in() v2: Rework based on recent fix by Eric: commit a9d99ce28ed3 ("tcp: fix tcpi_segs_in after connection establishment") Signed-off-by: Martin KaFai Lau <[email protected]> Cc: Chris Rapier <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Marcelo Ricardo Leitner <[email protected]> Cc: Neal Cardwell <[email protected]> Cc: Yuchung Cheng <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14net: caif: fix misleading indentationArnd Bergmann1-1/+1
gcc points out code that is not indented the way it is interpreted: net/caif/cfpkt_skbuff.c: In function 'cfpkt_setlen': net/caif/cfpkt_skbuff.c:289:4: error: statement is indented as if it were guarded by... [-Werror=misleading-indentation] return cfpkt_getlen(pkt); ^~~~~~ net/caif/cfpkt_skbuff.c:286:3: note: ...this 'else' clause, but it is not else ^~~~ It is clear from the context that not returning here would be a bug, as we'd end up passing a negative length into a function that takes a u16 length, so it is not missing curly braces here, and I'm assuming that the indentation is the only part that's wrong about it. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14net: Fix use after free in the recvmmsg exit pathArnaldo Carvalho de Melo1-19/+19
The syzkaller fuzzer hit the following use-after-free: Call Trace: [<ffffffff8175ea0e>] __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:295 [<ffffffff851cc31a>] __sys_recvmmsg+0x6fa/0x7f0 net/socket.c:2261 [< inline >] SYSC_recvmmsg net/socket.c:2281 [<ffffffff851cc57f>] SyS_recvmmsg+0x16f/0x180 net/socket.c:2270 [<ffffffff86332bb6>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 And, as Dmitry rightly assessed, that is because we can drop the reference and then touch it when the underlying recvmsg calls return some packets and then hit an error, which will make recvmmsg to set sock->sk->sk_err, oops, fix it. Reported-and-Tested-by: Dmitry Vyukov <[email protected]> Cc: Alexander Potapenko <[email protected]> Cc: Eric Dumazet <[email protected]> Cc: Kostya Serebryany <[email protected]> Cc: Sasha Levin <[email protected]> Fixes: a2e2725541fa ("net: Introduce recvmmsg socket syscall") http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14tipc: make sure IPv6 header fits in skb headroomRichard Alpe1-1/+1
Expand headroom further in order to be able to fit the larger IPv6 header. Prior to this patch this caused a skb under panic for certain tipc packets when using IPv6 UDP bearer(s). Signed-off-by: Richard Alpe <[email protected]> Acked-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-14net: add a hardware buffer management helper APIGregory CLEMENT3-0/+91
This basic implementation allows to share code between driver using hardware buffer management. As the code is hardware agnostic, there is few helpers, most of the optimization brought by the an HW BM has to be done at driver level. Tested-by: Sebastian Careba <[email protected]> Signed-off-by: Gregory CLEMENT <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13GSO/UDP: Use skb->len instead of udph->len to determine length of original skbAlexander Duyck1-5/+10
It is possible for tunnels to end up generating IP or IPv6 datagrams that are larger than 64K and expecting to be segmented. As such we need to deal with length values greater than 64K. In order to accommodate this we need to update the code to work with a 32b length value instead of a 16b one. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned shortAlexander Duyck1-2/+1
This patch updates csum_ipv6_magic so that it correctly recognizes that protocol is a unsigned 8 bit value. This will allow us to better understand what limitations may or may not be present in how we handle the data. For example there are a number of places that call htonl on the protocol value. This is likely not necessary and can be replaced with a multiplication by ntohl(1) which will be converted to a shift by the compiler. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13ipv4: Don't do expensive useless work during inetdev destroy.David S. Miller3-2/+18
When an inetdev is destroyed, every address assigned to the interface is removed. And in this scenerio we do two pointless things which can be very expensive if the number of assigned interfaces is large: 1) Address promotion. We are deleting all addresses, so there is no point in doing this. 2) A full nf conntrack table purge for every address. We only need to do this once, as is already caught by the existing masq_dev_notifier so masq_inet_event() can skip this. Reported-by: Solar Designer <[email protected]> Signed-off-by: David S. Miller <[email protected]> Tested-by: Cyrill Gorcunov <[email protected]>
2016-03-13Merge tag 'nfc-next-4.6-1' of ↵David S. Miller2-2/+8
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next Samuel Ortiz says: ==================== NFC 4.6 pull request This is a very small one this time, with only 5 patches. There are a couple of big items that could not be merged/finished on time. We have: - 2 LLCP fixes for a race and a potential OOM. - 2 cleanups for the pn544 and microread drivers. - 1 Maintainer addition for the s3fwrn5 driver. ==================== Signed-off-by: David S. Miller <[email protected]>
2016-03-13net: socket: use pr_info_once to tip the obsolete usage of PF_PACKETliping.zhang1-6/+2
There is no need to use the static variable here, pr_info_once is more concise. Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13net: adjust napi_consume_skb to handle non-NAPI callersJesper Dangaard Brouer1-2/+2
Some drivers reuse/share code paths that free SKBs between NAPI and non-NAPI calls. Adjust napi_consume_skb to handle this use-case. Before, calls from netpoll (w/ IRQs disabled) was handled and indicated with a budget zero indication. Use the same zero indication to handle calls not originating from NAPI/softirq. Simply handled by using dev_consume_skb_any(). This adds an extra branch+call for the netpoll case (checking in_irq() + irqs_disabled()), but that is okay as this is a slowpath. Suggested-by: Alexander Duyck <[email protected]> Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13sctp: allow sctp_transmit_packet and others to use gfpMarcelo Ricardo Leitner7-66/+83
Currently sctp_sendmsg() triggers some calls that will allocate memory with GFP_ATOMIC even when not necessary. In the case of sctp_packet_transmit it will allocate a linear skb that will be used to construct the packet and this may cause sends to fail due to ENOMEM more often than anticipated specially with big MTUs. This patch thus allows it to inherit gfp flags from upper calls so that it can use GFP_KERNEL if it was triggered by a sctp_sendmsg call or similar. All others, like retransmits or flushes started from BH, are still allocated using GFP_ATOMIC. In netperf tests this didn't result in any performance drawbacks when memory is not too fragmented and made it trigger ENOMEM way less often. Signed-off-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13ovs: allow nl 'flow set' to use ufid without flow keySamuel Gauthier1-11/+17
When we want to change a flow using netlink, we have to identify it to be able to perform a lookup. Both the flow key and unique flow ID (ufid) are valid identifiers, but we always have to specify the flow key in the netlink message. When both attributes are there, the ufid is used. The flow key is used to validate the actions provided by the userland. This commit allows to use the ufid without having to provide the flow key, as it is already done in the netlink 'flow get' and 'flow del' path. The flow key remains mandatory when an action is provided. Signed-off-by: Samuel Gauthier <[email protected]> Reviewed-by: Simon Horman <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13netconf: add macro to represent all attributesZhang Shengju2-32/+44
This patch adds macro NETCONFA_ALL to represent all type of netconf attributes for IPv4 and IPv6. Signed-off-by: Zhang Shengju <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13sctp: fix the transports round robin issue when init is retransmittedXin Long2-2/+2
prior to this patch, at the beginning if we have two paths in one assoc, they may have the same params other than the last_time_heard, it will try the paths like this: 1st cycle try trans1 fail. then trans2 is selected.(cause it's last_time_heard is after trans1). 2nd cycle: try trans2 fail then trans2 is selected.(cause it's last_time_heard is after trans1). 3rd cycle: try trans2 fail then trans2 is selected.(cause it's last_time_heard is after trans1). .... trans1 will never have change to be selected, which is not what we expect. we should keeping round robin all the paths if they are just added at the beginning. So at first every tranport's last_time_heard should be initialized 0, so that we ensure they have the same value at the beginning, only by this, all the transports could get equal chance to be selected. Then for sctp_trans_elect_best, it should return the trans_next one when *trans == *trans_next, so that we can try next if it fails, but now it always return trans. so we can fix it by exchanging these two params when we calls sctp_trans_elect_tie(). Fixes: 4c47af4d5eb2 ('net: sctp: rework multihoming retransmission path selection to rfc4960') Signed-off-by: Xin Long <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13rxrpc: Replace all unsigned with unsigned intDavid Howells8-39/+39
Replace all "unsigned" types with "unsigned int" types. Reported-by: David Miller <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-13gro: Defer clearing of flush bit in tunnel pathsAlexander Duyck2-4/+2
This patch updates the GRO handlers for GRE, VXLAN, GENEVE, and FOU so that we do not clear the flush bit until after we have called the next level GRO handler. Previously this was being cleared before parsing through the list of frames, however this resulted in several paths where either the bit needed to be reset but wasn't as in the case of FOU, or cases where it was being set as in GENEVE. By just deferring the clearing of the bit until after the next level protocol has been parsed we can avoid any unnecessary bit twiddling and avoid bugs. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-12netfilter: x_tables: check for size overflowFlorian Westphal1-0/+3
Ben Hawkes says: integer overflow in xt_alloc_table_info, which on 32-bit systems can lead to small structure allocation and a copy_from_user based heap corruption. Reported-by: Ben Hawkes <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-11bpf: support flow label for bpf_skb_{set, get}_tunnel_keyDaniel Borkmann1-2/+12
This patch extends bpf_tunnel_key with a tunnel_label member, that maps to ip_tunnel_key's label so underlying backends like vxlan and geneve can propagate the label to udp_tunnel6_xmit_skb(), where it's being set in the IPv6 header. It allows for having 20 more bits to encode/decode flow related meta information programmatically. Tested with vxlan and geneve. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-11ip_tunnel: add support for setting flow label via collect metadataDaniel Borkmann2-4/+4
This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-11bridge: allow zero ageing timeStephen Hemminger1-3/+8
This fixes a regression in the bridge ageing time caused by: commit c62987bbd8a1 ("bridge: push bridge setting ageing_time down to switchdev") There are users of Linux bridge which use the feature that if ageing time is set to 0 it causes entries to never expire. See: https://www.linuxfoundation.org/collaborate/workgroups/networking/bridge For a pure software bridge, it is unnecessary for the code to have arbitrary restrictions on what values are allowable. Signed-off-by: Stephen Hemminger <[email protected]> Acked-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-11net/flower: Fix pointer castAmir Vadai1-6/+6
Cast pointer to unsigned long instead of u64, to fix compilation warning on 32 bit arch, spotted by 0day build. Fixes: 5b33f48 ("net/flower: Introduce hardware offload support") Signed-off-by: Amir Vadai <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-03-11Bluetooth: Fix potential buffer overflow with Add AdvertisingJohan Hedberg1-0/+4
The Add Advertising command handler does the appropriate checks for the AD and Scan Response data, however fails to take into account the general length of the mgmt command itself, which could lead to potential buffer overflows. This patch adds the necessary check that the mgmt command length is consistent with the given ad and scan_rsp lengths. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Cc: [email protected]
2016-03-11Bluetooth: Fix setting correct flags in ADJohan Hedberg1-1/+3
A recent change added MGMT_ADV_FLAG_DISCOV to the flags returned by get_adv_instance_flags(), however failed to take into account limited discoverable mode. This patch fixes the issue by setting the correct discoverability flag in the AD data. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2016-03-11netfilter: nft_compat: check match/targetinfo attr sizeFlorian Westphal1-0/+6
We copy according to ->target|matchsize, so check that the netlink attribute (which can include padding and might be larger) contains enough data. Reported-by: Julia Lawall <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-03-11Merge tag 'ipvs-fixes-for-v4.5' of ↵Pablo Neira Ayuso3-12/+35
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs Simon Horman says: ==================== please consider these IPVS fixes for v4.5 or if it is too late please consider them for v4.6. * Arnd Bergman has corrected an error whereby the SIP persistence engine may incorrectly access protocol fields * Julian Anastasov has corrected a problem reported by Jiri Bohac with the connection rescheduling mechanism added in 3.10 when new SYNs in connection to dead real server can be redirected to another real server. * Marco Angaroni resolved a problem in the SIP persistence engine whereby the Call-ID could not be found if it was at the beginning of a SIP message. ==================== Signed-off-by: Pablo Neira Ayuso <[email protected]>