aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-01-30irda: use msecs_to_jiffies for conversionsNicholas Mc Guire1-3/+5
This is only an API consolidation and should make things more readable it replaces var * HZ / 1000 constructs by msecs_to_jiffies(var). Signed-off-by: Nicholas Mc Guire <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-30net: Fix vlan_get_protocol for stacked vlanToshiaki Makita1-30/+1
vlan_get_protocol() could not get network protocol if a skb has a 802.1ad vlan tag or multiple vlans, which caused incorrect checksum calculation in several drivers. Fix vlan_get_protocol() to retrieve network protocol instead of incorrect vlan protocol. As the logic is the same as skb_network_protocol(), create a common helper function __vlan_get_protocol() and call it from existing functions. Signed-off-by: Toshiaki Makita <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-30net: mark some potential candidates __read_mostlyDaniel Borkmann4-5/+5
They are all either written once or extremly rarely (e.g. from init code), so we can move them to the .data..read_mostly section. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-30net: sctp: fix passing wrong parameter header to param_type2af in ↵Saran Maruti Ramanara1-1/+1
sctp_process_param When making use of RFC5061, section 4.2.4. for setting the primary IP address, we're passing a wrong parameter header to param_type2af(), resulting always in NULL being returned. At this point, param.p points to a sctp_addip_param struct, containing a sctp_paramhdr (type = 0xc004, length = var), and crr_id as a correlation id. Followed by that, as also presented in RFC5061 section 4.2.4., comes the actual sctp_addr_param, which also contains a sctp_paramhdr, but this time with the correct type SCTP_PARAM_IPV{4,6}_ADDRESS that param_type2af() can make use of. Since we already hold a pointer to addr_param from previous line, just reuse it for param_type2af(). Fixes: d6de3097592b ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT") Signed-off-by: Saran Maruti Ramanara <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Vlad Yasevich <[email protected]> Acked-by: Neil Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-30netlink: fix wrong subscription bitmask to group mapping inPablo Neira1-2/+2
The subscription bitmask passed via struct sockaddr_nl is converted to the group number when calling the netlink_bind() and netlink_unbind() callbacks. The conversion is however incorrect since bitmask (1 << 0) needs to be mapped to group number 1. Note that you cannot specify the group number 0 (usually known as _NONE) from setsockopt() using NETLINK_ADD_MEMBERSHIP since this is rejected through -EINVAL. This problem became noticeable since 97840cb ("netfilter: nfnetlink: fix insufficient validation in nfnetlink_bind") when binding to bitmask (1 << 0) in ctnetlink. Reported-by: Andre Tomt <[email protected]> Reported-by: Ivan Delalande <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-30netfilter: nft_lookup: add missing attribute validation for NFTA_LOOKUP_SET_IDPatrick McHardy1-0/+1
Signed-off-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-01-30netfilter: nft_compat: add ebtables supportArturo Borrero1-6/+57
This patch extends nft_compat to support ebtables extensions. ebtables verdict codes are translated to the ones used by the nf_tables engine, so we can properly use ebtables target extensions from nft_compat. This patch extends previous work by Giuseppe Longo <[email protected]>. Signed-off-by: Arturo Borrero Gonzalez <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-01-30netfilter: nf_tables: fix leaks in error path of nf_tables_newchain()Pablo Neira Ayuso1-2/+6
Release statistics and module refcount on memory allocation problems. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-01-30xprtrdma: Update the GFP flags used in xprt_rdma_allocate()Chuck Lever1-2/+5
Reflect the more conservative approach used in the socket transport's version of this transport method. An RPC buffer allocation should avoid forcing not just FS activity, but any I/O. In particular, two recent changes missed updating xprtrdma: - Commit c6c8fe79a83e ("net, sunrpc: suppress allocation warning ...") - Commit a564b8f03986 ("nfs: enable swap on NFS") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Clean up after adding regbuf managementChuck Lever2-11/+2
rpcrdma_{de}register_internal() are used only in verbs.c now. MAX_RPCRDMAHDR is no longer used and can be removed. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Allocate zero pad separately from rpcrdma_bufferChuck Lever3-23/+13
Use the new rpcrdma_alloc_regbuf() API to shrink the amount of contiguous memory needed for a buffer pool by moving the zero pad buffer into a regbuf. This is for consistency with the other uses of internally registered memory. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Allocate RPC/RDMA receive buffer separately from struct rpcrdma_repChuck Lever3-23/+23
The rr_base field is currently the buffer where RPC replies land. An RPC/RDMA reply header lands in this buffer. In some cases an RPC reply header also lands in this buffer, just after the RPC/RDMA header. The inline threshold is an agreed-on size limit for RDMA SEND operations that pass from server and client. The sum of the RPC/RDMA reply header size and the RPC reply header size must be less than this threshold. The largest RDMA RECV that the client should have to handle is the size of the inline threshold. The receive buffer should thus be the size of the inline threshold, and not related to RPCRDMA_MAX_SEGS. RPC replies received via RDMA WRITE (long replies) are caught in rq_rcv_buf, which is the second half of the RPC send buffer. Ie, such replies are not involved in any way with rr_base. Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Allocate RPC/RDMA send buffer separately from struct rpcrdma_reqChuck Lever4-29/+19
The rl_base field is currently the buffer where each RPC/RDMA call header is built. The inline threshold is an agreed-on size limit to for RDMA SEND operations that pass between client and server. The sum of the RPC/RDMA header size and the RPC header size must be less than or equal to this threshold. Increasing the r/wsize maximum will require MAX_SEGS to grow significantly, but the inline threshold size won't change (both sides agree on it). The server's inline threshold doesn't change. Since an RPC/RDMA header can never be larger than the inline threshold, make all RPC/RDMA header buffers the size of the inline threshold. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Allocate RPC send buffer separately from struct rpcrdma_reqChuck Lever4-104/+78
Because internal memory registration is an expensive and synchronous operation, xprtrdma pre-registers send and receive buffers at mount time, and then re-uses them for each RPC. A "hardway" allocation is a memory allocation and registration that replaces a send buffer during the processing of an RPC. Hardway must be done if the RPC send buffer is too small to accommodate an RPC's call and reply headers. For xprtrdma, each RPC send buffer is currently part of struct rpcrdma_req so that xprt_rdma_free(), which is passed nothing but the address of an RPC send buffer, can find its matching struct rpcrdma_req and rpcrdma_rep quickly via container_of / offsetof. That means that hardway currently has to replace a whole rpcrmda_req when it replaces an RPC send buffer. This is often a fairly hefty chunk of contiguous memory due to the size of the rl_segments array and the fact that both the send and receive buffers are part of struct rpcrdma_req. Some obscure re-use of fields in rpcrdma_req is done so that xprt_rdma_free() can detect replaced rpcrdma_req structs, and restore the original. This commit breaks apart the RPC send buffer and struct rpcrdma_req so that increasing the size of the rl_segments array does not change the alignment of each RPC send buffer. (Increasing rl_segments is needed to bump up the maximum r/wsize for NFS/RDMA). This change opens up some interesting possibilities for improving the design of xprt_rdma_allocate(). xprt_rdma_allocate() is now the one place where RPC send buffers are allocated or re-allocated, and they are now always left in place by xprt_rdma_free(). A large re-allocation that includes both the rl_segments array and the RPC send buffer is no longer needed. Send buffer re-allocation becomes quite rare. Good send buffer alignment is guaranteed no matter what the size of the rl_segments array is. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Add struct rpcrdma_regbuf and helpersChuck Lever2-0/+98
There are several spots that allocate a buffer via kmalloc (usually contiguously with another data structure) and then register that buffer internally. I'd like to split the buffers out of these data structures to allow the data structures to scale. Start by adding functions that can kmalloc and register a buffer, and can manage/preserve the buffer's associated ib_sge and ib_mr fields. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Refactor rpcrdma_buffer_create() and rpcrdma_buffer_destroy()Chuck Lever1-53/+95
Move the details of how to create and destroy rpcrdma_req and rpcrdma_rep structures into helper functions. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Simplify synopsis of rpcrdma_buffer_create()Chuck Lever3-7/+7
Clean up: There is one call site for rpcrdma_buffer_create(). All of the arguments there are fields of an rpcrdma_xprt. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Take struct ib_qp_attr and ib_qp_init_attr off the stackChuck Lever2-7/+10
Reduce stack footprint of the connection upcall handler function. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Take struct ib_device_attr off the stackChuck Lever2-24/+14
Device attributes are large, and are used in more than one place. Stash a copy in dynamically allocated memory. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Free the pd if ib_query_qp() failsChuck Lever1-3/+7
If ib_query_qp() fails or the memory registration mode isn't supported, don't leak the PD. An orphaned IB/core resource will cause IB module removal to hang. Fixes: bd7ed1d13304 ("RPC/RDMA: check selected memory registration ...") Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Remove rpcrdma_ep::rep_func and ::rep_xprtChuck Lever4-8/+6
Clean up: The rep_func field always refers to rpcrdma_conn_func(). rep_func should have been removed by commit b45ccfd25d50 ("xprtrdma: Remove MEMWINDOWS registration modes"). Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Move credit update to RPC reply handlerChuck Lever3-16/+10
Reduce work in the receive CQ handler, which can be run at hardware interrupt level, by moving the RPC/RDMA credit update logic to the RPC reply handler. This has some additional benefits: More header sanity checking is done before trusting the incoming credit value, and the receive CQ handler no longer touches the RPC/RDMA header (the CPU stalls while waiting for the header contents to be brought into the cache). This further extends work begun by commit e7ce710a8802 ("xprtrdma: Avoid deadlock when credit window is reset"). Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Remove rl_mr field, and the mr_chunk unionChuck Lever2-17/+13
Clean up: Since commit 0ac531c18323 ("xprtrdma: Remove REGISTER memory registration mode"), the rl_mr pointer is no longer used anywhere. After removal, there's only a single member of the mr_chunk union, so mr_chunk can be removed as well, in favor of a single pointer field. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Remove rpcrdma_ep::rep_iaChuck Lever2-2/+0
Clean up: This field is not used. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Rename "xprt" and "rdma_connect" fields in struct rpcrdma_xprtChuck Lever2-12/+13
Clean up: Use consistent field names in struct rpcrdma_xprt. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Clean up hdrlenChuck Lever1-5/+7
Clean up: Replace naked integers with a documenting macro. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Display XIDs in host byte orderChuck Lever1-3/+5
xprtsock.c and the backchannel code display XIDs in host byte order. Follow suit in xprtrdma. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: Modernize htonl and ntohlChuck Lever1-22/+26
Clean up: Replace htonl and ntohl with the be32 equivalents. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30xprtrdma: human-readable completion statusChuck Lever1-13/+57
Make it easier to grep the system log for specific error conditions. The wc.opcode field is not included because opcode numbers are sparse, and because wc.opcode is not necessarily valid when completion reports an error. Signed-off-by: Chuck Lever <[email protected]> Reviewed-by: Steve Wise <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-01-30ipvs: rerouting to local clients is not needed anymoreJulian Anastasov1-11/+22
commit f5a41847acc5 ("ipvs: move ip_route_me_harder for ICMP") from 2.6.37 introduced ip_route_me_harder() call for responses to local clients, so that we can provide valid rt_src after SNAT. It was used by TCP to provide valid daddr for ip_send_reply(). After commit 0a5ebb8000c5 ("ipv4: Pass explicit daddr arg to ip_send_reply()." from 3.0 this rerouting is not needed anymore and should be avoided, especially in LOCAL_IN. Fixes 3.12.33 crash in xfrm reported by Florian Wiessner: "3.12.33 - BUG xfrm_selector_match+0x25/0x2f6" Reported-by: Smart Weblications GmbH - Florian Wiessner <[email protected]> Tested-by: Smart Weblications GmbH - Florian Wiessner <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-01-29ipv4: Don't increase PMTU with Datagram Too Big message.Li Wei1-0/+3
RFC 1191 said, "a host MUST not increase its estimate of the Path MTU in response to the contents of a Datagram Too Big message." Signed-off-by: Li Wei <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29dev: add per net_device packet type chainsSalam Noureddine1-48/+84
When many pf_packet listeners are created on a lot of interfaces the current implementation using global packet type lists scales poorly. This patch adds per net_device packet type lists to fix this problem. The patch was originally written by Eric Biederman for linux-2.6.29. Tested on linux-3.16. Signed-off-by: "Eric W. Biederman" <[email protected]> Signed-off-by: Salam Noureddine <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29rtnetlink: pass link_net to the newlink handlerNicolas Dichtel1-1/+1
When IFLA_LINK_NETNSID is used, the netdevice should be built in this link netns and moved at the end to another netns (pointed by the socket netns or IFLA_NET_NS_[PID|FD]). Existing user of the newlink handler will use the netns argument (src_net) to find a link netdevice or to check some other information into the link netns. For example, to find a netdevice, two information are required: an ifindex (usually from IFLA_LINK) and a netns (this link netns). Note: when using IFLA_LINK_NETNSID and IFLA_NET_NS_[PID|FD], a user may create a netdevice that stands in netnsX and with its link part in netnsY, by sending a rtnl message from netnsZ. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29caif: remove wrong dev_net_set() callNicolas Dichtel1-1/+0
src_net points to the netns where the netlink message has been received. This netns may be different from the netns where the interface is created (because the user may add IFLA_NET_NS_[PID|FD]). In this case, src_net is the link netns. It seems wrong to override the netns in the newlink() handler because if it was not already src_net, it means that the user explicitly asks to create the netdevice in another netns. CC: Sjur Brændeland <[email protected]> CC: Dmitry Tarnyagin <[email protected]> Fixes: 8391c4aab1aa ("caif: Bugfixes in CAIF netdevice for close and flow control") Fixes: c41254006377 ("caif-hsi: Add rtnl support") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29Bluetooth: Fix sending Read Remote Extended Features commandSzymon Janc1-1/+2
This command should only be used if remote device reports that it supports extended features. Otherwise command will fail and connection will be dropped. Some devices support SSP but don't support extended features so current check for SSP support is not enought. Instead of checking for SSP support just check if both ends support Extended Feature. < HCI Command: Create Connection (0x01|0x0005) plen 13 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Packet type: 0xcc18 DM1 may be used DH1 may be used DM3 may be used DH3 may be used DM5 may be used DH5 may be used Page scan repetition mode: R1 (0x01) Page scan mode: Mandatory (0x00) Clock offset: 0x94c8 Role switch: Allow slave (0x01) > HCI Event: Command Status (0x0f) plen 4 Create Connection (0x01|0x0005) ncmd 1 Status: Success (0x00) > HCI Event: Connect Complete (0x03) plen 11 Status: Success (0x00) Handle: 5 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Link type: ACL (0x01) Encryption: Disabled (0x00) < HCI Command: Read Remote Supported Features (0x01|0x001b) plen 2 Handle: 5 > HCI Event: Command Status (0x0f) plen 4 Read Remote Supported Features (0x01|0x001b) ncmd 1 Status: Success (0x00) > HCI Event: Page Scan Repetition Mode Change (0x20) plen 7 Address: D0:9C:30:00:19:6F (Foster Electric Company, Limited) Page scan repetition mode: R1 (0x01) > HCI Event: Read Remote Supported Features (0x0b) plen 11 Status: Success (0x00) Handle: 5 Features: 0xff 0xff 0x8f 0xfe 0xdb 0xff 0x5b 0x07 3 slot packets 5 slot packets Encryption Slot offset Timing accuracy Role switch Hold mode Sniff mode Park state Power control requests Channel quality driven data rate (CQDDR) SCO link HV2 packets HV3 packets u-law log synchronous data A-law log synchronous data CVSD synchronous data Paging parameter negotiation Power control Transparent synchronous data Broadcast Encryption Enhanced Data Rate ACL 2 Mbps mode Enhanced Data Rate ACL 3 Mbps mode Enhanced inquiry scan Interlaced inquiry scan Interlaced page scan RSSI with inquiry results Extended SCO link (EV3 packets) EV4 packets EV5 packets AFH capable slave AFH classification slave LE Supported (Controller) 3-slot Enhanced Data Rate ACL packets 5-slot Enhanced Data Rate ACL packets Sniff subrating Pause encryption AFH capable master AFH classification master Enhanced Data Rate eSCO 2 Mbps mode Enhanced Data Rate eSCO 3 Mbps mode 3-slot Enhanced Data Rate eSCO packets Extended Inquiry Response Simultaneous LE and BR/EDR (Controller) Secure Simple Pairing Encapsulated PDU Non-flushable Packet Boundary Flag Link Supervision Timeout Changed Event Inquiry TX Power Level Enhanced Power Control < HCI Command: Read Remote Extended Features (0x01|0x001c) plen 3 Handle: 5 Page: 1 > HCI Event: Command Status (0x0f) plen 4 Read Remote Extended Features (0x01|0x001c) ncmd 1 Status: Command Disallowed (0x0c) < HCI Command: Read Clock Offset (0x01|0x001f) plen 2 Handle: 5 > HCI Event: Command Status (0x0f) plen 4 Read Clock Offset (0x01|0x001f) ncmd 1 Status: Success (0x00) < HCI Command: Disconnect (0x01|0x0006) plen 3 Handle: 5 Reason: Remote User Terminated Connection (0x13) Signed-off-by: Szymon Janc <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-01-28tcp: ipv4: initialize unicast_sock sk_pacing_rateEric Dumazet1-0/+1
When I added sk_pacing_rate field, I forgot to initialize its value in the per cpu unicast_sock used in ip_send_unicast_reply() This means that for sch_fq users, RST packets, or ACK packets sent on behalf of TIME_WAIT sockets might be sent to slowly or even dropped once we reach the per flow limit. Signed-off-by: Eric Dumazet <[email protected]> Fixes: 95bd09eb2750 ("tcp: TSO packets automatic sizing") Signed-off-by: David S. Miller <[email protected]>
2015-01-28pkt_sched: fq: remove useless TIME_WAIT checkEric Dumazet1-2/+2
TIME_WAIT sockets are not owning any skb. ip_send_unicast_reply() and tcp_v6_send_response() both use regular sockets. We can safely remove a test in sch_fq and save one cache line miss, as sk_state is far away from sk_pacing_rate. Tested at Google for about one year. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28act_connmark: fix dependencies betterArnd Bergmann1-1/+1
NET_ACT_CONNMARK fails to build if NF_CONNTRACK_MARK is disabled, and d7924450e14ea4 ("act_connmark: Add missing dependency on NF_CONNTRACK_MARK") fixed that case, but missed the cased where NF_CONNTRACK is a loadable module. This adds the second dependency to ensure that NET_ACT_CONNMARK can only be built-in if NF_CONNTRACK is also part of the kernel rather than a loadable module. Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28net: remove sock_iocbChristoph Hellwig3-103/+43
The sock_iocb structure is allocate on stack for each read/write-like operation on sockets, and contains various fields of which only the embedded msghdr and sometimes a pointer to the scm_cookie is ever used. Get rid of the sock_iocb and put a msghdr directly on the stack and pass the scm_cookie explicitly to netlink_mmap_sendmsg. Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28openvswitch: Add support for checksums on UDP tunnels.Jesse Gross3-6/+9
Currently, it isn't possible to request checksums on the outer UDP header of tunnels - the TUNNEL_CSUM flag is ignored. This adds support for requesting that UDP checksums be computed on transmit and properly reported if they are present on receive. Signed-off-by: Jesse Gross <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28Merge tag 'nfc-next-3.20-1' of ↵David S. Miller5-59/+84
git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next NFC: 3.20 first pull request This is the first NFC pull request for 3.20. With this one we have: - Secure element support for the ST Micro st21nfca driver. This depends on a few HCI internal changes in order for example to support more than one secure element per controller. - ACPI support for NXP's pn544 HCI driver. This controller is found on many x86 SoCs and is typically enumerated on the ACPI bus there. - A few st21nfca and st21nfcb fixes. Signed-off-by: David S. Miller <[email protected]>
2015-01-28bridge: dont send notification when skb->len == 0 in rtnl_bridge_notifyRoopa Prabhu1-1/+5
Reported in: https://bugzilla.kernel.org/show_bug.cgi?id=92081 This patch avoids calling rtnl_notify if the device ndo_bridge_getlink handler does not return any bytes in the skb. Alternately, the skb->len check can be moved inside rtnl_notify. For the bridge vlan case described in 92081, there is also a fix needed in bridge driver to generate a proper notification. Will fix that in subsequent patch. v2: rebase patch on net tree Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix timing issue in CUBIC slope calculationNeal Cardwell1-0/+8
This patch fixes a bug in CUBIC that causes cwnd to increase slightly too slowly when multiple ACKs arrive in the same jiffy. If cwnd is supposed to increase at a rate of more than once per jiffy, then CUBIC was sometimes too slow. Because the bic_target is calculated for a future point in time, calculated with time in jiffies, the cwnd can increase over the course of the jiffy while the bic_target calculated as the proper CUBIC cwnd at time t=tcp_time_stamp+rtt does not increase, because tcp_time_stamp only increases on jiffy tick boundaries. So since the cnt is set to: ca->cnt = cwnd / (bic_target - cwnd); as cwnd increases but bic_target does not increase due to jiffy granularity, the cnt becomes too large, causing cwnd to increase too slowly. For example: - suppose at the beginning of a jiffy, cwnd=40, bic_target=44 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 40 / (44 - 40) = 40/4 = 10 - suppose we get 10 acks, each for 1 segment, so tcp_cong_avoid_ai() increases cwnd to 41 - so CUBIC sets: ca->cnt = cwnd / (bic_target - cwnd) = 41 / (44 - 41) = 41 / 3 = 13 So now CUBIC will wait for 13 packets to be ACKed before increasing cwnd to 42, insted of 10 as it should. The fix is to avoid adjusting the slope (determined by ca->cnt) multiple times within a jiffy, and instead skip to compute the Reno cwnd, the "TCP friendliness" code path. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix stretch ACK bugs in CUBICNeal Cardwell1-22/+9
Change CUBIC to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, because we are now precisely accounting for stretch ACKs, including delayed ACKs, we can now remove the delayed ACK tracking and estimation code that tracked recent delayed ACK behavior in ca->delayed_ack. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix stretch ACK bugs in RenoNeal Cardwell1-4/+6
Change Reno to properly handle stretch ACKs in additive increase mode by passing in the count of ACKed packets to tcp_cong_avoid_ai(). In addition, if snd_cwnd crosses snd_ssthresh during slow start processing, and we then exit slow start mode, we need to carry over any remaining "credit" for packets ACKed and apply that to additive increase by passing this remaining "acked" count to tcp_cong_avoid_ai(). Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: fix the timid additive increase on stretch ACKsNeal Cardwell1-6/+9
tcp_cong_avoid_ai() was too timid (snd_cwnd increased too slowly) on "stretch ACKs" -- cases where the receiver ACKed more than 1 packet in a single ACK. For example, suppose w is 10 and we get a stretch ACK for 20 packets, so acked is 20. We ought to increase snd_cwnd by 2 (since acked/w = 20/10 = 2), but instead we were only increasing cwnd by 1. This patch fixes that behavior. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-28tcp: stretch ACK fixes prepNeal Cardwell6-9/+13
LRO, GRO, delayed ACKs, and middleboxes can cause "stretch ACKs" that cover more than the RFC-specified maximum of 2 packets. These stretch ACKs can cause serious performance shortfalls in common congestion control algorithms that were designed and tuned years ago with receiver hosts that were not using LRO or GRO, and were instead politely ACKing every other packet. This patch series fixes Reno and CUBIC to handle stretch ACKs. This patch prepares for the upcoming stretch ACK bug fix patches. It adds an "acked" parameter to tcp_cong_avoid_ai() to allow for future fixes to tcp_cong_avoid_ai() to correctly handle stretch ACKs, and changes all congestion control algorithms to pass in 1 for the ACKed count. It also changes tcp_slow_start() to return the number of packet ACK "credits" that were not processed in slow start mode, and can be processed by the congestion control module in additive increase mode. In future patches we will fix tcp_cong_avoid_ai() to handle stretch ACKs, and fix Reno and CUBIC handling of stretch ACKs in slow start and additive increase mode. Reported-by: Eyal Perry <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-29Bluetooth: Move smp_unregister() into hci_dev_do_close() functionMarcel Holtmann1-6/+2
The smp_unregister() function needs to be called every time the controller is powered down. There are multiple entry points when this can happen. One is "hciconfig hci0 reset" which will throw a WARN_ON when LE support has been enabled. [ 78.564620] WARNING: CPU: 0 PID: 148 at net/bluetooth/smp.c:3075 smp_register+0xf1/0x170() [ 78.564622] Modules linked in: [ 78.564628] CPU: 0 PID: 148 Comm: kworker/u3:1 Not tainted 3.19.0-rc4-devel+ #404 [ 78.564629] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS [ 78.564635] Workqueue: hci0 hci_rx_work [ 78.564638] ffffffff81b4a7a2 ffff88001cb2fb38 ffffffff8161d881 0000000080000000 [ 78.564642] 0000000000000000 ffff88001cb2fb78 ffffffff8103b870 696e55206e6f6f6d [ 78.564645] ffff88001d965000 0000000000000000 0000000000000000 ffff88001d965000 [ 78.564648] Call Trace: [ 78.564655] [<ffffffff8161d881>] dump_stack+0x4f/0x7b [ 78.564662] [<ffffffff8103b870>] warn_slowpath_common+0x80/0xc0 [ 78.564667] [<ffffffff81544b00>] ? add_uuid+0x1f0/0x1f0 [ 78.564671] [<ffffffff8103b955>] warn_slowpath_null+0x15/0x20 [ 78.564674] [<ffffffff81562d81>] smp_register+0xf1/0x170 [ 78.564680] [<ffffffff81081236>] ? lock_timer_base.isra.30+0x26/0x50 [ 78.564683] [<ffffffff81544bf0>] powered_complete+0xf0/0x120 [ 78.564688] [<ffffffff8152e622>] hci_req_cmd_complete+0x82/0x260 [ 78.564692] [<ffffffff8153554f>] hci_cmd_complete_evt+0x6cf/0x2e20 [ 78.564697] [<ffffffff81623e43>] ? _raw_spin_unlock_irqrestore+0x13/0x30 [ 78.564701] [<ffffffff8106b0af>] ? __wake_up_sync_key+0x4f/0x60 [ 78.564705] [<ffffffff8153a2ab>] hci_event_packet+0xbcb/0x2e70 [ 78.564709] [<ffffffff814094d3>] ? skb_release_all+0x23/0x30 [ 78.564711] [<ffffffff81409529>] ? kfree_skb+0x29/0x40 [ 78.564715] [<ffffffff815296c8>] hci_rx_work+0x1c8/0x3f0 [ 78.564719] [<ffffffff8105bd91>] ? get_parent_ip+0x11/0x50 [ 78.564722] [<ffffffff8105be25>] ? preempt_count_add+0x55/0xb0 [ 78.564727] [<ffffffff8104f65f>] process_one_work+0x12f/0x360 [ 78.564731] [<ffffffff8104ff9b>] worker_thread+0x6b/0x4b0 [ 78.564735] [<ffffffff8104ff30>] ? cancel_delayed_work_sync+0x10/0x10 [ 78.564738] [<ffffffff810542fa>] kthread+0xea/0x100 [ 78.564742] [<ffffffff81620000>] ? __schedule+0x3e0/0x980 [ 78.564745] [<ffffffff81054210>] ? kthread_create_on_node+0x180/0x180 [ 78.564749] [<ffffffff816246ec>] ret_from_fork+0x7c/0xb0 [ 78.564752] [<ffffffff81054210>] ? kthread_create_on_node+0x180/0x180 [ 78.564755] ---[ end trace 8b0d943af76d3736 ]--- This warning is not critical and has only been placed in the code to actually catch this exact situation. To avoid triggering it move the smp_unregister() into hci_dev_do_close() which will now also take care of remove the SMP channel. It is safe to call this function since it only remove the channel if it has been previously registered. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-28Bluetooth: Perform a power cycle when receiving hardware error eventMarcel Holtmann2-1/+24
When receiving a HCI Hardware Error event, the controller should be assumed to be non-functional until issuing a HCI Reset command. The Bluetooth hardware errors are vendor specific and so add a new hdev->hw_error callback that drivers can provide to run extra code to handle the hardware error. After completing the vendor specific error handling perform a full reset of the Bluetooth stack by closing and re-opening the transport. Based-on-patch-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-28Bluetooth: Introduce hci_dev_do_reset helper functionMarcel Holtmann1-23/+34
Split the hci_dev_reset ioctl handling into using hci_dev_do_reset helper function. Similar to what has been done with hci_dev_do_open and hci_dev_do_close. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>