path: root/net
Age | Commit message | Author | Files | Lines
2016-04-28tcp: Handle eor bit when fragmenting a skbMartin KaFai Lau1-0/+9
When fragmenting a skb, the next_skb should carry the eor from prev_skb. The eor of prev_skb should also be reset.

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 sendto(4, ..., 15330, MSG_EOR, ..., ...) = 15330
0.200 sendto(4, ..., 730, 0, ..., ...) = 730

0.200 > . 1:7301(7300) ack 1
0.200 > . 7301:14601(7300) ack 1
0.300 < . 1:1(0) ack 14601 win 257
0.300 > P. 14601:15331(730) ack 1
0.300 > P. 15331:16061(730) ack 1
0.400 < . 1:1(0) ack 16061 win 257
0.400 close(4) = 0
0.400 > F. 16061:16061(0) ack 1
0.400 < F. 1:1(0) ack 16062 win 257
0.400 > . 16062:16062(0) ack 2

Signed-off-by: Martin KaFai Lau <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Neal Cardwell <[email protected]>
Cc: Soheil Hassas Yeganeh <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
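The fragmenting rule in this commit can be illustrated with a toy model. This is only a sketch in Python, not the kernel code: the `Skb` class and the `fragment` helper are hypothetical names invented for the illustration, and an skb is reduced to a sequence range plus the eor bit.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for a TCP skb: a sequence range plus the eor bit."""
    seq: int
    end_seq: int
    eor: bool = False


def fragment(prev: Skb, length: int) -> Skb:
    """Split `prev` after `length` bytes and return the new tail skb.

    The tail now holds the last byte of the original data, so it takes
    over the eor bit, and the bit is cleared on the shortened head.
    """
    if not 0 < length < prev.end_seq - prev.seq:
        raise ValueError("split point must fall inside the skb")
    nxt = Skb(seq=prev.seq + length, end_seq=prev.end_seq, eor=prev.eor)
    prev.end_seq = nxt.seq
    prev.eor = False
    return nxt
```

With the 15330-byte MSG_EOR write from the script, splitting at 14600 bytes leaves the head without eor and puts the bit on the 730-byte tail, which is why the later 730-byte write is not appended to it.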
2016-04-28tcp: Handle eor bit when coalescing skbMartin KaFai Lau2-0/+8
This patch:
1. Prevent next_skb from coalescing to the prev_skb if TCP_SKB_CB(prev_skb)->eor is set
2. Update the TCP_SKB_CB(prev_skb)->eor if coalescing is allowed

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 write(4, ..., 11680) = 11680

0.200 > P. 1:731(730) ack 1
0.200 > P. 731:1461(730) ack 1
0.200 > . 1461:8761(7300) ack 1
0.200 > P. 8761:13141(4380) ack 1

0.300 < . 1:1(0) ack 1 win 257 <sack 1461:13141,nop,nop>
0.300 > P. 1:731(730) ack 1
0.300 > P. 731:1461(730) ack 1
0.400 < . 1:1(0) ack 13141 win 257

0.400 close(4) = 0
0.400 > F. 13141:13141(0) ack 1
0.500 < F. 1:1(0) ack 13142 win 257
0.500 > . 13142:13142(0) ack 2

Signed-off-by: Martin KaFai Lau <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Neal Cardwell <[email protected]>
Cc: Soheil Hassas Yeganeh <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
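The two coalescing rules listed in this commit can be sketched as a toy model. The Python below is an illustration only, not the kernel implementation: `Skb` and `try_coalesce` are hypothetical names, and the adjacency check is an assumption added so the model stays self-consistent.

```python
from dataclasses import dataclass


@dataclass
class Skb:
    """Toy stand-in for a TCP skb: a sequence range plus the eor bit."""
    seq: int
    end_seq: int
    eor: bool = False


def try_coalesce(prev: Skb, nxt: Skb) -> bool:
    """Merge `nxt` into `prev` if allowed; return True when merged."""
    if prev.eor:
        # Rule 1: prev ends an application message, so nothing may follow it.
        return False
    if prev.end_seq != nxt.seq:
        # Assumption of this sketch: only adjacent ranges are merged.
        return False
    prev.end_seq = nxt.end_seq
    # Rule 2: the merged skb now ends where nxt ended, so it inherits nxt's eor.
    prev.eor = nxt.eor
    return True
```

In the script, the two 730-byte MSG_EOR writes are retransmitted as two separate segments after the SACK, which matches rule 1: the first skb carries eor and so refuses to absorb the second.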
2016-04-28tcp: Make use of MSG_EOR in tcp_sendmsgMartin KaFai Lau1-2/+5
This patch adds an eor bit to the TCP_SKB_CB. When MSG_EOR is passed to tcp_sendmsg, the eor bit will be set at the skb containing the last byte of the userland's msg. The eor bit will prevent data from appending to that skb in the future.

The change in do_tcp_sendpages is to honor the eor set during the previous tcp_sendmsg(MSG_EOR) call.

This patch handles the tcp_sendmsg case. The followup patches will handle other skb coalescing and fragment cases.

One potential use case is to use MSG_EOR with SOF_TIMESTAMPING_TX_ACK to get a more accurate TCP ack timestamping on application protocol with multiple outgoing response messages (e.g. HTTP2).

Packetdrill script for testing:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0

0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0

0.200 write(4, ..., 14600) = 14600
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730
0.200 sendto(4, ..., 730, MSG_EOR, ..., ...) = 730

0.200 > . 1:7301(7300) ack 1
0.200 > P. 7301:14601(7300) ack 1
0.300 < . 1:1(0) ack 14601 win 257
0.300 > P. 14601:15331(730) ack 1
0.300 > P. 15331:16061(730) ack 1
0.400 < . 1:1(0) ack 16061 win 257
0.400 close(4) = 0
0.400 > F. 16061:16061(0) ack 1
0.400 < F. 1:1(0) ack 16062 win 257
0.400 > . 16062:16062(0) ack 2

Signed-off-by: Martin KaFai Lau <[email protected]>
Cc: Eric Dumazet <[email protected]>
Cc: Neal Cardwell <[email protected]>
Cc: Soheil Hassas Yeganeh <[email protected]>
Cc: Willem de Bruijn <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Acked-by: Soheil Hassas Yeganeh <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
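From userland, the flag described in this commit is passed like any other send flag. The following is a minimal sketch in Python, assuming a Linux host where `socket.MSG_EOR` is defined; `loopback_pair` and `send_message` are hypothetical helper names made up for this example. It does not set up SOF_TIMESTAMPING_TX_ACK.

```python
import socket


def loopback_pair():
    """Return a connected (client, server) pair of TCP sockets on 127.0.0.1."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server


def send_message(sock: socket.socket, payload: bytes) -> None:
    """Send one application message, marking its end with MSG_EOR.

    On a kernel with this patch, later writes are not appended to the skb
    that holds the last byte of `payload`.
    """
    sock.sendall(payload, socket.MSG_EOR)
```

The receiver still sees an ordinary byte stream: MSG_EOR only influences how the sender builds its skbs and does not add message boundaries on the wire. A kernel without this patch is expected to ignore the flag for TCP, although that behaviour is not stated in the commit message.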
2016-04-28tcp: remove SKBTX_ACK_TSTAMP since it is redundantSoheil Hassas Yeganeh4-13/+15
The SKBTX_ACK_TSTAMP flag is set in skb_shinfo->tx_flags when the timestamp of the TCP acknowledgement should be reported on error queue. Since accessing skb_shinfo is likely to incur a cache-line miss at the time of receiving the ack, the txstamp_ack bit was added in tcp_skb_cb, which is set iff the SKBTX_ACK_TSTAMP flag is set for an skb. This makes SKBTX_ACK_TSTAMP flag redundant. Remove the SKBTX_ACK_TSTAMP and instead use the txstamp_ack bit everywhere. Note that this frees one bit in shinfo->tx_flags. Signed-off-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Suggested-by: Willem de Bruijn <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-28tcp: remove an unnecessary check in tcp_tx_timestampSoheil Hassas Yeganeh1-1/+1
Remove the redundant check for sk->sk_tsflags in tcp_tx_timestamp. tcp_tx_timestamp() receives the tsflags as a parameter. As a result the "sk->sk_tsflags || tsflags" is redundant, since tsflags already includes sk->sk_tsflags plus overrides from control messages. Signed-off-by: Soheil Hassas Yeganeh <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: snmp: kill STATS_BH macrosEric Dumazet1-1/+1
There is nothing related to BH in SNMP counters anymore, since linux-3.0. Rename helpers to use __ prefix instead of _BH prefix, for contexts where preemption is disabled. This more closely matches convention used to update percpu variables. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27ipv6: kill ICMP6MSGIN_INC_STATS_BH()Eric Dumazet1-1/+1
IPv6 ICMP stats are atomics anyway. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27ipv6: rename IP6_UPD_PO_STATS_BH()Eric Dumazet1-2/+2
Rename IP6_UPD_PO_STATS_BH() to __IP6_UPD_PO_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27ipv6: rename IP6_INC_STATS_BH()Eric Dumazet6-89/+89
Rename IP6_INC_STATS_BH() to __IP6_INC_STATS() and IP6_ADD_STATS_BH() to __IP6_ADD_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename NET_{ADD|INC}_STATS_BH()Eric Dumazet23-145/+149
Rename NET_INC_STATS_BH() to __NET_INC_STATS() and NET_ADD_STATS_BH() to __NET_ADD_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename IP_UPD_PO_STATS_BH()Eric Dumazet1-3/+3
Rename IP_UPD_PO_STATS_BH() to __IP_UPD_PO_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename IP_ADD_STATS_BH()Eric Dumazet2-4/+4
Rename IP_ADD_STATS_BH() to __IP_ADD_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename ICMP6_INC_STATS_BH()Eric Dumazet5-14/+14
Rename ICMP6_INC_STATS_BH() to __ICMP6_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename IP_INC_STATS_BH()Eric Dumazet7-28/+28
Rename IP_INC_STATS_BH() to __IP_INC_STATS(), to better express this is used in non preemptible context. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: sctp: rename SCTP_INC_STATS_BH()Eric Dumazet1-6/+6
Rename SCTP_INC_STATS_BH() to __SCTP_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: icmp: rename ICMPMSGIN_INC_STATS_BH()Eric Dumazet1-1/+1
Remove misleading _BH suffix. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: tcp: rename TCP_INC_STATS_BHEric Dumazet6-24/+24
Rename TCP_INC_STATS_BH() to __TCP_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: udp: rename UDP_INC_STATS_BH()Eric Dumazet4-46/+46
Rename UDP_INC_STATS_BH() to __UDP_INC_STATS(), and UDP6_INC_STATS_BH() to __UDP6_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: rename ICMP_INC_STATS_BH()Eric Dumazet5-13/+13
Rename ICMP_INC_STATS_BH() to __ICMP_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27dccp: rename DCCP_INC_STATS_BH()Eric Dumazet7-16/+16
Rename DCCP_INC_STATS_BH() to __DCCP_INC_STATS() Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: snmp: kill various STATS_USER() helpersEric Dumazet4-44/+43
In the old days (before linux-3.0), SNMP counters were duplicated, one for user context, and one for BH context. After commit 8f0ea0fe3a03 ("snmp: reduce percpu needs by 50%") we have a single copy, and what really matters is preemption being enabled or disabled, since we use this_cpu_inc() or __this_cpu_inc() respectively. We therefore kill SNMP_INC_STATS_USER(), SNMP_ADD_STATS_USER(), NET_INC_STATS_USER(), NET_ADD_STATS_USER(), SCTP_INC_STATS_USER(), SNMP_INC_STATS64_USER(), SNMP_ADD_STATS64_USER(), TCP_ADD_STATS_USER(), UDP_INC_STATS_USER(), UDP6_INC_STATS_USER(), and XFRM_INC_STATS_USER() Following patches will rename __BH helpers to make clear their usage is not tied to BH being disabled. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller6-84/+112
Minor overlapping changes in the conflicts. In the macsec case, the change of the default ID macro name overlapped with the 64-bit netlink attribute alignment fixes in net-next. Signed-off-by: David S. Miller <[email protected]>
2016-04-27net: ipv6: Use passed in table for nexthop lookupsDavid Ahern1-2/+38
Similar to 3bfd847203c6 ("net: Use passed in table for nexthop lookups") for IPv4, if the route spec contains a table id use that to lookup the next hop first and fall back to a full lookup if it fails (per the fix 4c9bcd117918b ("net: Fix nexthop lookups")).

Example:

root@kenny:~# ip -6 ro ls table red
local 2100:1::1 dev lo proto none metric 0 pref medium
2100:1::/120 dev eth1 proto kernel metric 256 pref medium
local 2100:2::1 dev lo proto none metric 0 pref medium
2100:2::/120 dev eth2 proto kernel metric 256 pref medium
local fe80::e0:f9ff:fe09:3cac dev lo proto none metric 0 pref medium
local fe80::e0:f9ff:fe1c:b974 dev lo proto none metric 0 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev eth2 proto kernel metric 256 pref medium
ff00::/8 dev red metric 256 pref medium
ff00::/8 dev eth1 metric 256 pref medium
ff00::/8 dev eth2 metric 256 pref medium
unreachable default dev lo metric 240 error -113 pref medium

root@kenny:~# ip -6 ro add table red 2100:3::/64 via 2100:1::64
RTNETLINK answers: No route to host

Route add fails even though 2100:1::64 is a reachable next hop:

root@kenny:~# ping6 -I red 2100:1::64
ping6: Warning: source address might be selected on device other than red.
PING 2100:1::64(2100:1::64) from 2100:1::1 red: 56 data bytes
64 bytes from 2100:1::64: icmp_seq=1 ttl=64 time=1.33 ms

With this patch:

root@kenny:~# ip -6 ro add table red 2100:3::/64 via 2100:1::64
root@kenny:~# ip -6 ro ls table red
local 2100:1::1 dev lo proto none metric 0 pref medium
2100:1::/120 dev eth1 proto kernel metric 256 pref medium
local 2100:2::1 dev lo proto none metric 0 pref medium
2100:2::/120 dev eth2 proto kernel metric 256 pref medium
2100:3::/64 via 2100:1::64 dev eth1 metric 1024 pref medium
local fe80::e0:f9ff:fe09:3cac dev lo proto none metric 0 pref medium
local fe80::e0:f9ff:fe1c:b974 dev lo proto none metric 0 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium
fe80::/64 dev eth2 proto kernel metric 256 pref medium
ff00::/8 dev red metric 256 pref medium
ff00::/8 dev eth1 metric 256 pref medium
ff00::/8 dev eth2 metric 256 pref medium
unreachable default dev lo metric 240 error -113 pref medium

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
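The lookup order described in this commit (named table first, then a full lookup) can be sketched with a toy routing model. This Python is only an illustration: `lookup` and `resolve_nexthop` are hypothetical names, tables are plain dictionaries, and the "full lookup" is approximated by a single fallback table, which is a simplifying assumption rather than how the kernel walks its rules.

```python
import ipaddress


def lookup(table: dict, addr: str):
    """Longest-prefix match of `addr` in one table; return the device or None."""
    ip = ipaddress.ip_address(addr)
    best = None
    for prefix, dev in table.items():
        net = ipaddress.ip_network(prefix)
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev)
    return best[1] if best else None


def resolve_nexthop(tables: dict, gateway: str, table_id=None, fallback="main"):
    """Resolve the gateway in the route's own table first, then fall back."""
    if table_id is not None:
        dev = lookup(tables.get(table_id, {}), gateway)
        if dev is not None:
            return dev
    # Stand-in for the full lookup: consult only the fallback table.
    return lookup(tables.get(fallback, {}), gateway)
```

With the connected routes of table red from the example, the gateway 2100:1::64 resolves to eth1 when the table id is honoured, and resolves to nothing when only a main table without those routes is consulted, which mirrors the "No route to host" failure shown before the patch.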
2016-04-27nl80211: use nla_put_u64_64bit() for the remaining u64 attributesJohannes Berg1-14/+22
Nicolas converted most users, but didn't realize some were generated by macros. Convert those over as well. Signed-off-by: Johannes Berg <[email protected]> Acked-by: Nicolas Dichtel <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-04-27mac80211: fix statistics leak if dev_alloc_name() failsJohannes Berg1-2/+2
In the case that dev_alloc_name() fails, e.g. because the name was given by the user and already exists, we need to clean up properly and free the per-CPU statistics. Fix that. Cc: [email protected] Fixes: 5a490510ba5f ("mac80211: use per-CPU TX/RX statistics") Signed-off-by: Johannes Berg <[email protected]>
2016-04-26net: remove NETDEV_TX_LOCKED supportFlorian Westphal3-34/+2
No more users in the tree, remove NETDEV_TX_LOCKED support. Adds another hole in softnet_stats struct, but better than keeping the unused collision counter around. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26sctp: sctp_diag should fill RMEM_ALLOC with asoc->rmem_alloc when rcvbuf_policy is setXin Long1-1/+5
For an sctp assoc, when rcvbuf_policy is set, it has its own rmem_alloc. When we dump asoc info in sctp_diag, we should use that value for RMEM_ALLOC as well, just like WMEM_ALLOC. Signed-off-by: Xin Long <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-nextDavid S. Miller10-147/+142
Johan Hedberg says:

====================
pull request: bluetooth-next 2016-04-26

Here's another set of Bluetooth & 802.15.4 patches for the 4.7 kernel:

 - Cleanups & refactoring of ieee802154 & 6lowpan code
 - Security related additions to ieee802154 and mrf24j40 driver
 - Memory corruption fix to Bluetooth 6lowpan code
 - Race condition fix in vhci driver
 - Enhancements to the atusb 802.15.4 driver

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <[email protected]>
2016-04-26sched: align nlattr properly when neededNicolas Dichtel16-33/+48
Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26neigh: align nlattr properly when neededNicolas Dichtel1-1/+2
Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26rtnl: align nlattr properly when neededNicolas Dichtel1-2/+2
Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26ovs: align nlattr properly when neededNicolas Dichtel1-12/+15
I also fix commit 8b32ab9e6ef1: use nla_total_size_64bit() for OVS_FLOW_ATTR_USED in ovs_flow_cmd_msg_size(). Fixes: 8b32ab9e6ef1 ("ovs: use nla_put_u64_64bit()") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26sock_diag: align nlattr properly when neededNicolas Dichtel3-6/+10
I also fix the value of INET_DIAG_MAX. It's wrong since commit 8f840e47f190 which is only in net-next right now, thus I didn't make a separate patch. Fixes: 8f840e47f190 ("sctp: add the sctp_diag.c file") Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26net: ipv6: Delete host routes on an ifdownDavid Ahern1-33/+15
It was a simple idea -- save IPv6 configured addresses on a link down so that IPv6 behaves similar to IPv4. As always the devil is in the details and the IPv6 stack has too many behavioral differences from IPv4 making the simple idea more complicated than it needs to be.

The current implementation for keeping IPv6 addresses can panic or spit out a warning in one of many paths:

1. IPv6 route gets an IPv4 route as its 'next' which causes a panic in rt6_fill_node while handling a route dump request.
2. rt->dst.obsolete is set to DST_OBSOLETE_DEAD hitting the WARN_ON in fib6_del
3. Panic in fib6_purge_rt because rt6i_ref count is not 1.

The root cause of all these is references related to the host route for an address that is retained.

So, this patch deletes the host route every time the ifdown loop runs. Since the host route is deleted and will be re-generated on up there is no longer a need for the l3mdev fix up. On the 'admin up' side move addrconf_permanent_addr into the NETDEV_UP event handling so that it runs only once versus on UP and CHANGE events.

All of the current panics and warnings appear to be related to addresses on the loopback device, but given the catastrophic nature when a bug is triggered this patch takes the conservative approach and evicts all host routes rather than trying to determine when it can be re-used and when it can not. That can be a later optimization if desired.

Signed-off-by: David Ahern <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
2016-04-26Revert "ipv6: Revert optional address flusing on ifdown."David S. Miller1-12/+150
This reverts commit 841645b5f2dfceac69b78fcd0c9050868d41ea61. Ok, this puts the feature back. I've decided to apply David A.'s bug fix and run with that rather than make everyone wait another whole release for this feature. Signed-off-by: David S. Miller <[email protected]>
2016-04-26cfg80211: Add option to report the bss entry in connect resultKanchanapally, Vidyullatha3-7/+24
Since cfg80211 maintains separate BSS table entries for APs if the same BSSID, SSID pair is seen on multiple channels, it is possible that it can map the current_bss to a BSS entry on the wrong channel. This current_bss will not get flushed unless disconnected and cfg80211 reports a wrong channel as the associated channel. Fix this by introducing a new cfg80211_connect_bss() function which is similar to cfg80211_connect_result(), but it includes an additional parameter: the bss the STA is connected to. This allows drivers to provide the exact bss entry that matches the BSS to which the connection was completed. Reviewed-by: Jouni Malinen <[email protected]> Signed-off-by: Vidyullatha Kanchanapally <[email protected]> Signed-off-by: Sunil Dutt <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-04-26cfg80211/nl80211: Add support for NL80211_STA_INFO_RX_DURATIONMohammed Shafi Shajakhan1-1/+2
Add support for a station statistics netlink attribute: NL80211_STA_INFO_RX_DURATION. If present, this attribute contains the aggregate PPDU duration (in microseconds) for all the frames from the peer. This is useful to help understand the total time spent transmitting to us by all of the connected peers. Signed-off-by: Mohammed Shafi Shajakhan <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-04-26ila: add checksum neutral ILA translationsTom Herbert4-15/+105
Support checksum neutral ILA as described in the ILA draft. The low order 16 bits of the identifier are used to contain the checksum adjustment value.

The csum-mode parameter is added to describe checksum processing. There are three values:
 - adjust transport checksum (previous behavior)
 - do checksum neutral mapping
 - do nothing

On output the csum-mode in the ila_params is checked and acted on. If the mode is checksum neutral mapping then do the mapping and set the C-bit.

On input, the C-bit is checked. If it is set, checksum-neutral mapping is done (regardless of csum-mode in ila params) and the C-bit will be cleared. If it is not set then the action in csum-mode is taken.

Signed-off-by: Tom Herbert <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
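The idea behind checksum neutral mapping can be shown with ones' complement arithmetic. The Python below is a sketch under stated assumptions, not the kernel code: `csum16` and `neutral_translate` are hypothetical names, the address is handled as 16 raw bytes, and the C-bit handling described above is deliberately left out.

```python
def csum16(data: bytes) -> int:
    """Ones' complement sum of big-endian 16-bit words, folded to 16 bits."""
    total = sum(int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def neutral_translate(addr: bytes, new_locator: bytes) -> bytes:
    """Swap the 64-bit locator and compensate in the low 16 bits of the identifier."""
    if len(addr) != 16 or len(new_locator) != 8:
        raise ValueError("expected a 16-byte address and an 8-byte locator")
    old_locator = addr[:8]
    low = int.from_bytes(addr[14:16], "big")
    # Ones' complement "old minus new": add the old sum and the complement of the new sum.
    diff = csum16(old_locator) + (~csum16(new_locator) & 0xFFFF)
    adjusted = low + diff
    while adjusted >> 16:
        adjusted = (adjusted & 0xFFFF) + (adjusted >> 16)
    return new_locator + addr[8:14] + adjusted.to_bytes(2, "big")
```

Because the adjustment cancels the change made to the locator, the ones' complement sum over the whole address stays the same, so a transport checksum that covers the address through the pseudo header remains valid without being rewritten. Applying the function again with the original locator restores the original low 16 bits in the example checked here.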
2016-04-26ila: xlat changesTom Herbert1-69/+34
Change model of xlat to be used only for input where lookup is done on the locator part of an address (comparing to locator_match as key in rhashtable). This is needed for checksum neutral translation which obfuscates the low order 16 bits of the identifier. It also permits hosts to be in multiple ILA domains (each locator can map to a different SIR address). A check is also added to disallow translating non-ILA addresses (check of type in identifier). Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-26ila: Add struct definitions and helpersTom Herbert4-82/+161
Add structures for identifiers, locators, and an ila address which is composed of a locator and identifier and in6_addr can be cast to it. This includes a three bit type field and enums for the types defined in ILA I-D. In ILA lwt don't allow user to set a translation for a non-ILA address (type of identifier is zero meaning it is an IID). This also requires that the destination prefix is at least 65 bits (64 bit locator and first byte of identifier). Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
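The address layout described in this commit can be sketched as follows. This Python is illustrative only: `split_ila` and `is_translatable` are hypothetical names, and placing the three bit type field in the top bits of the first identifier byte is an assumption of the sketch that the commit message does not spell out.

```python
import ipaddress


def split_ila(addr: str):
    """Split an IPv6 address into (locator, identifier, type).

    The locator is the upper 64 bits and the identifier the lower 64 bits.
    Assumption: the type is the top three bits of the first identifier byte.
    """
    packed = ipaddress.IPv6Address(addr).packed
    locator, identifier = packed[:8], packed[8:]
    id_type = identifier[0] >> 5
    return locator, identifier, id_type


def is_translatable(addr: str) -> bool:
    """Type zero means a plain interface identifier (IID), which is not translated."""
    return split_ila(addr)[2] != 0
```

An address whose identifier starts with a zero byte, such as one ending in ::1, has type zero under this layout and would be rejected for translation.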
2016-04-26Bluetooth: 6lowpan: Fix memory corruption of ipv6 destination addressGlenn Ruben Bakke1-7/+4
The memcpy of ipv6 header destination address to the skb control block (skb->cb) in header_create() results in corrupted memory when bt_xmit() is issued. The skb->cb is "released" in the return of header_create() making room for lower layer to manipulate the skb->cb. The value retrieved in bt_xmit is not persistent across header creation and sending, and the lower layer will overwrite portions of skb->cb, making the copied destination address wrong. The memory corruption will lead to non-working multicast as the first 4 bytes of the copied destination address are replaced by a value that resolves into a non-multicast prefix. This fix removes the dependency on the skb control block between header creation and send, by moving the destination address memcpy to the send function path (setup_create, which is called from bt_xmit). Signed-off-by: Glenn Ruben Bakke <[email protected]> Acked-by: Jukka Rissanen <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Cc: [email protected] # 4.5+
2016-04-25RDS: TCP: Call pskb_extract() helper functionSowmini Varadhan1-11/+3
rds-stress experiments with request size 256 bytes, 8K acks, using 16 threads show a 40% improvement when pskb_extract() replaces the {skb_clone(..); pskb_pull(..); pskb_trim(..);} pattern in the Rx path, so we leverage the perf gain with this commit. Signed-off-by: Sowmini Varadhan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25skbuff: Add pskb_extract() helper functionSowmini Varadhan1-0/+242
A pattern of skb usage seen in modules such as RDS-TCP is to extract `to_copy' bytes from the received TCP segment, starting at some offset `off' into a new skb `clone'. This is done in the ->data_ready callback, where the clone skb is queued up for rx on the PF_RDS socket, while the parent TCP segment is returned unchanged back to the TCP engine. The existing code uses the sequence clone = skb_clone(..); pskb_pull(clone, off, ..); pskb_trim(clone, to_copy, ..); with the intention of discarding the first `off' bytes. However, skb_clone() + pskb_pull() implies pskb_expand_head(), which ends up doing a redundant memcpy of bytes that will then get discarded in __pskb_pull_tail(). To avoid this inefficiency, this commit adds pskb_extract() that creates the clone, and memcpy's only the relevant header/frag/frag_list to the start of `clone'. pskb_trim() is then invoked to trim clone down to the requested to_copy bytes. Signed-off-by: Sowmini Varadhan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25codel: split into multiple filesMichal Kazior2-0/+4
It was impossible to include codel.h for the purpose of having access to codel_params or codel_vars structure definitions and using them for embedding in other more complex structures. This split allows codel.h itself to be treated like any other header file while codel_qdisc.h and codel_impl.h contain function definitions with logic that was previously in codel.h. This copies over copyrights and doesn't involve code changes other than adding a few additional include directives to net/sched/sch*codel.c. Signed-off-by: Michal Kazior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25codel: generalize the implementationMichal Kazior2-7/+32
This strips out qdisc specific bits from the code and makes it slightly more reusable. Codel will be used by wireless/mac80211 in the future. Signed-off-by: Michal Kazior <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25net: better drop monitoring in ip{6}_recv_error()Eric Dumazet2-10/+10
We should call consume_skb(skb) when skb is properly consumed, or kfree_skb(skb) when skb must be dropped in error case. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25tcp: SYN packets are now simply consumedEric Dumazet1-18/+1
We now have proper per-listener but also per network namespace counters for SYN packets that might be dropped. We replace the kfree_skb() by consume_skb() to be drop monitor [1] friendly, and remove an obsolete comment. FastOpen SYN packets can carry payload in them just fine. [1] perf record -a -g -e skb:kfree_skb sleep 1; perf report Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25ipv6: Revert optional address flusing on ifdown.David S. Miller1-150/+12
This reverts the following three commits: 70af921db6f8835f4b11c65731116560adb00c14 799977d9aafbf0ca0b9c39b04cbfb16db71302c9 f1705ec197e705b79ea40fe7a2cc5acfa1d3bfac The feature was ill conceived, has terrible semantics, and has added nothing but regressions to the already fragile ipv6 stack. Fixes: f1705ec197e7 ("net: ipv6: Make address flushing on ifdown optional") Signed-off-by: David S. Miller <[email protected]>
2016-04-25wireless: use nla_put_u64_64bit()Nicolas Dichtel1-36/+55
Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-04-25netfilter/ipvs: use nla_put_u64_64bit()Nicolas Dichtel1-12/+24
Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: David S. Miller <[email protected]>