aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-01-15Bluetooth: Bind the SMP channel registration to management power stateMarcel Holtmann2-3/+12
When the controller gets powered on via the management interface, then register the supported SMP channels. There is no point in registering these channels earlier since it is not know what identity address the controller is going to operate with. When powering down a controller unregister all SMP channels. This is required since a powered down controller is allowed to change its identity address. In addition the SMP channels are only available when the controller is powered via the management interface. When using legacy ioctl, then Bluetooth Low Energy is not supported and registering kernel side SMP integration may actually cause confusion. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15Bluetooth: Don't register any SMP channel if LE is not supportedMarcel Holtmann1-0/+6
When LE features are not supported, then do not bother registering any kind of SMP channel. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15Bluetooth: Fix LE SMP channel source address and source address typeMarcel Holtmann1-4/+23
The source address and source address type of the LE SMP channel can either be the public address of the controller or the static random address configured by the host. Right now the public address is used for the LE SMP channel and obviously that is not correct if the controller operates with the configured static random address. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15Bluetooth: Fix issue with switching BR/EDR back on when disabledMarcel Holtmann1-0/+15
For dual-mode controllers it is possible to disable BR/EDR and operate as LE single mode controllers with a static random address. If that is the case, then refuse switching BR/EDR back on after the controller has been powered. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15Bluetooth: Show device address type for L2CAP debugfs entriesMarcel Holtmann1-2/+2
The devices address types are BR/EDR Public, LE Public and LE Random and any of these three is valid for L2CAP connections. So show the correct type in the debugfs list. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller7-133/+170
Pablo Neira Ayuso says: ==================== netfilter updates for net-next The following patchset contains netfilter updates for net-next, just a bunch of cleanups and small enhancement to selectively flush conntracks in ctnetlink, more specifically the patches are: 1) Rise default number of buckets in conntrack from 16384 to 65536 in systems with >= 4GBytes, patch from Marcelo Leitner. 2) Small refactor to save one level on indentation in xt_osf, from Joe Perches. 3) Remove unnecessary sizeof(char) in nf_log, from Fabian Frederick. 4) Another small cleanup to remove redundant variable in nfnetlink, from Duan Jiong. 5) Fix compilation warning in nfnetlink_cthelper on parisc, from Chen Gang. 6) Fix wrong format in debugging for ctseqadj, from Gao feng. 7) Selective conntrack flushing through the mark for ctnetlink, patch from Kristian Evensen. 8) Remove nf_ct_conntrack_flush_report() exported symbol now that is not required anymore after the selective flushing patch, again from Kristian. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-01-15openvswitch: Support VXLAN Group Policy extensionThomas Graf4-19/+203
Introduces support for the group policy extension to the VXLAN virtual port. The extension is disabled by default and only enabled if the user has provided the respective configuration. ovs-vsctl add-port br0 vxlan0 -- \ set Interface vxlan0 type=vxlan options:exts=gbp The configuration interface to enable the extension is based on a new attribute OVS_VXLAN_EXT_GBP nested inside OVS_TUNNEL_ATTR_EXTENSION which can carry additional extensions as needed in the future. The group policy metadata is stored as binary blob (struct ovs_vxlan_opts) internally just like Geneve options but transported as nested Netlink attributes to user space. Renames the existing TUNNEL_OPTIONS_PRESENT to TUNNEL_GENEVE_OPT with the binary value kept intact, a new flag TUNNEL_VXLAN_OPT is introduced. The attributes OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS and existing OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS are implemented mutually exclusive. Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15openvswitch: Allow for any level of nesting in flow attributesThomas Graf1-50/+56
nlattr_set() is currently hardcoded to two levels of nesting. This change introduces struct ovs_len_tbl to define minimal length requirements plus next level nesting tables to traverse the key attributes to arbitrary depth. Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()Thomas Graf3-41/+47
Also factors out Geneve validation code into a new separate function validate_and_copy_geneve_opts(). A subsequent patch will introduce VXLAN options. Rename the existing GENEVE_TUN_OPTS() to reflect its extended purpose of carrying generic tunnel metadata options. Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15vxlan: Group Policy extensionThomas Graf1-3/+6
Implements supports for the Group Policy VXLAN extension [0] to provide a lightweight and simple security label mechanism across network peers based on VXLAN. The security context and associated metadata is mapped to/from skb->mark. This allows further mapping to a SELinux context using SECMARK, to implement ACLs directly with nftables, iptables, OVS, tc, etc. The group membership is defined by the lower 16 bits of skb->mark, the upper 16 bits are used for flags. SELinux allows to manage label to secure local resources. However, distributed applications require ACLs to implemented across hosts. This is typically achieved by matching on L2-L4 fields to identify the original sending host and process on the receiver. On top of that, netlabel and specifically CIPSO [1] allow to map security contexts to universal labels. However, netlabel and CIPSO are relatively complex. This patch provides a lightweight alternative for overlay network environments with a trusted underlay. No additional control protocol is required. Host 1: Host 2: Group A Group B Group B Group A +-----+ +-------------+ +-------+ +-----+ | lxc | | SELinux CTX | | httpd | | VM | +--+--+ +--+----------+ +---+---+ +--+--+ \---+---/ \----+---/ | | +---+---+ +---+---+ | vxlan | | vxlan | +---+---+ +---+---+ +------------------------------+ Backwards compatibility: A VXLAN-GBP socket can receive standard VXLAN frames and will assign the default group 0x0000 to such frames. A Linux VXLAN socket will drop VXLAN-GBP frames. The extension is therefore disabled by default and needs to be specifically enabled: ip link add [...] type vxlan [...] gbp In a mixed environment with VXLAN and VXLAN-GBP sockets, the GBP socket must run on a separate port number. Examples: iptables: host1# iptables -I OUTPUT -m owner --uid-owner 101 -j MARK --set-mark 0x200 host2# iptables -I INPUT -m mark --mark 0x200 -j DROP OVS: # ovs-ofctl add-flow br0 'in_port=1,actions=load:0x200->NXM_NX_TUN_GBP_ID[],NORMAL' # ovs-ofctl add-flow br0 'in_port=2,tun_gbp_id=0x200,actions=drop' [0] https://tools.ietf.org/html/draft-smith-vxlan-group-policy [1] http://lwn.net/Articles/204905/ Signed-off-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller14-43/+92
Conflicts: drivers/net/xen-netfront.c Minor overlapping changes in xen-netfront.c, mostly to do with some buffer management changes alongside the split of stats into TX and RX. Signed-off-by: David S. Miller <[email protected]>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds12-40/+90
Pull networking fixes from David Miller: 1) Don't use uninitialized data in IPVS, from Dan Carpenter. 2) conntrack race fixes from Pablo Neira Ayuso. 3) Fix TX hangs with i40e, from Jesse Brandeburg. 4) Fix budget return from poll calls in dnet and alx, from Eric Dumazet. 5) Fix bugus "if (unlikely(x) < 0)" test in AF_PACKET, from Christoph Jaeger. 6) Fix bug introduced by conversion to list_head in TIPC retransmit code, from Jon Paul Maloy. 7) Don't use GFP_NOIO under spinlock in USB kaweth driver, from Alexey Khoroshilov. 8) Fix bridge build with INET disabled, from Arnd Bergmann. 9) Fix netlink array overrun for PROBE attributes in openvswitch, from Thomas Graf. 10) Don't hold spinlock across synchronize_irq() in tg3 driver, from Prashant Sreedharan. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (44 commits) tg3: Release tp->lock before invoking synchronize_irq() tg3: tg3_reset_task() needs to use rtnl_lock to synchronize tg3: tg3_timer() should grab tp->lock before checking for tp->irq_sync team: avoid possible underflow of count_pending value for notify_peers and mcast_rejoin openvswitch: packet messages need their own probe attribtue i40e: adds FCoE configure option cxgb4vf: Fix queue allocation for 40G adapter netdevice: Add missing parentheses in macro bridge: only provide proxy ARP when CONFIG_INET is enabled neighbour: fix base_reachable_time(_ms) not effective immediatly when changed net: fec: fix MDIO bus assignement for dual fec SoC's xen-netfront: use different locks for Rx and Tx stats drivers: net: cpsw: fix multicast flush in dual emac mode cxgb4vf: Initialize mdio_addr before using it net: Corrected the comment describing the ndo operations to reflect the actual prototype for couple of operations usb/kaweth: use GFP_ATOMIC under spin_lock in usb_start_wait_urb() MAINTAINERS: add me as ibmveth maintainer tipc: fix bug in broadcast retransmit code update ip-sysctl.txt documentation (v2) net/at91_ether: prepare and unprepare clock ...
2015-01-14openvswitch: packet messages need their own probe attribtueThomas Graf1-1/+2
User space is currently sending a OVS_FLOW_ATTR_PROBE for both flow and packet messages. This leads to an out-of-bounds access in ovs_packet_cmd_execute() because OVS_FLOW_ATTR_PROBE > OVS_PACKET_ATTR_MAX. Introduce a new OVS_PACKET_ATTR_PROBE with the same numeric value as OVS_FLOW_ATTR_PROBE to grow the range of accepted packet attributes while maintaining to be binary compatible with existing OVS binaries. Fixes: 05da589 ("openvswitch: Add support for OVS_FLOW_ATTR_PROBE.") Reported-by: Sander Eikelenboom <[email protected]> Tracked-down-by: Florian Westphal <[email protected]> Signed-off-by: Thomas Graf <[email protected]> Reviewed-by: Jesse Gross <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-14Bluetooth: 6lowpan: Remove PSM setting codeJukka Rissanen1-35/+31
Removing PSM setting debugfs interface as the IPSP has a well defined PSM value that should be used. The patch introduces enable flag that can be used to toggle 6lowpan on/off. Signed-off-by: Jukka Rissanen <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-01-14Bluetooth: Fix valid Identity Address checkJohan Hedberg1-1/+5
According to the Bluetooth core specification valid identity addresses are either Public Device Addresses or Static Random Addresses. IRKs received with any other type of address should be discarded since we cannot assume to know the permanent identity of the peer device. This patch fixes a missing check for the Identity Address when receiving the Identity Address Information SMP PDU. Signed-off-by: Johan Hedberg <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Cc: [email protected] # 3.17+
2015-01-14ipv6:icmp:remove unnecessary bracketszhuyj1-1/+1
There are too many brackets. Maybe only one bracket is enough. Signed-off-by: Zhu Yanjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-14openvswitch: Introduce ovs_tunnel_route_lookupFan Du5-39/+25
Introduce ovs_tunnel_route_lookup to consolidate route lookup shared by vxlan, gre, and geneve ports. Signed-off-by: Fan Du <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-14udp: pass udp_offload struct to UDP gro callbacksTom Herbert3-8/+17
This patch introduces udp_offload_callbacks which has the same GRO functions (but not a GSO function) as offload_callbacks, except there is an argument to a udp_offload struct passed to gro_receive and gro_complete functions. This additional argument can be used to retrieve the per port structure of the encapsulation for use in gro processing (mostly by doing container_of on the structure). Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-14bridge: only provide proxy ARP when CONFIG_INET is enabledArnd Bergmann1-1/+2
When IPV4 support is disabled, we cannot call arp_send from the bridge code, which would result in a kernel link error: net/built-in.o: In function `br_handle_frame_finish': :(.text+0x59914): undefined reference to `arp_send' :(.text+0x59a50): undefined reference to `arp_tbl' This makes the newly added proxy ARP support in the bridge code depend on the CONFIG_INET symbol and lets the compiler optimize the code out to avoid the link error. Signed-off-by: Arnd Bergmann <[email protected]> Fixes: 958501163ddd ("bridge: Add support for IEEE 802.11 Proxy ARP") Cc: Kyeyoon Park <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-14Bluetooth: Remove dead codeGowtham Anandha Babu1-6/+0
Variable 'controller' is assigned a value that is never used. Identified by cppcheck tool. Signed-off-by: Gowtham Anandha Babu <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-14nl80211: send netdetect configuration info in NL80211_CMD_GET_WOWLANLuciano Coelho1-0/+52
Send the netdetect configuration information in the response to NL8021_CMD_GET_WOWLAN commands. This includes the scan interval, SSIDs to match and frequencies to scan. Additionally, add the NL80211_WOWLAN_TRIG_NET_DETECT with NL80211_ATTR_WOWLAN_TRIGGERS_SUPPORTED. Signed-off-by: Luciano Coelho <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14cfg80211: avoid reg-hints in self-managed only systemsArik Nemtsov1-0/+25
When a system contains only self-managed regulatory devices all hints from the regulatory core are ignored. Stop hint processing early in this case. These systems usually don't have CRDA deployed, which results in endless (irrelevent) logs of the form: cfg80211: Calling CRDA to update world regulatory domain Make sure there's at least one self-managed device before discarding a hint, in order to prevent initial hints from disappearing on CRDA managed systems. Signed-off-by: Arik Nemtsov <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14cfg80211: introduce sync regdom set API for self-managedArik Nemtsov1-2/+29
A self-managed device will sometimes need to set its regdomain synchronously. Notably it should be set before usermode has a chance to query it. Expose a new API to accomplish this which requires the RTNL. Signed-off-by: Arik Nemtsov <[email protected]> Reviewed-by: Ilan Peer <[email protected]> Reviewed-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14mac80211: don't defer scans in case of radar detectionEliad Peller1-1/+1
Radar detection can last indefinite time. There is no point in deferring a scan request in this case - simply return -EBUSY. Signed-off-by: Eliad Peller <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14mac80211: consider only relevant vifs for radar_required calculationEliad Peller1-2/+30
ctx->conf.radar_enabled should reflect whether radar detection is enabled for the channel context. When calculating it, make it consider only the vifs that have this context assigned (instead of all the vifs). Signed-off-by: Eliad Peller <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14mac80211: remove local->radar_detect_enabledEliad Peller4-7/+5
local->radar_detect_enabled should tell whether radar_detect is enabled on any interface belonging to local. However, it's not getting updated correctly in many cases (actually, when testing with hwsim it's never been set, even when the dfs master is beaconing). Instead of handling all the corner cases (e.g. channel switch), simply check whether radar detection is enabled only when needed, instead of caching the result. Signed-off-by: Eliad Peller <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14mac80211: add TDLS supported channels correctlyArik Nemtsov1-5/+21
The function adding the supported channels IE during a TDLS connection had several issues: 1. If the entire subband is usable, the function exitted the loop without adding it 2. The function only checked chandef_usable, ignoring flags like RADAR which would prevent TDLS off-channel communcation. 3. HT20 was explicitly required in the chandef, while not a requirement for TDLS off-channel. Signed-off-by: Arik Nemtsov <[email protected]> Reviewed-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14mac80211: let flush() drop packets when possibleEmmanuel Grumbach10-22/+27
When roaming / suspending, it makes no sense to wait until the transmit queues of the device are empty. In extreme condition they can be starved (VO saturating the air), but even in regular cases, it is pointless to delay the roaming because the low level driver is trying to send packets to an AP which is far away. We'd rather drop these packets and let TCP retransmit if needed. This will allow to speed up the roaming. For suspend, the explanation is even more trivial. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2015-01-14Bluetooth: Use %llu for printing duration details of selftestsMarcel Holtmann2-2/+2
The duration variable for the selftests is unsigned long long and with that use %llu instead of %lld when printing the results. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-14Bluetooth: Move Delete Stored Link Key to 4th phase of initializationMarcel Holtmann1-23/+23
This moves the execution of Delete Stored Link Key command to the hci_init4_req phase. No actual code has been changed. The command is just executed at a later stage of the initialization. Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Johan Hedberg <[email protected]>
2015-01-14neighbour: fix base_reachable_time(_ms) not effective immediatly when changedJean-Francois Remy1-0/+44
When setting base_reachable_time or base_reachable_time_ms on a specific interface through sysctl or netlink, the reachable_time value is not updated. This means that neighbour entries will continue to be updated using the old value until it is recomputed in neigh_period_work (which recomputes the value every 300*HZ). On systems with HZ equal to 1000 for instance, it means 5mins before the change is effective. This patch changes this behavior by recomputing reachable_time after each set on base_reachable_time or base_reachable_time_ms. The new value will become effective the next time the neighbour's timer is triggered. Changes are made in two places: the netlink code for set and the sysctl handling code. For sysctl, I use a proc_handler. The ipv6 network code does provide its own handler but it already refreshes reachable_time correctly so it's not an issue. Any other user of neighbour which provide its own handlers must refresh reachable_time. Signed-off-by: Jean-Francois Remy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13net: rename vlan_tx_* helpers since "tx" is misleading thereJiri Pirko18-41/+42
The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13net: sched: fix skb->protocol use in case of accelerated vlan pathJiri Pirko7-13/+13
tc code implicitly considers skb->protocol even in case of accelerated vlan paths and expects vlan protocol type here. However, on rx path, if the vlan header was already stripped, skb->protocol contains value of next header. Similar situation is on tx path. So for skbs that use skb->vlan_tci for tagging, use skb->vlan_proto instead. Reported-by: Jamal Hadi Salim <[email protected]> Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13tipc: correctly handle releasing a not fully initialized sockSasha Levin1-2/+5
Commit f2f9800d4955 "tipc: make tipc node table aware of net namespace" has added a dereference of sock->sk before making sure it's not NULL, which makes releasing a tipc socket NULL pointer dereference for sockets that are not fully initialized. Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13tipc: remove redundant timer defined in tipc_sock structYing Xue1-9/+7
Remove the redundant timer defined in tipc_sock structure, instead we can directly reuse the sk_timer defined in sock structure. Signed-off-by: Ying Xue <[email protected]> Acked-by: Erik Hugne <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13bridge: fix uninitialized variable warningRoopa Prabhu1-7/+9
net/bridge/br_netlink.c: In function ‘br_fill_ifinfo’: net/bridge/br_netlink.c:146:32: warning: ‘vid_range_flags’ may be used uninitialized in this function [-Wmaybe-uninitialized] err = br_fill_ifvlaninfo_range(skb, vid_range_start, ^ net/bridge/br_netlink.c:108:6: note: ‘vid_range_flags’ was declared here u16 vid_range_flags; Reported-by: Thomas Graf <[email protected]> Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13openvswitch: Remove unnecessary version.h inclusionSyam Sidhardhan1-2/+0
version.h inclusion is not necessary as detected by versioncheck. Signed-off-by: Syam Sidhardhan <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13tcp: avoid reducing cwnd when ACK+DSACK is receivedSébastien Barré1-19/+19
With TLP, the peer may reply to a probe with an ACK+D-SACK, with ack value set to tlp_high_seq. In the current code, such ACK+DSACK will be missed and only at next, higher ack will the TLP episode be considered done. Since the DSACK is not present anymore, this will cost a cwnd reduction. This patch ensures that this scenario does not cause a cwnd reduction, since receiving an ACK+DSACK indicates that both the initial segment and the probe have been received by the peer. The following packetdrill test, from Neal Cardwell, validates this patch: // Establish a connection. 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1000,sackOK,nop,nop,nop,wscale 7> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 6> +.020 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 // Send 1 packet. +0 write(4, ..., 1000) = 1000 +0 > P. 1:1001(1000) ack 1 // Loss probe retransmission. // packets_out == 1 => schedule PTO in max(2*RTT, 1.5*RTT + 200ms) // In this case, this means: 1.5*RTT + 200ms = 230ms +.230 > P. 1:1001(1000) ack 1 +0 %{ assert tcpi_snd_cwnd == 10 }% // Receiver ACKs at tlp_high_seq with a DSACK, // indicating they received the original packet and probe. +.020 < . 1:1(0) ack 1001 win 257 <sack 1:1001,nop,nop> +0 %{ assert tcpi_snd_cwnd == 10 }% // Send another packet. +0 write(4, ..., 1000) = 1000 +0 > P. 1001:2001(1000) ack 1 // Receiver ACKs above tlp_high_seq, which should end the TLP episode // if we haven't already. We should not reduce cwnd. +.020 < . 1:1(0) ack 2001 win 257 +0 %{ assert tcpi_snd_cwnd == 10, tcpi_snd_cwnd }% Credits: -Gregory helped in finding that tcp_process_tlp_ack was where the cwnd got reduced in our MPTCP tests. -Neal wrote the packetdrill test above -Yuchung reworked the patch to make it more readable. Cc: Gregory Detal <[email protected]> Cc: Nandita Dukkipati <[email protected]> Tested-by: Neal Cardwell <[email protected]> Reviewed-by: Yuchung Cheng <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Signed-off-by: Sébastien Barré <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-13netlink: eliminate nl_sk_hash_lockYing Xue3-19/+25
As rhashtable_lookup_compare_insert() can guarantee the process of search and insertion is atomic, it's safe to eliminate the nl_sk_hash_lock. After this, object insertion or removal will be protected with per bucket lock on write side while object lookup is guarded with rcu read lock on read side. Signed-off-by: Ying Xue <[email protected]> Cc: Thomas Graf <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12net: allow large number of rx queuesPankaj Gupta1-5/+8
netif_alloc_rx_queues() uses kcalloc() to allocate memory for "struct netdev_queue *_rx" array. If we are doing large rx queue allocation kcalloc() might fail, so this patch does a fallback to vzalloc(). Similar implementation is done for tx queue allocation in netif_alloc_netdev_queues(). We avoid failure of high order memory allocation with the help of vzalloc(), this allows us to do large rx and tx queue allocation which in turn helps us to increase the number of queues in tun. As vmalloc() adds overhead on a critical network path, __GFP_REPEAT flag is used with kzalloc() to do this fallback only when really needed. Signed-off-by: Pankaj Gupta <[email protected]> Reviewed-by: Michael S. Tsirkin <[email protected]> Reviewed-by: David Gibson <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12net: sched: sch_teql: Remove unused functionRickard Strandqvist1-7/+0
Remove the function teql_neigh_release() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12net: xfrm: xfrm_algo: Remove unused functionRickard Strandqvist1-5/+0
Remove the function aead_entries() that is not used anywhere. This was partially found by using a static code analysis program called cppcheck. Signed-off-by: Rickard Strandqvist <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12bridge: new function to pack vlans into ranges during getsRoopa Prabhu1-21/+124
This patch adds new function to pack vlans into ranges whereever applicable using the flags BRIDGE_VLAN_INFO_RANGE_BEGIN and BRIDGE VLAN_INFO_RANGE_END Old vlan packing code is moved to a new function and continues to be called when filter_mask is RTEXT_FILTER_BRVLAN. Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12bridge: support for multiple vlans and vlan ranges in setlink and dellink ↵Roopa Prabhu1-36/+68
requests This patch changes bridge IFLA_AF_SPEC netlink attribute parser to look for more than one IFLA_BRIDGE_VLAN_INFO attribute. This allows userspace to pack more than one vlan in the setlink msg. The dumps were already sending more than one vlan info in the getlink msg. This patch also adds bridge_vlan_info flags BRIDGE_VLAN_INFO_RANGE_BEGIN and BRIDGE_VLAN_INFO_RANGE_END to indicate start and end of vlan range This patch also deletes unused ifla_br_policy. Signed-off-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: make netlink support net namespaceYing Xue1-2/+5
Currently tipc module only allows users sitting on "init_net" namespace to configure it through netlink interface. But now almost each tipc component is able to be aware of net namespace, so it's time to open the permission for users residing in other namespaces, allowing them to configure their own tipc stack instance through netlink interface. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: make tipc random value aware of net namespaceYing Xue4-12/+4
After namespace is supported, each namespace should own its private random value. So the global variable representing the random value must be moved to tipc_net structure. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: make subscriber server support net namespaceYing Xue8-65/+86
TIPC establishes one subscriber server which allows users to subscribe their interesting name service status. After tipc supports namespace, one dedicated tipc stack instance is created for each namespace, and each instance can be deemed as one independent TIPC node. As a result, subscriber server must be built for each namespace. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: make tipc node address support net namespaceYing Xue17-194/+246
If net namespace is supported in tipc, each namespace will be treated as a separate tipc node. Therefore, every namespace must own its private tipc node address. This means the "tipc_own_addr" global variable of node address must be moved to tipc_net structure to satisfy the requirement. It's turned out that users also can assign node address for every namespace. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: name tipc name table support net namespaceYing Xue15-116/+154
TIPC name table is used to store the mapping relationship between TIPC service name and socket port ID. When tipc supports namespace, it allows users to publish service names only owned by a certain namespace. Therefore, every namespace must have its private name table to prevent service names published to one namespace from being contaminated by other service names in another namespace. Therefore, The name table global variable (ie, nametbl) and its lock must be moved to tipc_net structure, and a parameter of namespace must be added for necessary functions so that they can obtain name table variable defined in tipc_net structure. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-01-12tipc: make tipc socket support net namespaceYing Xue6-33/+43
Now tipc socket table is statically allocated as a global variable. Through it, we can look up one socket instance with port ID, insert a new socket instance to the table, and delete a socket from the table. But when tipc supports net namespace, each namespace must own its specific socket table. So the global variable of socket table must be redefined in tipc_net structure. As a concequence, a new socket table will be allocated when a new namespace is created, and a socket table will be deallocated when namespace is destroyed. Signed-off-by: Ying Xue <[email protected]> Tested-by: Tero Aho <[email protected]> Reviewed-by: Jon Maloy <[email protected]> Signed-off-by: David S. Miller <[email protected]>