path: root/net
Age  Commit message  Author  Files  Lines
2015-08-10openvswitch: Move tunnel destroy function to openvswitch module.Pravin B Shelar3-20/+20
This function will be used in gre and geneve vport implementations. Signed-off-by: Pravin B Shelar <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-10net: add explicit logging and stat for neighbour table overflowRick Jones1-4/+10
Add an explicit neighbour table overflow message (ratelimited) and statistic to make diagnosing neighbour table overflows tractable in the wild. Diagnosing a neighbour table overflow can be quite difficult in the wild because there is no explicit dmesg logged. Callers to neighbour code seem to use net_dbg_ratelimit when the neighbour call fails which means the "base message" is not emitted and the callback suppressed messages from the ratelimiting can end-up juxtaposed with unrelated messages. Further, a forced garbage collection will increment a stat on each call whether it was successful in freeing-up a table entry or not, so that statistic is only a hint. So, add a net_info_ratelimited message and explicit statistic to the neighbour code. Signed-off-by: Rick Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-10bridge: netlink: add support for vlan_filtering attributeNikolay Aleksandrov3-7/+32
This patch adds the ability to toggle the vlan filtering support via netlink. Since we're already running with rtnl in .changelink() we don't need to take any additional locks. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-10ipv6: don't reject link-local nexthop on other interfaceFlorian Westphal1-2/+4
48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses") is too strict; it rejects the following corner case: ip -6 route add default via fe80::1:2:3 dev eth1 [ where fe80::1:2:3 is assigned to a local interface, but not eth1 ] Fix this by restricting the search to the given device if the nexthop is link-local. Joint work with Hannes Frederic Sowa. Fixes: 48ed7b26faa7 ("ipv6: reject locally assigned nexthop addresses") Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
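The scoping rule described above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct local_addr`, `is_link_local` and `nexthop_is_local` are hypothetical names, not the kernel's functions, and the table of local addresses stands in for whatever lookup the kernel performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of one address assigned to a local interface. */
struct local_addr {
    uint8_t addr[16];
    int ifindex;      /* interface the address is assigned to */
};

/* True when addr lies in fe80::/10, the IPv6 link-local unicast range. */
bool is_link_local(const uint8_t addr[16])
{
    return addr[0] == 0xfe && (addr[1] & 0xc0) == 0x80;
}

/*
 * Returns true when the nexthop must be rejected because it is one of our
 * own addresses. A link-local nexthop is only compared against addresses
 * on the route's own device; any other nexthop is compared against all.
 */
bool nexthop_is_local(const struct local_addr *tbl, size_t n,
                      const uint8_t nh[16], int route_ifindex)
{
    bool scoped = is_link_local(nh);
    size_t i;

    for (i = 0; i < n; i++) {
        if (scoped && tbl[i].ifindex != route_ifindex)
            continue;   /* same link-local address on another link */
        if (memcmp(tbl[i].addr, nh, 16) == 0)
            return true;
    }
    return false;
}
```

With fe80::1:2:3 assigned only to interface 1, a route via that address on interface 2 is accepted in this sketch, while the same route on interface 1 is still rejected.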
2015-08-10nfsd/sunrpc: factor svc_rqst allocation and freeing from sv_nrthreads refcountingJeff Layton1-18/+36
In later patches, we'll want to be able to allocate and free svc_rqst structures without monkeying with the serv->sv_nrthreads refcount. Factor those pieces out of their respective functions. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10nfsd/sunrpc: move pool_mode definitions into svc.hJeff Layton1-24/+7
In later patches, we're going to need to allow code external to svc.c to figure out what pool_mode is in use. Move these definitions into svc.h to prepare for that. Also, make the svc_pool_map object available and exported so that other modules can peek in there to get insight into what pool mode is in use. Likewise, export svc_pool_map_get/put function to make it safe to do so. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10nfsd/sunrpc: turn enqueueing a svc_xprt into a svc_serv operationJeff Layton1-5/+5
For now, all services use svc_xprt_do_enqueue, but once we add workqueue-based service support, we'll need to do something different. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10nfsd/sunrpc: move sv_module parm into sv_opsJeff Layton1-5/+3
...not technically an operation, but it's more convenient and cleaner to pass the module pointer in this struct. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10nfsd/sunrpc: move sv_function into sv_opsJeff Layton1-5/+3
Since we now have a container for holding svc_serv operations, move the sv_function into it as well. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10nfsd/sunrpc: add a new svc_serv_ops struct and move sv_shutdown into itJeff Layton1-9/+9
In later patches we'll need to abstract out more operations on a per-service level, besides sv_shutdown and sv_function. Declare a new svc_serv_ops struct to hold these operations, and move sv_shutdown into this struct. Signed-off-by: Shirley Ma <[email protected]> Acked-by: Jeff Layton <[email protected]> Tested-by: Shirley Ma <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
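The pattern of collecting per-service hooks in one operations struct can be sketched as follows. The names `service`, `service_ops`, `demo_shutdown`, `demo_function` and `service_destroy` are hypothetical stand-ins chosen for this illustration; the real svc_serv_ops layout is not reproduced here.

```c
#include <stddef.h>

struct service;

/* Hypothetical operations table, loosely modelled on the svc_serv_ops idea. */
struct service_ops {
    void (*shutdown)(struct service *srv);  /* per-service teardown hook */
    int (*function)(void *data);            /* main loop of a service thread */
};

struct service {
    const struct service_ops *ops;  /* one pointer replaces several fields */
    int shut_down;                  /* set by the shutdown hook in this sketch */
};

void demo_shutdown(struct service *srv)
{
    srv->shut_down = 1;
}

int demo_function(void *data)
{
    (void)data;
    return 0;
}

const struct service_ops demo_ops = {
    .shutdown = demo_shutdown,
    .function = demo_function,
};

/* Generic code calls through the table instead of a per-service field. */
void service_destroy(struct service *srv)
{
    if (srv->ops && srv->ops->shutdown)
        srv->ops->shutdown(srv);
}
```

Adding a further operation later then only means adding a member to the table, which matches the motivation given in the commit message.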
2015-08-10svcrdma: Change maximum server payload back to RPCSVC_MAXPAYLOADChuck Lever2-2/+1
Both commit 0380a3f375 ("svcrdma: Add a separate "max data segs" macro for svcrdma") and commit 7e5be28827bf ("svcrdma: advertise the correct max payload") are incorrect. This commit reverts both changes, restoring the server's maximum payload size to 1MB. Commit 7e5be28827bf based the server's maximum payload on the _client's_ RPCRDMA_MAX_DATA_SEGS value. That was wrong. Commit 0380a3f375 tried to fix this so that the client maximum payload size could be raised without affecting the server, but managed to confuse matters more on the server side. More importantly, limiting the advertised maximum payload size was meant to be a workaround, not the actual fix. We need to revisit https://bugzilla.linux-nfs.org/show_bug.cgi?id=270 A Linux client on a platform with 64KB pages can overrun and crash an x86_64 NFS/RDMA server when the r/wsize is 1MB. An x86/64 Linux client seems to work fine using 1MB reads and writes when the Linux server's maximum payload size is restored to 1MB. BugLink: https://bugzilla.linux-nfs.org/show_bug.cgi?id=270 Fixes: 0380a3f375 ("svcrdma: Add a separate "max data segs" macro") Signed-off-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2015-08-10Bluetooth: Enable new connection establishment procedure.Jakub Pawlowski2-5/+8
Currently, when trying to connect to already paired device that just rotated its RPA MAC address, old address would be used and connection would fail. In order to fix that, kernel must scan and receive advertisement with fresh RPA before connecting. This patch enables new connection establishment procedure. Instead of just sending HCI_OP_LE_CREATE_CONN to controller, "connect" will add device to kernel whitelist and start scan. If advertisement is received, it'll be compared against whitelist and then trigger connection if it matches. That fixes mentioned reconnect issue for already paired devices. It also makes the whole connection procedure more robust. We can try to connect to multiple devices at the same time now, even though the controller allows only one. Signed-off-by: Jakub Pawlowski <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10Bluetooth: timeout handling in new connect procedureJakub Pawlowski1-2/+6
Currently, when trying to connect to already paired device that just rotated its RPA MAC address, old address would be used and connection would fail. In order to fix that, kernel must scan and receive advertisement with fresh RPA before connecting. This patch makes sure that when new procedure is in use, and we're stuck in scan phase because no advertisement was received and timeout happened, or app decided to close socket, scan whitelist gets properly cleaned up. Signed-off-by: Jakub Pawlowski <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10Bluetooth: advertisement handling in new connect procedureJakub Pawlowski3-37/+72
Currently, when trying to connect to already paired device that just rotated its RPA MAC address, old address would be used and connection would fail. In order to fix that, kernel must scan and receive advertisement with fresh RPA before connecting. This patch makes sure that after advertisement is received from device that we try to connect to, it is properly handled in check_pending_le_conn and triggers a connect attempt. It also modifies hci_le_connect to make sure that connect attempt will be properly continued. Signed-off-by: Jakub Pawlowski <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10Bluetooth: add hci_connect_le_scanJakub Pawlowski2-0/+207
Currently, when trying to connect to already paired device that just rotated its RPA MAC address, old address would be used and connection would fail. In order to fix that, kernel must scan and receive advertisement with fresh RPA before connecting. This patch adds hci_connect_le_scan with dependencies, new method that will be used to connect to remote LE devices. Instead of just sending connect request, it adds a device to whitelist. Later patches will make use of this whitelist to send connect request when advertisement is received, and properly handle timeouts. Signed-off-by: Jakub Pawlowski <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10Bluetooth: add hci_lookup_le_connectJakub Pawlowski4-10/+7
This patch adds the hci_lookup_le_connect method, which will be used to check whether an outgoing LE connection attempt is in progress. Signed-off-by: Jakub Pawlowski <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10ieee802154: 6lowpan: fix error frag handlingAlexander Aring1-1/+1
This patch fixes the error handling for lowpan_xmit_fragment by replacing "-PTR_ERR" with "PTR_ERR". PTR_ERR already returns a negative errno code. Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
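Why the extra minus sign is wrong can be shown with a user-space imitation of the error-pointer convention. The lowercase helpers `err_ptr`, `ptr_err` and `is_err` below are stand-ins written for this sketch, and `alloc_frag` and the two `xmit_frag_*` functions are hypothetical; none of this is the 6LoWPAN code itself.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer value (user-space stand-in). */
void *err_ptr(intptr_t error)
{
    return (void *)error;
}

/* Decode it again: the result is already the negative errno. */
intptr_t ptr_err(const void *ptr)
{
    return (intptr_t)ptr;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
bool is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char frame_buf[16];

/* Hypothetical allocator that can be told to fail. */
void *alloc_frag(bool fail)
{
    return fail ? err_ptr(-ENOMEM) : (void *)frame_buf;
}

/* Buggy variant: negating turns -ENOMEM into a positive value. */
int xmit_frag_buggy(bool fail)
{
    void *frag = alloc_frag(fail);

    if (is_err(frag))
        return (int)-ptr_err(frag);
    return 0;
}

/* Fixed variant: pass the negative errno through unchanged. */
int xmit_frag_fixed(bool fail)
{
    void *frag = alloc_frag(fail);

    if (is_err(frag))
        return (int)ptr_err(frag);
    return 0;
}
```

A caller that tests for `ret < 0` would miss the failure reported by the buggy variant, since it returns a positive number.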
2015-08-10ieee802154: add ack request default handlingAlexander Aring5-1/+77
This patch introduces a new mib entry which isn't part of 802.15.4 but is useful as default behaviour for setting the ack request bit when we don't know whether the ack request bit should be set. This is currently used for stacks like IEEE 802.15.4 6LoWPAN. Reviewed-by: Stefan Schmidt <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10mac802154: change frame_retries behaviourAlexander Aring2-8/+4
This patch changes the default minimum value of frame_retries to 0 and changes the frame_retries default value to 3, which is also the 802.15.4 default. We no longer use the frame_retries "-1" value as an indicator for no-aret mode; instead we check the ack request bit inside the 802.15.4 frame control field. This allows acknowledge handling per frame. This check is done by the transceiver or inside the xmit callback of the driver layer. If a transceiver doesn't support ARET handling, the transmit functionality then ignores ack frames, which isn't ideal but should not affect any current functionality. Reviewed-by: Stefan Schmidt <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10mac802154: cfg: remove test and set checksAlexander Aring1-16/+0
This patch removes several checks of whether a value has really changed. Such checks only make sense if we have another layer call, e.g. calling the driver_ops, which is done by callbacks like "set_channel". MAC settings which need to be set by phy registers (if the phy supports that handling) are currently applied by bringing the interface up and are not direct driver_ops calls, so we remove the checks from these configuration callbacks. Reviewed-by: Stefan Schmidt <[email protected]> Suggested-by: Phoebe Buckheister <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10mac802154: fix wpan mac setting while lowpan is thereAlexander Aring1-0/+15
If we currently change the mac address inside the wpan interface while we have a lowpan interface on top of the wpan interface, the mac address setting doesn't reach the lowpan interface. The effect would be that the IPv6 lowpan interface has the old SLAAC address and isn't working anymore, because the lowpan interface sometimes uses dev->addr in its internal mechanisms, which is the old mac address of the wpan interface. This patch checks whether a wpan interface belongs to a lowpan interface; if yes, we need to check that the lowpan interface is down and also change the mac address at the lowpan interface. When the lowpan interface is brought up afterwards, it will use the correct SLAAC address, which is based on the updated mac address setting. Reviewed-by: Stefan Schmidt <[email protected]> Tested-by: Stefan Schmidt <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10ieee802154: 6lowpan: remove multiple lowpan per wpan supportAlexander Aring3-86/+27
We currently support multiple lowpan interfaces per wpan interface. I never saw any use case for such functionality. We drop this feature now because it makes it much easier to deal with address changes inside the underlying wpan interface. This patch removes the multiple lowpan interface support and adds a lowpan_dev netdev pointer into the wpan_dev; if this pointer isn't null, the wpan interface belongs to the assigned lowpan interface. Reviewed-by: Stefan Schmidt <[email protected]> Tested-by: Stefan Schmidt <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-106lowpan: Fix extraction of flow label fieldLukasz Duda1-1/+1
The lowpan_fetch_skb function is used to fetch the first byte, which also increments the data pointer in skb structure, making subsequent array lookup of byte 0 actually being byte 1. To decompress the first byte of the Flow Label when the TF flag is set to 0x01, the second half of the first byte is needed. The patch fixes the extraction of the Flow Label field. Acked-by: Jukka Rissanen <[email protected]> Signed-off-by: Lukasz Duda <[email protected]> Signed-off-by: Glenn Ruben Bakke <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-08-10Bluetooth: Fix breakage in amp_write_rem_assoc_frag()Dan Carpenter1-1/+1
We should be passing the pointer itself instead of the address of the pointer. This was a copy and paste bug when we replaced the calls to hci_send_cmd(). Originally, the arguments were "len, cp" but we overwrote them with "sizeof(cp), &cp" by mistake. Fixes: b3d3914006a0 ('Bluetooth: Move amp assoc read/write completed callback to amp.c') Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
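The difference between the two argument pairs can be made visible with a small recording stub. This is a sketch under assumptions: `send_cmd` below is a hypothetical stand-in that only records what it was given, and it does not imitate the signature of any real HCI function.

```c
#include <stddef.h>
#include <stdlib.h>

/* What the stub was last asked to send. */
struct recorded {
    size_t plen;
    const void *param;
};

static struct recorded last;

/* Hypothetical command sender: records the length and parameter pointer. */
void send_cmd(size_t plen, const void *param)
{
    last.plen = plen;
    last.param = param;
}

/* Buggy call: sends the size and address of the pointer variable itself. */
size_t send_frag_buggy(size_t len)
{
    unsigned char *cp = calloc(1, len);

    if (!cp)
        return 0;
    send_cmd(sizeof(cp), &cp);   /* pointer-sized, points at the local */
    free(cp);
    return last.plen;
}

/* Fixed call: sends the buffer and its real length. */
size_t send_frag_fixed(size_t len)
{
    unsigned char *cp = calloc(1, len);

    if (!cp)
        return 0;
    send_cmd(len, cp);           /* the buffer itself, full length */
    free(cp);
    return last.plen;
}
```

In the buggy form the receiver would see only a handful of bytes holding a pointer value rather than the command payload.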
2015-08-10netlink: make sure -EBUSY won't escape from netlink_insertDaniel Borkmann1-0/+5
Linus reports the following deadlock on rtnl_mutex; triggered only once so far (extract):
  [12236.694209] NetworkManager D 0000000000013b80 0 1047 1 0x00000000
  [12236.694218] ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
  [12236.694224] ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
  [12236.694230] ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
  [12236.694235] Call Trace:
  [12236.694250] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
  [12236.694257] [<ffffffff815d133a>] ? schedule+0x2a/0x70
  [12236.694263] [<ffffffff815d15a9>] ? schedule_preempt_disabled+0x9/0x10
  [12236.694271] [<ffffffff815d2c3f>] ? __mutex_lock_slowpath+0x7f/0xf0
  [12236.694280] [<ffffffff815d2cc6>] ? mutex_lock+0x16/0x30
  [12236.694291] [<ffffffff814f1f90>] ? rtnetlink_rcv+0x10/0x30
  [12236.694299] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
  [12236.694309] [<ffffffff814f5ad3>] ? rtnl_getlink+0x113/0x190
  [12236.694319] [<ffffffff814f202a>] ? rtnetlink_rcv_msg+0x7a/0x210
  [12236.694331] [<ffffffff8124565c>] ? sock_has_perm+0x5c/0x70
  [12236.694339] [<ffffffff814f1fb0>] ? rtnetlink_rcv+0x30/0x30
  [12236.694346] [<ffffffff8150d62c>] ? netlink_rcv_skb+0x9c/0xc0
  [12236.694354] [<ffffffff814f1f9f>] ? rtnetlink_rcv+0x1f/0x30
  [12236.694360] [<ffffffff8150ce3b>] ? netlink_unicast+0xfb/0x180
  [12236.694367] [<ffffffff8150d344>] ? netlink_sendmsg+0x484/0x5d0
  [12236.694376] [<ffffffff810a236f>] ? __wake_up+0x2f/0x50
  [12236.694387] [<ffffffff814cad23>] ? sock_sendmsg+0x33/0x40
  [12236.694396] [<ffffffff814cb05e>] ? ___sys_sendmsg+0x22e/0x240
  [12236.694405] [<ffffffff814cab75>] ? ___sys_recvmsg+0x135/0x1a0
  [12236.694415] [<ffffffff811a9d12>] ? eventfd_write+0x82/0x210
  [12236.694423] [<ffffffff811a0f9e>] ? fsnotify+0x32e/0x4c0
  [12236.694429] [<ffffffff8108cb70>] ? wake_up_q+0x60/0x60
  [12236.694434] [<ffffffff814cba09>] ? __sys_sendmsg+0x39/0x70
  [12236.694440] [<ffffffff815d4797>] ? entry_SYSCALL_64_fastpath+0x12/0x6a
It seems so far plausible that the recursive call into rtnetlink_rcv() looks suspicious. One way this could trigger is that the sender's NETLINK_CB(skb).portid was wrongly 0 (which is the rtnetlink socket), so the rtnl_getlink() request's answer would be sent to the kernel instead of to the actual user process, thus grabbing rtnl_mutex() twice.
One theory would be that netlink_autobind() triggered via netlink_sendmsg() internally overwrites the -EBUSY error to 0, but where it is wrongly originating from __netlink_insert() instead. That would reset the socket's portid to 0, which is then filled into NETLINK_CB(skb).portid later on. As commit d470e3b483dc ("[NETLINK]: Fix two socket hashing bugs.") also puts it, -EBUSY should not be propagated from netlink_insert().
It looks like it's very unlikely to reproduce. We need to trigger the rhashtable_insert_rehash() handler under a situation where rehashing currently occurs (one /rare/ way would be to hit ht->elasticity limits while not filled enough to expand the hashtable, but that would rather require a specifically crafted bind() sequence with knowledge about destination slots, seems unlikely). It probably makes sense to guard __netlink_insert() in any case and remap that error. It was suggested that EOVERFLOW might be better than an already overloaded ENOMEM.
Reference: http://thread.gmane.org/gmane.linux.network/372676 Reported-by: Linus Torvalds <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Herbert Xu <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
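The remapping idea can be sketched without any netlink code. Both functions below are hypothetical: `insert_guarded` stands for an insert wrapper that receives the hash table backend's result, and `autobind_result` stands for a caller that, per the theory in the message, treats -EBUSY as harmless and turns it into 0.

```c
#include <errno.h>

/*
 * Hypothetical insert wrapper: backend_err is whatever the hash table
 * backend returned. -EBUSY from the backend is remapped so that it can
 * never be confused with the caller's own meaning of -EBUSY.
 */
int insert_guarded(int backend_err)
{
    if (backend_err == -EBUSY)
        return -EOVERFLOW;
    return backend_err;
}

/*
 * Hypothetical autobind-style caller: it treats -EBUSY as harmless and
 * reports success, so a backend -EBUSY must not reach it unchanged.
 */
int autobind_result(int insert_err)
{
    if (insert_err == -EBUSY)
        return 0;
    return insert_err;
}
```

Without the guard, `autobind_result(-EBUSY)` reports success although nothing was inserted; with it, the failure stays visible as -EOVERFLOW.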
2015-08-10netfilter: SYNPROXY: fix sending window update to clientPhil Sutter2-2/+4
Upon receipt of SYNACK from the server, ipt_SYNPROXY first sends back an ACK to finish the server handshake, then calls nf_ct_seqadj_init() to initiate sequence number adjustment of forwarded packets to the client and finally sends a window update to the client to unblock its TX queue. Since synproxy_send_client_ack() does not set synproxy_send_tcp()'s nfct parameter, no sequence number adjustment happens and the client receives the window update with an incorrect sequence number. Depending on the client TCP implementation, this leads to a significant delay (until a window probe is sent). Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-10netfilter: ip6t_SYNPROXY: fix NULL pointer dereferencePhil Sutter1-8/+10
This happens when networking namespaces are enabled. Suggested-by: Patrick McHardy <[email protected]> Signed-off-by: Phil Sutter <[email protected]> Acked-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-09net: ethernet: Fix double word "the the" in eth.cMasanari Iida1-1/+1
This patch fixes the doubled word "the the" in Documentation/DocBook/networking/API-eth-get-headlen.html Documentation/DocBook/networking/netdev.html Documentation/DocBook/networking.xml These files are generated from a comment in the source, so I have to fix the comment in net/ethernet/eth.c. Signed-off-by: Masanari Iida <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09mpls: Enforce payload type of traffic sent using explicit NULLRobert Shearman1-27/+44
RFC 4182 s2 states that if an IPv4 Explicit NULL label is the only label on the stack, then after popping the resulting packet must be treated as a IPv4 packet and forwarded based on the IPv4 header. The same is true for IPv6 Explicit NULL with an IPv6 packet following. Therefore, when installing the IPv4/IPv6 Explicit NULL label routes, add an attribute that specifies the expected payload type for use at forwarding time for determining the type of the encapsulated packet instead of inspecting the first nibble of the packet. Signed-off-by: Robert Shearman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
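A rough sketch of the decision described above: the route can carry an expected payload type, and only when none is set does the code fall back to the first nibble of the packet. The enum, the two functions and their names are hypothetical choices for this illustration; the label values 0 and 2 are the reserved IPv4 and IPv6 Explicit NULL labels.

```c
#include <stdint.h>

/* Hypothetical payload types; UNSPEC means "inspect the packet". */
enum payload_type {
    PAYLOAD_UNSPEC = 0,
    PAYLOAD_IPV4 = 4,
    PAYLOAD_IPV6 = 6
};

#define LABEL_IPV4_EXPLICIT_NULL 0u
#define LABEL_IPV6_EXPLICIT_NULL 2u

/* Payload type to attach to a route when it is installed for a label. */
enum payload_type payload_for_label(uint32_t label)
{
    switch (label) {
    case LABEL_IPV4_EXPLICIT_NULL:
        return PAYLOAD_IPV4;
    case LABEL_IPV6_EXPLICIT_NULL:
        return PAYLOAD_IPV6;
    default:
        return PAYLOAD_UNSPEC;
    }
}

/*
 * Forwarding-time decision: the route attribute wins; only an
 * unspecified route falls back to the version nibble of the payload.
 */
enum payload_type resolve_payload(enum payload_type route_type,
                                  uint8_t first_byte)
{
    if (route_type != PAYLOAD_UNSPEC)
        return route_type;

    switch (first_byte >> 4) {
    case 4:
        return PAYLOAD_IPV4;
    case 6:
        return PAYLOAD_IPV6;
    default:
        return PAYLOAD_UNSPEC;
    }
}
```

In this sketch, a packet arriving with the IPv4 Explicit NULL label is treated as IPv4 even if its first nibble happens to be 6.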
2015-08-09net: dsa: add support for switchdev FDB objectsVivien Didelot1-102/+116
Remove the fdb_{add,del,getnext} function pointer in favor of new port_fdb_{add,del,getnext}. Implement the switchdev_port_obj_{add,del,dump} functions in DSA to support the SWITCHDEV_OBJ_PORT_FDB objects. Signed-off-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09net: switchdev: support static FDB addressesVivien Didelot1-1/+1
This patch adds an is_static boolean to the switchdev_obj_fdb structure, in order to set the ndm_state to either NUD_NOARP or NUD_REACHABLE. Signed-off-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09net: switchdev: change fdb addr for a byte arrayVivien Didelot2-3/+4
The address in the switchdev_obj_fdb structure is currently represented as a pointer. Replacing it with a 6-byte array allows switchdev to carry addresses directly read from hardware registers, not stored by the switch chip driver (as in Rocker). Signed-off-by: Vivien Didelot <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09net:wimax: Fix double word "the the" in networking.xmlMasanari Iida1-2/+1
This patch fixes the doubled word "the the" in Documentation/DocBook/networking.xml and Documentation/DocBook/networking/API-Wimax-report-rfkill-sw.html. These files are generated from a comment in the source, so I had to fix the typo in net/wimax/io-rfkill.c Signed-off-by: Masanari Iida <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-07net: Fix race condition in store_rps_mapTom Herbert1-4/+7
There is a race condition in store_rps_map that allows the jump label count in rps_needed to go below zero. This can happen when concurrently attempting to set and clear a map.
Scenario:
1. rps_needed count is zero
2. New map is assigned by the setting thread, but rps_needed count is _not_ yet incremented (rps_needed count still zero)
3. Map is cleared by a second thread, old_map set to the map just assigned
4. Second thread performs static_key_slow_dec, rps_needed count now goes negative
Fix is to increment or decrement rps_needed under the spinlock. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
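The fix can be pictured as keeping the pointer swap and the counter update inside one critical section. The sketch below uses a POSIX mutex and a plain integer in place of the kernel's spinlock and static key; `current_map`, `needed_count` and `store_map` are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t map_lock = PTHREAD_MUTEX_INITIALIZER;
static void *current_map;   /* stand-in for the per-queue map pointer */
static int needed_count;    /* stand-in for the rps_needed count */

/*
 * Install new_map (or clear the map when new_map is NULL). The counter is
 * adjusted while the lock is still held, so another thread can never see
 * a newly assigned map whose increment has not happened yet.
 */
void store_map(void *new_map)
{
    void *old_map;

    pthread_mutex_lock(&map_lock);
    old_map = current_map;
    current_map = new_map;
    if (new_map)
        needed_count++;
    if (old_map)
        needed_count--;
    pthread_mutex_unlock(&map_lock);
}

int needed(void)
{
    int n;

    pthread_mutex_lock(&map_lock);
    n = needed_count;
    pthread_mutex_unlock(&map_lock);
    return n;
}
```

Because every increment is paired with its assignment under the lock, a later clear can only decrement a count that was already raised for the map it removes.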
2015-08-07Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller4-10/+42
Antonio Quartulli says:
====================
Included changes:
- prevent DAT from replying on behalf of local clients and confuse L2 bridges
- fix crash on double list removal of TT objects (tt_local_entry)
- fix crash due to missing NULL checks
- initialize bw values for new GWs objects to prevent memory leak
====================
Signed-off-by: David S. Miller <[email protected]>
2015-08-07openvswitch: Make 100 percent of packets sampled when sampling rate is 1.Wenyu Zhang1-1/+4
When the sampling rate is 1, the sampling probability is UINT32_MAX. The packet should be sampled even when prandom32() generates UINT32_MAX. And no packet needs to be sampled when the probability is 0. Signed-off-by: Wenyu Zhang <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
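The two boundary cases can be written down as a small predicate. This is a sketch, not the openvswitch code: the random value is passed in as a parameter so the behaviour is deterministic, and `should_sample_old` is only an assumed form of a comparison that would miss the UINT32_MAX case.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed shape of a comparison with the boundary problem: when both the
 * probability and the random value are UINT32_MAX, nothing is sampled.
 */
bool should_sample_old(uint32_t probability, uint32_t random_value)
{
    return random_value < probability;
}

/*
 * Sketch of the intended behaviour: probability UINT32_MAX always
 * samples, probability 0 never samples, anything else compares against
 * the random value.
 */
bool should_sample(uint32_t probability, uint32_t random_value)
{
    if (probability == UINT32_MAX)
        return true;
    if (probability == 0)
        return false;
    return random_value <= probability;
}
```

Passing the random number in also makes the edge cases easy to exercise directly.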
2015-08-07vxlan: combine VXLAN_FLOWBASED into VXLAN_COLLECT_METADATAAlexei Starovoitov1-1/+1
IFLA_VXLAN_FLOWBASED is useless without IFLA_VXLAN_COLLECT_METADATA, so combine them into single IFLA_VXLAN_COLLECT_METADATA flag. 'flowbased' doesn't convey real meaning of the vxlan tunnel mode. This mode can be used by routing, tc+bpf and ovs. Only ovs is strictly flow based, so 'collect metadata' is a better name for this tunnel mode. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-07RDS-TCP: Support multiple RDS-TCP listen endpoints, one per netns.Sowmini Varadhan4-50/+162
Register pernet subsys init/stop functions that will set up and tear down per-net RDS-TCP listen endpoints. Unregister pernet subsys functions on 'modprobe -r' to clean up these end points. Enable keepalive on both accept and connect socket endpoints. The keepalive timer expiration will ensure that client socket endpoints will be removed as appropriate from the netns when an interface is removed from a namespace. Register a device notifier callback that will clean up all sockets (and thus avoid the need to wait for keepalive timeout) when the loopback device is unregistered from the netns indicating that the netns is getting deleted. Signed-off-by: Sowmini Varadhan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-07RDS-TCP: Make RDS-TCP work correctly when it is set up in a netns other than init_netSowmini Varadhan12-27/+59
Open the sockets calling sock_create_kern() with the correct struct net pointer, and use that struct net pointer when verifying the address passed to rds_bind(). Signed-off-by: Sowmini Varadhan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-07netfilter: nfacct: per network namespace supportAndreas Schultz2-22/+51
- Move the nfnl_acct_list into the network namespace, initialize and destroy it per namespace
- Keep track of refcnt on nfacct objects, the old logic no longer works with a per namespace list
- Adjust xt_nfacct to pass the namespace when registering objects
Signed-off-by: Andreas Schultz <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_limit: add per-byte limitingPablo Neira Ayuso1-4/+59
This patch adds a new NFTA_LIMIT_TYPE netlink attribute to indicate the type of limiting. Contrary to per-packet limiting, the cost is calculated from the packet path since this depends on the packet length. The burst attribute indicates the number of bytes in which the rate can be exceeded. Signed-off-by: Pablo Neira Ayuso <[email protected]>
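The contrast between the two cost models can be sketched with two small helpers. Both are hypothetical and assume that credit is measured in nanoseconds of the rate's time unit: a per-packet cost depends only on the configured rate and can be computed once, while a per-byte cost needs the packet length and so has to be computed per packet.

```c
#include <stdint.h>

/*
 * Per-packet limiting: the cost of one packet is constant for a given
 * rate, so it can be computed once when the rule is set up.
 * rate_pkts must be non-zero.
 */
uint64_t cost_per_packet(uint64_t rate_pkts, uint64_t unit_ns)
{
    return unit_ns / rate_pkts;
}

/*
 * Per-byte limiting: the cost scales with the packet length, so it is
 * computed for each packet. rate_bytes must be non-zero.
 */
uint64_t cost_for_bytes(uint64_t len, uint64_t rate_bytes, uint64_t unit_ns)
{
    return unit_ns * len / rate_bytes;
}
```

For example, at an assumed rate of 1,000,000 bytes per second a 1500-byte packet costs 1,500,000 ns of credit, whereas at 10 packets per second every packet costs 100,000,000 ns regardless of size.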
2015-08-07netfilter: nft_limit: constant token cost per packetPablo Neira Ayuso1-7/+18
The cost per packet can be calculated from the control plane path since this doesn't ever change. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_limit: add burst parameterPablo Neira Ayuso1-2/+18
This patch adds the burst parameter. This burst indicates the number of packets that can exceed the limit. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_limit: factor out shared code with per-byte limitingPablo Neira Ayuso1-33/+53
This patch prepares the introduction of per-byte limiting. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_limit: convert to token-based limiting at nanosecond granularityPablo Neira Ayuso1-16/+26
Rework the limit expression to use a token-based limiting approach that refills the bucket gradually. The tokens are calculated at nanosecond granularity instead of jiffies to improve precision. Signed-off-by: Pablo Neira Ayuso <[email protected]>
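A token bucket that refills gradually at nanosecond granularity can be sketched as below. This is a generic illustration, not the nft_limit code: `struct limit`, `limit_init` and `limit_allow` are hypothetical, tokens are counted directly in elapsed nanoseconds, and the caller supplies the timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bucket state; one token is one nanosecond of credit. */
struct limit {
    uint64_t tokens;      /* credit currently available */
    uint64_t tokens_max;  /* bucket capacity */
    uint64_t cost;        /* credit consumed per packet */
    uint64_t last_ns;     /* time of the previous check */
};

/* rate_pkts per unit_ns, bucket sized for burst_pkts packets; starts full. */
void limit_init(struct limit *l, uint64_t rate_pkts, uint64_t unit_ns,
                uint64_t burst_pkts, uint64_t now_ns)
{
    l->cost = unit_ns / rate_pkts;
    l->tokens_max = l->cost * burst_pkts;
    l->tokens = l->tokens_max;
    l->last_ns = now_ns;
}

/* Returns true when the packet is within the limit. */
bool limit_allow(struct limit *l, uint64_t now_ns)
{
    /* Refill by the elapsed time, capped at the bucket capacity. */
    uint64_t tokens = l->tokens + (now_ns - l->last_ns);

    if (tokens > l->tokens_max)
        tokens = l->tokens_max;
    l->last_ns = now_ns;

    if (tokens < l->cost) {
        l->tokens = tokens;   /* keep the partial refill */
        return false;
    }
    l->tokens = tokens - l->cost;
    return true;
}
```

Counting in nanoseconds means a partial refill between two closely spaced packets is not lost to rounding, which is the precision gain the message refers to.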
2015-08-07netfilter: nft_limit: rename to nft_limit_pktsPablo Neira Ayuso1-6/+6
To prepare introduction of bytes ratelimit support. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nf_tables: add nft_dup expressionPablo Neira Ayuso8-2/+234
This new expression uses the nf_dup engine to clone packets to a given gateway. Unlike xt_TEE, we use an index to indicate output interface which should be fine at this stage. Moreover, change to the preemption-safe this_cpu_read(nf_skb_duplicated) from nf_dup_ipv{4,6} to silence a lockdep splat. Based on the original tee expression from Arturo Borrero Gonzalez, although this patch has diverged quite a bit from this initial effort due to the change to support maps. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: factor out packet duplication for IPv4/IPv6Pablo Neira Ayuso8-152/+240
Extracted from the xtables TEE target. This creates two new modules for IPv4 and IPv6 that are shared between the TEE target and the new nf_tables dup expressions. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: xt_TEE: get rid of WITH_CONNTRACK definitionPablo Neira Ayuso1-5/+3
Use IS_ENABLED(CONFIG_NF_CONNTRACK) instead. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_counter: convert it to use per-cpu countersPablo Neira Ayuso1-28/+69
This patch converts the existing seqlock to per-cpu counters. Suggested-by: Eric Dumazet <[email protected]> Suggested-by: Patrick McHardy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
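The general idea behind per-cpu counters is that each CPU updates only its own slot and a reader sums all slots. The sketch below shows that layout in plain C under assumptions: `NR_SLOTS`, `struct percpu_counter`, `counter_add` and `counter_fetch` are hypothetical, the CPU index is passed in explicitly, and the synchronization the kernel needs for consistent 64-bit reads is not modelled.

```c
#include <stdint.h>

#define NR_SLOTS 4   /* assumed number of CPUs for this sketch */

struct counter {
    uint64_t packets;
    uint64_t bytes;
};

/* One slot per CPU; a writer only ever touches its own slot. */
struct percpu_counter {
    struct counter slot[NR_SLOTS];
};

/* Account one packet of the given size on the given CPU. */
void counter_add(struct percpu_counter *c, unsigned int cpu, uint64_t bytes)
{
    c->slot[cpu % NR_SLOTS].packets++;
    c->slot[cpu % NR_SLOTS].bytes += bytes;
}

/* Sum all slots to obtain the totals reported to the reader. */
struct counter counter_fetch(const struct percpu_counter *c)
{
    struct counter total = { 0, 0 };
    unsigned int i;

    for (i = 0; i < NR_SLOTS; i++) {
        total.packets += c->slot[i].packets;
        total.bytes += c->slot[i].bytes;
    }
    return total;
}
```

The update path touches no shared state, so the cost of contention moves to the comparatively rare read path.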