aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-08-21netfilter: xt_TEE: use IS_ENABLED(CONFIG_NF_DUP_IPV6)Pablo Neira Ayuso1-2/+2
Instead of IS_ENABLED(CONFIG_IPV6), otherwise we hit: et/built-in.o: In function `tee_tg6': >> xt_TEE.c:(.text+0x6cd8c): undefined reference to `nf_dup_ipv6' when: CONFIG_IPV6=y CONFIG_NF_DUP_IPV4=y # CONFIG_NF_DUP_IPV6 is not set CONFIG_NETFILTER_XT_TARGET_TEE=y Reported-by: kbuild test robot <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-21netfilter: nf_dup: fix sparse warningsPablo Neira Ayuso2-3/+3
>> net/ipv4/netfilter/nft_dup_ipv4.c:29:37: sparse: incorrect type in initializer (different base types) net/ipv4/netfilter/nft_dup_ipv4.c:29:37: expected restricted __be32 [user type] s_addr net/ipv4/netfilter/nft_dup_ipv4.c:29:37: got unsigned int [unsigned] <noident> >> net/ipv6/netfilter/nf_dup_ipv6.c:48:23: sparse: incorrect type in assignment (different base types) net/ipv6/netfilter/nf_dup_ipv6.c:48:23: expected restricted __be32 [addressable] [assigned] [usertype] flowlabel net/ipv6/netfilter/nf_dup_ipv6.c:48:23: got int Reported-by: kbuild test robot <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller11-89/+129
Conflicts: drivers/net/usb/qmi_wwan.c Overlapping additions of new device IDs to qmi_wwan.c Signed-off-by: David S. Miller <[email protected]>
2015-08-21ipvs: add more mcast parameters for the sync daemonJulian Anastasov2-24/+164
- mcast_group: configure the multicast address, now IPv6 is supported too - mcast_port: configure the multicast port - mcast_ttl: configure the multicast TTL/HOP_LIMIT Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-21ipvs: add sync_maxlen parameter for the sync daemonJulian Anastasov2-94/+96
Allow setups with large MTU to send large sync packets by adding sync_maxlen parameter. The default value is now based on MTU but no more than 1500 for compatibility reasons. To avoid problems if MTU changes allow fragmentation by sending packets with DF=0. Problem reported by Dan Carpenter. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-21ipvs: call rtnl_lock earlyJulian Anastasov2-19/+33
When the sync damon is started we need to hold rtnl lock while calling ip_mc_join_group. Currently, we have a wrong locking order because the correct one is rtnl_lock->__ip_vs_mutex. It is implied from the usage of __ip_vs_mutex in ip_vs_dst_event() which is called under rtnl lock during NETDEV_* notifications. Fix the problem by calling rtnl_lock early only for the start_sync_thread call. As a bonus this fixes the usage __dev_get_by_name which was not called under rtnl lock. This patch actually extends and depends on commit 54ff9ef36bdf ("ipv4, ipv6: kill ip_mc_{join, leave}_group and ipv6_sock_mc_{join, drop}"). Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-21ipvs: Add ovf schedulerRaducu Deaconu3-0/+98
The weighted overflow scheduling algorithm directs network connections to the server with the highest weight that is currently available and overflows to the next when active connections exceed the node's weight. Signed-off-by: Raducu Deaconu <[email protected]> Acked-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller32-433/+1152
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next This is second pull request includes the conflict resolution patch that resulted from the updates that we got for the conntrack template through kmalloc. No changes with regards to the previously sent 15 patches. The following patchset contains Netfilter updates for your net-next tree, they are: 1) Rework the existing nf_tables counter expression to make it per-cpu. 2) Prepare and factor out common packet duplication code from the TEE target so it can be reused from the new dup expression. 3) Add the new dup expression for the nf_tables IPv4 and IPv6 families. 4) Convert the nf_tables limit expression to use a token-based approach with 64-bits precision. 5) Enhance the nf_tables limit expression to support limiting at packet byte. This comes after several preparation patches. 6) Add a burst parameter to indicate the amount of packets or bytes that can exceed the limiting. 7) Add netns support to nfacct, from Andreas Schultz. 8) Pass the nf_conn_zone structure instead of the zone ID in nf_tables to allow accessing more zone specific information, from Daniel Borkmann. 9) Allow to define zone per-direction to support netns containers with overlapping network addressing, also from Daniel. 10) Extend the CT target to allow setting the zone based on the skb->mark as a way to support simple mappings from iptables, also from Daniel. 11) Make the nf_tables payload expression aware of the fact that VLAN offload may have removed a vlan header, from Florian Westphal. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-08-21Merge branch 'master' of ↵Pablo Neira Ayuso176-2008/+4138
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Resolve conflicts with conntrack template fixes. Conflicts: net/netfilter/nf_conntrack_core.c net/netfilter/nf_synproxy_core.c net/netfilter/xt_CT.c Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-20ipv6: route: per route IP tunnel metadata via lightweight tunnelJiri Benc1-0/+102
Allow specification of per route IP tunnel instructions also for IPv6. This complements commit 3093fbe7ff4b ("route: Per route IP tunnel metadata via lightweight tunnel"). Signed-off-by: Jiri Benc <[email protected]> CC: YOSHIFUJI Hideaki <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ipv6: route: extend flow representation with tunnel keyJiri Benc1-0/+6
Use flowi_tunnel in flowi6 similarly to what is done with IPv4. This complements commit 1b7179d3adff ("route: Extend flow representation with tunnel key"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ipv6: ndisc: inherit metadata dst when creating ndisc requestsJiri Benc3-5/+9
If output device wants to see the dst, inherit the dst of the original skb in the ndisc request. This is an IPv6 counterpart of commit 0accfc268f4d ("arp: Inherit metadata dst when creating ARP requests"). Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ipv6: drop metadata dst in ip6_route_inputJiri Benc1-0/+1
The fix in commit 48fb6b554501 is incomplete, as now ip6_route_input can be called with non-NULL dst if it's a metadata dst and the reference is leaked. Drop the reference. Fixes: 48fb6b554501 ("ipv6: fix crash over flow-based vxlan device") Fixes: ee122c79d422 ("vxlan: Flow based tunneling") CC: Wei-Chun Chao <[email protected]> CC: Thomas Graf <[email protected]> Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20route: move lwtunnel state to dst_entryJiri Benc10-102/+39
Currently, the lwtunnel state resides in per-protocol data. This is a problem if we encapsulate ipv6 traffic in an ipv4 tunnel (or vice versa). The xmit function of the tunnel does not know whether the packet has been routed to it by ipv4 or ipv6, yet it needs the lwtstate data. Moving the lwtstate data to dst_entry makes such inter-protocol tunneling possible. As a bonus, this brings a nice diffstat. Signed-off-by: Jiri Benc <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ip_tunnels: use tos and ttl fields also for IPv6Jiri Benc6-18/+18
Rename the ipv4_tos and ipv4_ttl fields to just 'tos' and 'ttl', as they'll be used with IPv6 tunnels, too. Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ip_tunnels: add IPv6 addresses to ip_tunnel_keyJiri Benc8-25/+25
Add the IPv6 addresses as an union with IPv4 ones. When using IPv4, the newly introduced padding after the IPv4 addresses needs to be zeroed out. Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20net: Fix nexthop lookupsDavid Ahern1-1/+8
Andreas reported breakage adding routes with local nexthops: $ ip route show table main ... 172.28.0.0/24 dev vnf-xe1p0 proto kernel scope link src 172.28.0.16 $ ip route add 10.0.0.0/8 via 172.28.0.32 table 100 dev vnf-xe1p0 RTNETLINK answers: Resource temporarily unavailable 3bfd847203c changed the lookup to use the passed in table but for cases like this the nexthop is in the local table rather than the passed in table. Fixes: 3bfd847203c ("net: Use passed in table for nexthop lookups") Reported-by: Andreas Schultz <[email protected]> Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20bridge: fix netlink max attr sizeScott Feldman1-1/+1
.maxtype should match .policy. Probably just been getting lucky here because IFLA_BRPORT_MAX > IFLA_BR_MAX. Fixes: 13323516 ("bridge: implement rtnl_link_ops->changelink") Signed-off-by: Scott Feldman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20ipv4: Make fib_encap_match staticYing Xue1-3/+3
Make fib_encap_match() static as it isn't used outside the file. Signed-off-by: Ying Xue <[email protected]> Reviewed-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20nfc: netlink: Add capability to reply to vendor_cmd with dataChristophe Ricard1-2/+84
A proprietary vendor command may send back useful data to the user application. For example, the field level applied on the NFC router antenna. Still based on net/wireless/nl80211.c implementation, add nfc_vendor_cmd_alloc_reply_skb and nfc_vendor_cmd_reply in order to send back over netlink data generated by a proprietary command. Signed-off-by: Christophe Ricard <[email protected]> Signed-off-by: Samuel Ortiz <[email protected]>
2015-08-20nfc: nci: hci: Add check on skb nci_hci_send_cmd parameterChristophe Ricard1-1/+1
skb can be NULL and may lead to a NULL pointer error. Add a check condition before setting HCI rx buffer. Cc: [email protected] Signed-off-by: Christophe Ricard <[email protected]> Signed-off-by: Samuel Ortiz <[email protected]>
2015-08-20NFC: nci: export nci_core_reset and nci_core_initRobert Baldyga1-0/+14
Some drivers needs to have ability to reinit NCI core, for example after updating firmware in setup() of post_setup() callback. This patch makes nci_core_reset() and nci_core_init() functions public, to make it possible. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Samuel Ortiz <[email protected]>
2015-08-20NFC: nci: Add post_setup handlerRobert Baldyga1-0/+4
Some drivers require non-standard configuration after NCI_CORE_INIT request, because they need to know ndev->manufact_specific_info or ndev->manufact_id. This patch adds post_setup handler allowing to do such custom configuration. Signed-off-by: Robert Baldyga <[email protected]> Signed-off-by: Samuel Ortiz <[email protected]>
2015-08-19vrf: vrf_master_ifindex_rcu is not always called with rcu read lockNikolay Aleksandrov1-2/+2
While running net-next I hit this: [ 634.073119] =============================== [ 634.073150] [ INFO: suspicious RCU usage. ] [ 634.073182] 4.2.0-rc6+ #45 Not tainted [ 634.073213] ------------------------------- [ 634.073244] include/net/vrf.h:38 suspicious rcu_dereference_check() usage! [ 634.073274] other info that might help us debug this: [ 634.073307] rcu_scheduler_active = 1, debug_locks = 1 [ 634.073338] 2 locks held by swapper/0/0: [ 634.073369] #0: (((&n->timer))){+.-...}, at: [<ffffffff8112bc35>] call_timer_fn+0x5/0x480 [ 634.073412] #1: (slock-AF_INET){+.-...}, at: [<ffffffff8174f0f5>] icmp_send+0x155/0x5f0 [ 634.073450] stack backtrace: [ 634.073483] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.2.0-rc6+ #45 [ 634.073514] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 634.073545] 0000000000000000 0593ba8242d9ace4 ffff88002fc03b48 ffffffff81803f1b [ 634.073612] 0000000000000000 ffffffff81e12500 ffff88002fc03b78 ffffffff811003c5 [ 634.073642] 0000000000000000 ffff88002ec4e600 ffffffff81f00f80 ffff88002fc03cf0 [ 634.073669] Call Trace: [ 634.073694] <IRQ> [<ffffffff81803f1b>] dump_stack+0x4c/0x65 [ 634.073728] [<ffffffff811003c5>] lockdep_rcu_suspicious+0xc5/0x100 [ 634.073763] [<ffffffff8174eb56>] icmp_route_lookup+0x176/0x5c0 [ 634.073793] [<ffffffff8174f2fb>] ? icmp_send+0x35b/0x5f0 [ 634.073818] [<ffffffff8174f274>] ? icmp_send+0x2d4/0x5f0 [ 634.073844] [<ffffffff8174f3ce>] icmp_send+0x42e/0x5f0 [ 634.073873] [<ffffffff8170b662>] ipv4_link_failure+0x22/0xa0 [ 634.073899] [<ffffffff8174bdda>] arp_error_report+0x3a/0x80 [ 634.073926] [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0 [ 634.073952] [<ffffffff816d396e>] neigh_invalidate+0x8e/0x110 [ 634.073984] [<ffffffff816d62ae>] neigh_timer_handler+0x1ae/0x290 [ 634.074013] [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0 [ 634.074013] [<ffffffff8112bce3>] call_timer_fn+0xb3/0x480 [ 634.074013] [<ffffffff8112bc35>] ? call_timer_fn+0x5/0x480 [ 634.074013] [<ffffffff816d6100>] ? neigh_lookup+0x2c0/0x2c0 [ 634.074013] [<ffffffff8112c2bc>] run_timer_softirq+0x20c/0x430 [ 634.074013] [<ffffffff810af50e>] __do_softirq+0xde/0x630 [ 634.074013] [<ffffffff810afc97>] irq_exit+0x117/0x120 [ 634.074013] [<ffffffff81810976>] smp_apic_timer_interrupt+0x46/0x60 [ 634.074013] [<ffffffff8180e950>] apic_timer_interrupt+0x70/0x80 [ 634.074013] <EOI> [<ffffffff8106b9d6>] ? native_safe_halt+0x6/0x10 [ 634.074013] [<ffffffff81101d8d>] ? trace_hardirqs_on+0xd/0x10 [ 634.074013] [<ffffffff81027d43>] default_idle+0x23/0x200 [ 634.074013] [<ffffffff8102852f>] arch_cpu_idle+0xf/0x20 [ 634.074013] [<ffffffff810f89ba>] default_idle_call+0x2a/0x40 [ 634.074013] [<ffffffff810f8dcc>] cpu_startup_entry+0x39c/0x4c0 [ 634.074013] [<ffffffff817f9cad>] rest_init+0x13d/0x150 [ 634.074013] [<ffffffff81f69038>] start_kernel+0x4a8/0x4c9 [ 634.074013] [<ffffffff81f68120>] ? early_idt_handler_array+0x120/0x120 [ 634.074013] [<ffffffff81f68339>] x86_64_start_reservations+0x2a/0x2c [ 634.074013] [<ffffffff81f68485>] x86_64_start_kernel+0x14a/0x16d It would seem vrf_master_ifindex_rcu() can be called without RCU held in other contexts as well so introduce a new helper which acquires rcu and returns the ifindex. Also add curly braces around both the "if" and "else" parts as per the style guide. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-19SUNRPC: Allow sockets to do GFP_NOIO allocationsTrond Myklebust1-3/+3
Follow up to commit c4a7ca774949 ("SUNRPC: Allow waiting on memory allocation"). Allows the RPC socket code to do non-IO blocking. Signed-off-by: Trond Myklebust <[email protected]>
2015-08-19netfilter: nft_payload: work around vlan header strippingFlorian Westphal1-1/+56
make payload expression aware of the fact that VLAN offload may have removed a vlan header. When we encounter tagged skb, transparently insert the tag into the register so that vlan header matching can work without userspace being aware of offload features. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-18lwtunnel: ip tunnel: fix multiple routes with different encapJiri Benc1-0/+7
Currently, two routes going through the same tunnel interface are considered the same even when they are routed to a different host after encapsulation. This causes all routes added after the first one to have incorrect encapsulation parameters. This is nicely visible by doing: # ip r a 192.168.1.2/32 dev vxlan0 tunnel dst 10.0.0.2 # ip r a 192.168.1.3/32 dev vxlan0 tunnel dst 10.0.0.3 # ip r [...] 192.168.1.2/32 tunnel id 0 src 0.0.0.0 dst 10.0.0.2 [...] 192.168.1.3/32 tunnel id 0 src 0.0.0.0 dst 10.0.0.2 [...] Implement the missing comparison function. Fixes: 3093fbe7ff4bc ("route: Per route IP tunnel metadata via lightweight tunnel") Signed-off-by: Jiri Benc <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18lwtunnel: fix memory leakJiri Benc1-4/+6
The built lwtunnel_state struct has to be freed after comparison. Fixes: 571e722676fe3 ("ipv4: support for fib route lwtunnel encap attributes") Signed-off-by: Jiri Benc <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18batman-adv: Fix memory leak on tt add with invalid vlanSven Eckelmann1-1/+4
The object tt_local is allocated with kmalloc and not initialized when the function batadv_tt_local_add checks for the vlan. But this function can only cleanup the object when the (not yet initialized) reference counter of the object is 1. This is unlikely and thus the object would leak when the vlan could not be found. Instead the uninitialized object tt_local has to be freed manually and the pointer has to set to NULL to avoid calling the function which would try to decrement the reference counter of the not existing object. CID: 1316518 Fixes: 354136bcc3c4 ("batman-adv: fix kernel crash due to missing NULL checks") Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: dsa: Allow multi hop routes to be expressedAndrew Lunn1-10/+30
With more than two switches in a hierarchy, it becomes necessary to describe multi-hop routes between switches. The current binding does not allow this, although the older platform_data did. Extend the link property to be a list rather than a single phandle to a remote switch. It is then possible to express that a port should be used to reach more than one switch and the switch maybe more than one hop away. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: sched: drop all special handling of tx_queue_len == 0Phil Sutter5-17/+9
Those were all workarounds for the formerly double meaning of tx_queue_len, which broke scheduling algorithms if untreated. Now that all in-tree drivers have been converted away from setting tx_queue_len = 0, it should be safe to drop these workarounds for categorically broken setups. Signed-off-by: Phil Sutter <[email protected]> Cc: Jamal Hadi Salim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: warn if drivers set tx_queue_len = 0Phil Sutter1-0/+3
Due to the introduction of IFF_NO_QUEUE, there is a better way for drivers to indicate that no qdisc should be attached by default. Though, the old convention can't be dropped since ignoring that setting would break drivers still using it. Instead, add a warning so out-of-tree driver maintainers get a chance to adjust their code before we finally get rid of any special handling of tx_queue_len == 0. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: caif: convert to using IFF_NO_QUEUEPhil Sutter1-1/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Dmitry Tarnyagin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: hsr: convert to using IFF_NO_QUEUEPhil Sutter1-1/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Arvid Brodin <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: batman-adv: convert to using IFF_NO_QUEUEPhil Sutter1-1/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Marek Lindner <[email protected]> Cc: Simon Wunderlich <[email protected]> Cc: Antonio Quartulli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: dsa: convert to using IFF_NO_QUEUEPhil Sutter1-1/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Lennert Buytenhek <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: 6lowpan: convert to using IFF_NO_QUEUEPhil Sutter1-1/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Alexander Aring <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: bridge: convert to using IFF_NO_QUEUEPhil Sutter1-2/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Stephen Hemminger <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18net: 8021q: convert to using IFF_NO_QUEUEPhil Sutter1-2/+1
Signed-off-by: Phil Sutter <[email protected]> Cc: Patrick McHardy <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17net: Identifier Locator Addressing moduleTom Herbert3-0/+236
Adding new module name ila. This implements ILA translation. Light weight tunnel redirection is used to perform the translation in the data path. This is configured by the "ip -6 route" command using the "encap ila <locator>" option, where <locator> is the value to set in destination locator of the packet. e.g. ip -6 route add 3333:0:0:1:5555:0:1:0/128 \ encap ila 2001:0:0:1 via 2401:db00:20:911a:face:0:25:0 Sets a route where 3333:0:0:1 will be overwritten by 2001:0:0:1 on output. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17net: Add inet_proto_csum_replace_by_diff utility functionTom Herbert1-0/+13
This function updates a checksum field value and skb->csum based on a value which is the difference between the old and new checksum. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17net: Change pseudohdr argument of inet_proto_csum_replace* to be a boolTom Herbert17-33/+35
inet_proto_csum_replace4,2,16 take a pseudohdr argument which indicates the checksum field carries a pseudo header. This argument should be a boolean instead of an int. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17lwt: Add support to redirect dst.inputTom Herbert3-2/+69
This patch adds the capability to redirect dst input in the same way that dst output is redirected by LWT. Also, save the original dst.input and and dst.out when setting up lwtunnel redirection. These can be called by the client as a pass- through. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18netfilter: nf_conntrack: add efficient mark to zone mappingDaniel Borkmann5-40/+27
This work adds the possibility of deriving the zone id from the skb->mark field in a scalable manner. This allows for having only a single template serving hundreds/thousands of different zones, for example, instead of the need to have one match for each zone as an extra CT jump target. Note that we'd need to have this information attached to the template as at the time when we're trying to lookup a possible ct object, we already need to know zone information for a possible match when going into __nf_conntrack_find_get(). This work provides a minimal implementation for a possible mapping. In order to not add/expose an extra ct->status bit, the zone structure has been extended to carry a flag for deriving the mark. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-18netfilter: nf_conntrack: add direction support for zonesDaniel Borkmann9-92/+223
This work adds a direction parameter to netfilter zones, so identity separation can be performed only in original/reply or both directions (default). This basically opens up the possibility of doing NAT with conflicting IP address/port tuples from multiple, isolated tenants on a host (e.g. from a netns) without requiring each tenant to NAT twice resp. to use its own dedicated IP address to SNAT to, meaning overlapping tuples can be made unique with the zone identifier in original direction, where the NAT engine will then allocate a unique tuple in the commonly shared default zone for the reply direction. In some restricted, local DNAT cases, also port redirection could be used for making the reply traffic unique w/o requiring SNAT. The consensus we've reached and discussed at NFWS and since the initial implementation [1] was to directly integrate the direction meta data into the existing zones infrastructure, as opposed to the ct->mark approach we proposed initially. As we pass the nf_conntrack_zone object directly around, we don't have to touch all call-sites, but only those, that contain equality checks of zones. Thus, based on the current direction (original or reply), we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID. CT expectations are direction-agnostic entities when expectations are being compared among themselves, so we can only use the identifier in this case. Note that zone identifiers can not be included into the hash mix anymore as they don't contain a "stable" value that would be equal for both directions at all times, f.e. if only zone->id would unconditionally be xor'ed into the table slot hash, then replies won't find the corresponding conntracking entry anymore. If no particular direction is specified when configuring zones, the behaviour is exactly as we expect currently (both directions). Support has been added for the CT netlink interface as well as the x_tables raw CT target, which both already offer existing interfaces to user space for the configuration of zones. Below a minimal, simplified collision example (script in [2]) with netperf sessions: +--- tenant-1 ---+ mark := 1 | netperf |--+ +----------------+ | CT zone := mark [ORIGINAL] [ip,sport] := X +--------------+ +--- gateway ---+ | mark routing |--| SNAT |-- ... + +--------------+ +---------------+ | +--- tenant-2 ---+ | ~~~|~~~ | netperf |--+ +-----------+ | +----------------+ mark := 2 | netserver |------ ... + [ip,sport] := X +-----------+ [ip,port] := Y On the gateway netns, example: iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark conntrack dump from gateway netns: netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024 [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1 src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438 [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2 src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2 Taking this further, test script in [2] creates 200 tenants and runs original-tuple colliding netperf sessions each. A conntrack -L dump in the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED state as expected. I also did run various other tests with some permutations of the script, to mention some: SNAT in random/random-fully/persistent mode, no zones (no overlaps), static zones (original, reply, both directions), etc. [1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/ [2] https://paste.fedoraproject.org/242835/65657871/ Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-17inet: Move VRF table lookup to inlined functionDavid Ahern1-9/+1
Table lookup compiles out when VRF is not enabled. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17Merge branch 'for-upstream' of ↵David S. Miller22-194/+491
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-08-16 Here's what's likely the last bluetooth-next pull request for 4.3: - 6lowpan/802.15.4 refactoring, cleanups & fixes - Document 6lowpan netdev usage in Documentation/networking/6lowpan.txt - Support for UART based QCA Bluetooth controllers - Power management support for Broeadcom Bluetooth controllers - Change LE connection initiation to always use passive scanning first - Support for new Silicon Wave USB ID Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-08-17net: Export bpf_prog_create_from_user().David S. Miller1-0/+1
Signed-off-by: David S. Miller <[email protected]>
2015-08-17ipv6: trivial whitespace fixIan Morris1-1/+2
Change brace placement to be in line with coding standards Signed-off-by: Ian Morris <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller9-48/+102
Antonio Quartulli says: ==================== Included changes: - avoid integer overflow in GW selection routine - prevent race condition by making capability bit changes atomic (use clear/set/test_bit) - fix synchronization issue in mcast tvlv handler - fix crash on double list removal of TT Request objects - fix leak by puring packets enqueued for sending upon iface removal - ensure network header pointer is set in skb ==================== Signed-off-by: David S. Miller <[email protected]>