aboutsummaryrefslogtreecommitdiff
path: root/include/uapi/linux
AgeCommit message (Collapse)AuthorFilesLines
2015-08-27geneve: Make dst-port configurable.Pravin B Shelar1-0/+1
Add netlink interface to configure Geneve UDP port number. So that user can configure it for a Gevene device. Signed-off-by: Pravin B Shelar <[email protected]> Reviewed-by: Jesse Gross <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: John W. Linville <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-27bridge: Add netlink support for vlan_protocol attributeToshiaki Makita1-0/+1
This enables bridge vlan_protocol to be configured through netlink. When CONFIG_BRIDGE_VLAN_FILTERING is disabled, kernel behaves the same way as this feature is not implemented. Signed-off-by: Toshiaki Makita <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-27openvswitch: Allow attaching helpers to ct actionJoe Stringer1-0/+3
Add support for using conntrack helpers to assist protocol detection. The new OVS_CT_ATTR_HELPER attribute of the CT action specifies a helper to be used for this connection. If no helper is specified, then helpers will be automatically applied as per the sysctl configuration of net.netfilter.nf_conntrack_helper. The helper may be specified as part of the conntrack action, eg: ct(helper=ftp). Initial packets for related connections should be committed to allow later packets for the flow to be considered established. Example ovs-ofctl flows allowing FTP connections from ports 1->2: in_port=1,tcp,action=ct(helper=ftp,commit),2 in_port=2,tcp,ct_state=-trk,action=ct(recirc) in_port=2,tcp,ct_state=+trk-new+est,action=1 in_port=2,tcp,ct_state=+trk+rel,action=1 Signed-off-by: Joe Stringer <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-27openvswitch: Allow matching on conntrack labelJoe Stringer1-0/+10
Allow matching and setting the ct_label field. As with ct_mark, this is populated by executing the CT action. The label field may be modified by specifying a label and mask nested under the CT action. It is stored as metadata attached to the connection. Label modification occurs after lookup, and will only persist when the conntrack entry is committed by providing the COMMIT flag to the CT action. Labels are currently fixed to 128 bits in size. Signed-off-by: Joe Stringer <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-27openvswitch: Allow matching on conntrack markJoe Stringer1-0/+5
Allow matching and setting the ct_mark field. As with ct_state and ct_zone, these fields are populated when the CT action is executed. To write to this field, a value and mask can be specified as a nested attribute under the CT action. This data is stored with the conntrack entry, and is executed after the lookup occurs for the CT action. The conntrack entry itself must be committed using the COMMIT flag in the CT action flags for this change to persist. Signed-off-by: Justin Pettit <[email protected]> Signed-off-by: Joe Stringer <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-27openvswitch: Add conntrack actionJoe Stringer1-0/+40
Expose the kernel connection tracker via OVS. Userspace components can make use of the CT action to populate the connection state (ct_state) field for a flow. This state can be subsequently matched. Exposed connection states are OVS_CS_F_*: - NEW (0x01) - Beginning of a new connection. - ESTABLISHED (0x02) - Part of an existing connection. - RELATED (0x04) - Related to an established connection. - INVALID (0x20) - Could not track the connection for this packet. - REPLY_DIR (0x40) - This packet is in the reply direction for the flow. - TRACKED (0x80) - This packet has been sent through conntrack. When the CT action is executed by itself, it will send the packet through the connection tracker and populate the ct_state field with one or more of the connection state flags above. The CT action will always set the TRACKED bit. When the COMMIT flag is passed to the conntrack action, this specifies that information about the connection should be stored. This allows subsequent packets for the same (or related) connections to be correlated with this connection. Sending subsequent packets for the connection through conntrack allows the connection tracker to consider the packets as ESTABLISHED, RELATED, and/or REPLY_DIR. The CT action may optionally take a zone to track the flow within. This allows connections with the same 5-tuple to be kept logically separate from connections in other zones. If the zone is specified, then the "ct_zone" match field will be subsequently populated with the zone id. IP fragments are handled by transparently assembling them as part of the CT action. The maximum received unit (MRU) size is tracked so that refragmentation can occur during output. IP frag handling contributed by Andy Zhou. Based on original design by Justin Pettit. Signed-off-by: Joe Stringer <[email protected]> Signed-off-by: Justin Pettit <[email protected]> Signed-off-by: Andy Zhou <[email protected]> Acked-by: Thomas Graf <[email protected]> Acked-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-26Merge tag 'ipvs2-for-v4.3' of ↵Pablo Neira Ayuso1-0/+5
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next Simon Horman says: ==================== Second Round of IPVS Updates for v4.3 I realise these are a little late in the cycle, so if you would prefer me to repost them for v4.4 then just let me know. The updates include: * A new scheduler from Raducu Deaconu * Enhanced configurability of the sync daemon from Julian Anastasov ==================== Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-26netfilter: ip6t_REJECT: added missing icmpv6 codesAndreas Herz1-1/+3
RFC 4443 added two new codes values for ICMPv6 type 1: 5 - Source address failed ingress/egress policy 6 - Reject route to destination And RFC 7084 states in L-14 that IPv6 Router MUST send ICMPv6 Destination Unreachable with code 5 for packets forwarded to it that use an address from a prefix that has been invalidated. Codes 5 and 6 are more informative subsets of code 1. Signed-off-by: Andreas Herz <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-25dlm: fix lvb copy for user locksDavid Teigland1-1/+1
For a userland lock request, the previous and current lock modes are used to decide when the lvb should be copied back to the user. The wrong previous value was used, so that it always matched the current value. This caused the lvb to be copied back to the user in the wrong cases. Signed-off-by: David Teigland <[email protected]>
2015-08-22Merge tag 'signed-kvm-ppc-next' of git://github.com/agraf/linux-2.6 into ↵Paolo Bonzini2-0/+3
kvm-queue Patch queue for ppc - 2015-08-22 Highlights for KVM PPC this time around: - Book3S: A few bug fixes - Book3S: Allow micro-threading on POWER8
2015-08-21ipvs: add more mcast parameters for the sync daemonJulian Anastasov1-0/+4
- mcast_group: configure the multicast address, now IPv6 is supported too - mcast_port: configure the multicast port - mcast_ttl: configure the multicast TTL/HOP_LIMIT Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-21ipvs: add sync_maxlen parameter for the sync daemonJulian Anastasov1-0/+1
Allow setups with large MTU to send large sync packets by adding sync_maxlen parameter. The default value is now based on MTU but no more than 1500 for compatibility reasons. To avoid problems if MTU changes allow fragmentation by sending packets with DF=0. Problem reported by Dan Carpenter. Reported-by: Dan Carpenter <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Signed-off-by: Simon Horman <[email protected]>
2015-08-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller3-1/+31
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next This is second pull request includes the conflict resolution patch that resulted from the updates that we got for the conntrack template through kmalloc. No changes with regards to the previously sent 15 patches. The following patchset contains Netfilter updates for your net-next tree, they are: 1) Rework the existing nf_tables counter expression to make it per-cpu. 2) Prepare and factor out common packet duplication code from the TEE target so it can be reused from the new dup expression. 3) Add the new dup expression for the nf_tables IPv4 and IPv6 families. 4) Convert the nf_tables limit expression to use a token-based approach with 64-bits precision. 5) Enhance the nf_tables limit expression to support limiting at packet byte. This comes after several preparation patches. 6) Add a burst parameter to indicate the amount of packets or bytes that can exceed the limiting. 7) Add netns support to nfacct, from Andreas Schultz. 8) Pass the nf_conn_zone structure instead of the zone ID in nf_tables to allow accessing more zone specific information, from Daniel Borkmann. 9) Allow to define zone per-direction to support netns containers with overlapping network addressing, also from Daniel. 10) Extend the CT target to allow setting the zone based on the skb->mark as a way to support simple mappings from iptables, also from Daniel. 11) Make the nf_tables payload expression aware of the fact that VLAN offload may have removed a vlan header, from Florian Westphal. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-08-21Merge branch 'master' of ↵Pablo Neira Ayuso10-16/+49
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Resolve conflicts with conntrack template fixes. Conflicts: net/netfilter/nf_conntrack_core.c net/netfilter/nf_synproxy_core.c net/netfilter/xt_CT.c Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-20ipv6: route: per route IP tunnel metadata via lightweight tunnelJiri Benc1-0/+16
Allow specification of per route IP tunnel instructions also for IPv6. This complements commit 3093fbe7ff4b ("route: Per route IP tunnel metadata via lightweight tunnel"). Signed-off-by: Jiri Benc <[email protected]> CC: YOSHIFUJI Hideaki <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-20Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding ↵Ingo Molnar1-0/+1
more changes Signed-off-by: Ingo Molnar <[email protected]>
2015-08-18dm stats: report precise_timestamps and histogram in @stats_list outputMikulas Patocka1-2/+2
If the user selected the precise_timestamps or histogram options, report it in the @stats_list message output. If the user didn't select these options, no extra tokens are reported, thus it is backward compatible with old software that doesn't know about precise timestamps and histogram. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected] # 4.2
2015-08-18NVMe: Add nvme subsystem reset IOCTLJon Derrick1-0/+1
Controllers can perform optional subsystem resets as introduced in NVMe 1.1. This patch adds an IOCTL to trigger the subsystem reset by writing "NVMe" to the NSSR register. Signed-off-by: Jon Derrick <[email protected]> Acked-by: Keith Busch <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2015-08-17net: Identifier Locator Addressing moduleTom Herbert2-0/+16
Adding new module name ila. This implements ILA translation. Light weight tunnel redirection is used to perform the translation in the data path. This is configured by the "ip -6 route" command using the "encap ila <locator>" option, where <locator> is the value to set in destination locator of the packet. e.g. ip -6 route add 3333:0:0:1:5555:0:1:0/128 \ encap ila 2001:0:0:1 via 2401:db00:20:911a:face:0:25:0 Sets a route where 3333:0:0:1 will be overwritten by 2001:0:0:1 on output. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-18netfilter: nf_conntrack: add efficient mark to zone mappingDaniel Borkmann1-1/+3
This work adds the possibility of deriving the zone id from the skb->mark field in a scalable manner. This allows for having only a single template serving hundreds/thousands of different zones, for example, instead of the need to have one match for each zone as an extra CT jump target. Note that we'd need to have this information attached to the template as at the time when we're trying to lookup a possible ct object, we already need to know zone information for a possible match when going into __nf_conntrack_find_get(). This work provides a minimal implementation for a possible mapping. In order to not add/expose an extra ct->status bit, the zone structure has been extended to carry a flag for deriving the mark. Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-18netfilter: nf_conntrack: add direction support for zonesDaniel Borkmann2-1/+6
This work adds a direction parameter to netfilter zones, so identity separation can be performed only in original/reply or both directions (default). This basically opens up the possibility of doing NAT with conflicting IP address/port tuples from multiple, isolated tenants on a host (e.g. from a netns) without requiring each tenant to NAT twice resp. to use its own dedicated IP address to SNAT to, meaning overlapping tuples can be made unique with the zone identifier in original direction, where the NAT engine will then allocate a unique tuple in the commonly shared default zone for the reply direction. In some restricted, local DNAT cases, also port redirection could be used for making the reply traffic unique w/o requiring SNAT. The consensus we've reached and discussed at NFWS and since the initial implementation [1] was to directly integrate the direction meta data into the existing zones infrastructure, as opposed to the ct->mark approach we proposed initially. As we pass the nf_conntrack_zone object directly around, we don't have to touch all call-sites, but only those, that contain equality checks of zones. Thus, based on the current direction (original or reply), we either return the actual id, or the default NF_CT_DEFAULT_ZONE_ID. CT expectations are direction-agnostic entities when expectations are being compared among themselves, so we can only use the identifier in this case. Note that zone identifiers can not be included into the hash mix anymore as they don't contain a "stable" value that would be equal for both directions at all times, f.e. if only zone->id would unconditionally be xor'ed into the table slot hash, then replies won't find the corresponding conntracking entry anymore. If no particular direction is specified when configuring zones, the behaviour is exactly as we expect currently (both directions). Support has been added for the CT netlink interface as well as the x_tables raw CT target, which both already offer existing interfaces to user space for the configuration of zones. Below a minimal, simplified collision example (script in [2]) with netperf sessions: +--- tenant-1 ---+ mark := 1 | netperf |--+ +----------------+ | CT zone := mark [ORIGINAL] [ip,sport] := X +--------------+ +--- gateway ---+ | mark routing |--| SNAT |-- ... + +--------------+ +---------------+ | +--- tenant-2 ---+ | ~~~|~~~ | netperf |--+ +-----------+ | +----------------+ mark := 2 | netserver |------ ... + [ip,sport] := X +-----------+ [ip,port] := Y On the gateway netns, example: iptables -t raw -A PREROUTING -j CT --zone mark --zone-dir ORIGINAL iptables -t nat -A POSTROUTING -o <dev> -j SNAT --to-source <ip> --random-fully iptables -t mangle -A PREROUTING -m conntrack --ctdir ORIGINAL -j CONNMARK --save-mark iptables -t mangle -A POSTROUTING -m conntrack --ctdir REPLY -j CONNMARK --restore-mark conntrack dump from gateway netns: netperf -H 10.1.1.2 -t TCP_STREAM -l60 -p12865,5555 from each tenant netns tcp 6 431995 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=1 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=1024 [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 431994 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=5555 dport=12865 zone-orig=2 src=10.1.1.2 dst=10.1.1.1 sport=12865 dport=5555 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 299 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=39438 dport=33768 zone-orig=1 src=10.1.1.2 dst=10.1.1.1 sport=33768 dport=39438 [ASSURED] mark=1 secctx=system_u:object_r:unlabeled_t:s0 use=1 tcp 6 300 ESTABLISHED src=40.1.1.1 dst=10.1.1.2 sport=32889 dport=40206 zone-orig=2 src=10.1.1.2 dst=10.1.1.1 sport=40206 dport=32889 [ASSURED] mark=2 secctx=system_u:object_r:unlabeled_t:s0 use=2 Taking this further, test script in [2] creates 200 tenants and runs original-tuple colliding netperf sessions each. A conntrack -L dump in the gateway netns also confirms 200 overlapping entries, all in ESTABLISHED state as expected. I also did run various other tests with some permutations of the script, to mention some: SNAT in random/random-fully/persistent mode, no zones (no overlaps), static zones (original, reply, both directions), etc. [1] http://thread.gmane.org/gmane.comp.security.firewalls.netfilter.devel/57412/ [2] https://paste.fedoraproject.org/242835/65657871/ Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-17packet: add extended BPF fanout modeWillem de Bruijn1-0/+1
Add fanout mode PACKET_FANOUT_EBPF that accepts an en extended BPF program to select a socket. Update the internal eBPF program by passing to socket option SOL_PACKET/PACKET_FANOUT_DATA a file descriptor returned by bpf(). Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17packet: add classic BPF fanout modeWillem de Bruijn1-0/+2
Add fanout mode PACKET_FANOUT_CBPF that accepts a classic BPF program to select a socket. This avoids having to keep adding special case fanout modes. One example use case is application layer load balancing. The QUIC protocol, for instance, encodes a connection ID in UDP payload. Also add socket option SOL_PACKET/PACKET_FANOUT_DATA that updates data associated with the socket group. Fanout mode PACKET_FANOUT_CBPF is the only user so far. Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Acked-by: Daniel Borkmann <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-17lwtunnel: rename ip lwtunnel attributesJiri Benc2-15/+14
We already have IFLA_IPTUN_ netlink attributes. The IP_TUN_ attributes look very similar, yet they serve very different purpose. This is confusing for anyone trying to implement a user space tool supporting lwt. As the IP_TUN_ attributes are used only for the lightweight tunnels, prefix them with LWTUNNEL_IP_ instead to make their purpose clear. Also, it's more logical to have them in lwtunnel.h together with the encap enum. Fixes: 3093fbe7ff4b ("route: Per route IP tunnel metadata via lightweight tunnel") Signed-off-by: Jiri Benc <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-14Merge tag 'usb-for-v4.3' of ↵Greg Kroah-Hartman1-0/+12
git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next Felipe writes: usb: patches for v4.3 merge window New support for Allwinne SoC on the MUSB driver has been added to the list of glue layers. MUSB also got support for building all DMA engines in one binary; this will be great for distros. DWC3 now has no trace of dev_dbg()/dev_vdbg() usage. We will rely solely on tracing to debug DWC3. There was also a fix for memory corruption with EP0 when maxpacket size transfers are > 512 bytes. Robert's EP capabilities flags is making EP selection a lot simpler. UDCs are now required to set these flags up when adding endpoints to the framework. Other than these, we have the usual set of miscelaneous cleanups and minor fixes. Signed-off-by: Felipe Balbi <[email protected]>
2015-08-13net: Introduce VRF related flags and helpersDavid Ahern1-0/+9
Add a VRF_MASTER flag for interfaces and helper functions for determining if a device is a VRF_MASTER. Add link attribute for passing VRF_TABLE id. Add vrf_ptr to netdevice. Add various macros for determining if a device is a VRF device, the index of the master VRF device and table associated with VRF device. Signed-off-by: Shrijeet Mukherjee <[email protected]> Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-13net: ipv6 sysctl option to ignore routes when nexthop link is downAndy Gospodarek1-0/+1
Like the ipv4 patch with a similar title, this adds a sysctl to allow the user to change routing behavior based on whether or not the interface associated with the nexthop was an up or down link. The default setting preserves the current behavior, but anyone that enables it will notice that nexthops on down interfaces will no longer be selected: net.ipv6.conf.all.ignore_routes_with_linkdown = 0 net.ipv6.conf.default.ignore_routes_with_linkdown = 0 net.ipv6.conf.lo.ignore_routes_with_linkdown = 0 ... When the above sysctls are set, not only will link status be reported to userspace, but an indication that a nexthop is dead and will not be used is also reported. 1000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium 1000::/8 via 8000::2 dev p8p1 metric 1024 pref medium 7000::/8 dev p7p1 proto kernel metric 256 dead linkdown pref medium 8000::/8 dev p8p1 proto kernel metric 256 pref medium 9000::/8 via 8000::2 dev p8p1 metric 2048 pref medium 9000::/8 via 7000::2 dev p7p1 metric 1024 dead linkdown pref medium fe80::/64 dev p7p1 proto kernel metric 256 dead linkdown pref medium fe80::/64 dev p8p1 proto kernel metric 256 pref medium This also adds devconf support and notification when sysctl values change. v2: drop use of rt6i_nhflags since it is not needed right now Signed-off-by: Andy Gospodarek <[email protected]> Signed-off-by: Dinesh Dutt <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-0/+1
Conflicts: drivers/net/ethernet/cavium/Kconfig The cavium conflict was overlapping dependency changes. Signed-off-by: David S. Miller <[email protected]>
2015-08-10ip_gre: Add support to collect tunnel metadata.Pravin B Shelar1-0/+1
Following patch create new tunnel flag which enable tunnel metadata collection on given device. Signed-off-by: Pravin B Shelar <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-10net: add explicit logging and stat for neighbour table overflowRick Jones1-0/+1
Add an explicit neighbour table overflow message (ratelimited) and statistic to make diagnosing neighbour table overflows tractable in the wild. Diagnosing a neighbour table overflow can be quite difficult in the wild because there is no explicit dmesg logged. Callers to neighbour code seem to use net_dbg_ratelimit when the neighbour call fails which means the "base message" is not emitted and the callback suppressed messages from the ratelimiting can end-up juxtaposed with unrelated messages. Further, a forced garbage collection will increment a stat on each call whether it was successful in freeing-up a table entry or not, so that statistic is only a hint. So, add a net_info_ratelimited message and explicit statistic to the neighbour code. Signed-off-by: Rick Jones <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-10bridge: netlink: add support for vlan_filtering attributeNikolay Aleksandrov1-0/+1
This patch adds the ability to toggle the vlan filtering support via netlink. Since we're already running with rtnl in .changelink() we don't need to take any additional locks. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09bpf: Implement function bpf_perf_event_read() that get the selected hardware ↵Kaixu Xia1-0/+1
PMU conuter According to the perf_event_map_fd and index, the function bpf_perf_event_read() can convert the corresponding map value to the pointer to struct perf_event and return the Hardware PMU counter value. Signed-off-by: Kaixu Xia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09bpf: Add new bpf map type to store the pointer to struct perf_eventKaixu Xia1-0/+1
Introduce a new bpf map type 'BPF_MAP_TYPE_PERF_EVENT_ARRAY'. This map only stores the pointer to struct perf_event. The user space event FDs from perf_event_open() syscall are converted to the pointer to struct perf_event and stored in map. Signed-off-by: Kaixu Xia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-09Merge 4.2-rc6 into char-misc-nextGreg Kroah-Hartman1-0/+1
We want the fixes in Linus's tree in here as well. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2015-08-07vxlan: combine VXLAN_FLOWBASED into VXLAN_COLLECT_METADATAAlexei Starovoitov1-1/+0
IFLA_VXLAN_FLOWBASED is useless without IFLA_VXLAN_COLLECT_METADATA, so combine them into single IFLA_VXLAN_COLLECT_METADATA flag. 'flowbased' doesn't convey real meaning of the vxlan tunnel mode. This mode can be used by routing, tc+bpf and ovs. Only ovs is strictly flow based, so 'collect metadata' is a better name for this tunnel mode. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-07netfilter: nft_limit: add per-byte limitingPablo Neira Ayuso1-0/+7
This patch adds a new NFTA_LIMIT_TYPE netlink attribute to indicate the type of limiting. Contrary to per-packet limiting, the cost is calculated from the packet path since this depends on the packet length. The burst attribute indicates the number of bytes in which the rate can be exceeded. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nft_limit: add burst parameterPablo Neira Ayuso1-0/+2
This patch adds the burst parameter. This burst indicates the number of packets that can exceed the limit. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-07netfilter: nf_tables: add nft_dup expressionPablo Neira Ayuso1-0/+14
This new expression uses the nf_dup engine to clone packets to a given gateway. Unlike xt_TEE, we use an index to indicate output interface which should be fine at this stage. Moreover, change to the preemtion-safe this_cpu_read(nf_skb_duplicated) from nf_dup_ipv{4,6} to silence a lockdep splat. Based on the original tee expression from Arturo Borrero Gonzalez, although this patch has diverted quite a bit from this initial effort due to the change to support maps. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-08-06audit: implement audit by executableRichard Guy Briggs1-1/+4
This adds the ability audit the actions of a not-yet-running process. This patch implements the ability to filter on the executable path. Instead of just hard coding the ino and dev of the executable we care about at the moment the rule is inserted into the kernel, use the new audit_fsnotify infrastructure to manage this dynamically. This means that if the filename does not yet exist but the containing directory does, or if the inode in question is unlinked and creat'd (aka updated) the rule will just continue to work. If the containing directory is moved or deleted or the filesystem is unmounted, the rule is deleted automatically. A future enhancement would be to have the rule survive across directory disruptions. This is a heavily modified version of a patch originally submitted by Eric Paris with some ideas from Peter Moody. Cc: Peter Moody <[email protected]> Cc: Eric Paris <[email protected]> Signed-off-by: Richard Guy Briggs <[email protected]> [PM: minor whitespace clean to satisfy ./scripts/checkpatch] Signed-off-by: Paul Moore <[email protected]>
2015-08-04Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller2-0/+4
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for net-next, they are: 1) A couple of cleanups for the netfilter core hook from Eric Biederman. 2) Net namespace hook registration, also from Eric. This adds a dependency with the rtnl_lock. This should be fine by now but we have to keep an eye on this because if we ever get the per-subsys nfnl_lock before rtnl we have may problems in the future. But we have room to remove this in the future by propagating the complexity to the clients, by registering hooks for the init netns functions. 3) Update nf_tables to use the new net namespace hook infrastructure, also from Eric. 4) Three patches to refine and to address problems from the new net namespace hook infrastructure. 5) Switch to alternate jumpstack in xtables iff the packet is reentering. This only applies to a very special case, the TEE target, but Eric Dumazet reports that this is slowing down things for everyone else. So let's only switch to the alternate jumpstack if the tee target is in used through a static key. This batch also comes with offline precalculation of the jumpstack based on the callchain depth. From Florian Westphal. 6) Minimal SCTP multihoming support for our conntrack helper, from Michal Kubecek. 7) Reduce nf_bridge_info per skbuff scratchpad area to 32 bytes, from Florian Westphal. 8) Fix several checkpatch errors in bridge netfilter, from Bernhard Thaler. 9) Get rid of useless debug message in ip6t_REJECT, from Subash Abhinov. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-08-04Merge tag 'pci-v4.2-fixes-1' of ↵Linus Torvalds1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fix from Bjorn Helgaas: "This is a trivial fix for a change that broke user program compilation (QEMU in this case)" * tag 'pci-v4.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Restore PCI_MSIX_FLAGS_BIRMASK definition
2015-08-04perf: Add cycles to branch_infoAndi Kleen1-1/+3
Intel Skylake supports reporting the time in cycles a branch in the LBR took, to give a rough indication of the basic block performance. Export the cycle information in the branch_info structure. This can be done by just reusing some currently zero padding. This is just the generic header change. The architecture still needs to fill it in. There's no attempt to convert to real time, as we really want cycles here. Signed-off-by: Andi Kleen <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2015-08-03mpls: Use definition for reserved label checksRobert Shearman1-0/+2
In multiple locations there are checks for whether the label in hand is a reserved label or not using the arbritray value of 16. Factor this out into a #define for better maintainability and for documentation. Signed-off-by: Robert Shearman <[email protected]> Acked-by: Roopa Prabhu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-08-03mei: add async event notification ioctlsTomas Winkler1-0/+19
Add ioctl IOCTL_MEI_NOTIFY_SET for enabling and disabling async event notification. Add ioctl IOCTL_MEI_NOTIFY_GET for receiving and acking an event notification. Signed-off-by: Tomas Winkler <[email protected]> Signed-off-by: Alexander Usyskin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2015-08-02ebpf: add skb->hash to offset map for usage in {cls, act}_bpf or filtersDaniel Borkmann1-0/+1
Add skb->hash to the __sk_buff offset map, so it can be accessed from an eBPF program. We currently already do this for classic BPF filters, but not yet on eBPF, it might be useful as a demuxer in combination with helpers like bpf_clone_redirect(), toy example: __section("cls-lb") int ingress_main(struct __sk_buff *skb) { unsigned int which = 3 + (skb->hash & 7); /* bpf_skb_store_bytes(skb, ...); */ /* bpf_l{3,4}_csum_replace(skb, ...); */ bpf_clone_redirect(skb, which, 0); return -1; } I was thinking whether to add skb_get_hash(), but then concluded the raw skb->hash seems fine in this case: we can directly access the hash w/o extra eBPF helper function call, it's filled out by many NICs on ingress, and in case the entropy level would not be sufficient, people can still implement their own specific sw fallback hash mix anyway. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-07-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller3-1/+26
Conflicts: arch/s390/net/bpf_jit_comp.c drivers/net/ethernet/ti/netcp_ethss.c net/bridge/br_multicast.c net/ipv4/ip_fragment.c All four conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <[email protected]>
2015-07-31bonding: add tlb_dynamic_lb netlink supportNikolay Aleksandrov1-0/+1
tlb_dynamic_lb could be set only via sysfs, this patch allows it to be set via netlink. Signed-off-by: Nikolay Aleksandrov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-07-31vxlan: expose COLLECT_METADATA flag to user spaceAlexei Starovoitov1-0/+1
Two vxlan driver flags FLOWBASED and COLLECT_METADATA need to be set to make use of its new flow mode. The former already exposed. Expose the latter. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-07-31bpf: add helpers to access tunnel metadataAlexei Starovoitov1-0/+17
Introduce helpers to let eBPF programs attached to TC manipulate tunnel metadata: bpf_skb_[gs]et_tunnel_key(skb, key, size, flags) skb: pointer to skb key: pointer to 'struct bpf_tunnel_key' size: size of 'struct bpf_tunnel_key' flags: room for future extensions First eBPF program that uses these helpers will allocate per_cpu metadata_dst structures that will be used on TX. On RX metadata_dst is allocated by tunnel driver. Typical usage for TX: struct bpf_tunnel_key tkey; ... populate tkey ... bpf_skb_set_tunnel_key(skb, &tkey, sizeof(tkey), 0); bpf_clone_redirect(skb, vxlan_dev_ifindex, 0); RX: struct bpf_tunnel_key tkey = {}; bpf_skb_get_tunnel_key(skb, &tkey, sizeof(tkey), 0); ... lookup or redirect based on tkey ... 'struct bpf_tunnel_key' will be extended in the future by adding elements to the end and the 'size' argument will indicate which fields are populated, thereby keeping backwards compatibility. The 'flags' argument may be used as well when the 'size' is not enough or to indicate completely different layout of bpf_tunnel_key. Signed-off-by: Alexei Starovoitov <[email protected]> Acked-by: Thomas Graf <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-07-31Merge branch 'perf/urgent' into perf/core, to merge fixes before pulling ↵Ingo Molnar3-1/+26
more changes Signed-off-by: Ingo Molnar <[email protected]>