blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2018-12-21	netfilter: conntrack: un-export seq_print_acct	Florian Westphal	1	-3/+0
	Only one caller, just place it where its needed. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-21	netfilter: conntrack: udp: only extend timeout to stream mode after 2s	Florian Westphal	1	-0/+5
	Currently DNS resolvers that send both A and AAAA queries from same source port can trigger stream mode prematurely, which results in non-early-evictable conntrack entry for three minutes, even though DNS requests are done in a few milliseconds. Add a two second grace period where we continue to use the ordinary 30-second default timeout. Its enough for DNS request/response traffic, even if two request/reply packets are involved. ASSURED is still set, else conntrack (and thus a possible NAT mapping ...) gets zapped too in case conntrack table runs full. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-20	bpf: sk_msg, sock{map\|hash} redirect through ULP	John Fastabend	1	-0/+9
	A sockmap program that redirects through a kTLS ULP enabled socket will not work correctly because the ULP layer is skipped. This fixes the behavior to call through the ULP layer on redirect to ensure any operations required on the data stream at the ULP layer continue to be applied. To do this we add an internal flag MSG_SENDPAGE_NOPOLICY to avoid calling the BPF layer on a redirected message. This is required to avoid calling the BPF layer multiple times (possibly recursively) which is not the current/expected behavior without ULPs. In the future we may add a redirect flag if users _do_ want the policy applied again but this would need to work for both ULP and non-ULP sockets and be opt-in to avoid breaking existing programs. Also to avoid polluting the flag space with an internal flag we reuse the flag space overlapping MSG_SENDPAGE_NOPOLICY with MSG_WAITFORONE. Here WAITFORONE is specific to recv path and SENDPAGE_NOPOLICY is only used for sendpage hooks. The last thing to verify is user space API is masked correctly to ensure the flag can not be set by user. (Note this needs to be true regardless because we have internal flags already in-use that user space should not be able to set). But for completeness we have two UAPI paths into sendpage, sendfile and splice. In the sendfile case the function do_sendfile() zero's flags, ./fs/read_write.c: static ssize_t do_sendfile(int out_fd, int in_fd, loff_t ppos, size_t count, loff_t max) { ... fl = 0; #if 0 / * We need to debate whether we can enable this or not. The * man page documents EAGAIN return for the output at least, * and the application is arguably buggy if it doesn't expect * EAGAIN on a non-blocking file descriptor. / if (in.file->f_flags & O_NONBLOCK) fl = SPLICE_F_NONBLOCK; #endif file_start_write(out.file); retval = do_splice_direct(in.file, &pos, out.file, &out_pos, count, fl); } In the splice case the pipe_to_sendpage "actor" is used which masks flags with SPLICE_F_MORE. ./fs/splice.c: static int pipe_to_sendpage(struct pipe_inode_info pipe, struct pipe_buffer buf, struct splice_desc sd) { ... more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0; ... } Confirming what we expect that internal flags are in fact internal to socket side. Fixes: d3b18ad31f93 ("tls: add bpf support to sk_msg handling") Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-12-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	4	-23/+28
	Lots of conflicts, by happily all cases of overlapping changes, parallel adds, things of that nature. Thanks to Stephen Rothwell, Saeed Mahameed, and others for their guidance in these resolutions. Signed-off-by: David S. Miller <[email protected]>
2018-12-19	iptunnel: make TUNNEL_FLAGS available in uapi	wenxu	1	-19/+0
	ip l add dev tun type gretap external ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 dev gretap For gretap Key example when the command set the id but don't set the TUNNEL_KEY flags. There is no key field in the send packet In the lwtunnel situation, some TUNNEL_FLAGS should can be set by userspace Signed-off-by: wenxu <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	neighbour: register rtnl doit handler	Roopa Prabhu	1	-0/+1
	this patch registers neigh doit handler. The doit handler returns a neigh entry given dst and dev. This is similar to route and fdb doit (get) handlers. Also moves nda_policy declaration from rtnetlink.c to neighbour.c Signed-off-by: Roopa Prabhu <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	net: switch secpath to use skb extension infrastructure	Florian Westphal	1	-21/+1
	Remove skb->sp and allocate secpath storage via extension infrastructure. This also reduces sk_buff by 8 bytes on x86_64. Total size of allyesconfig kernel is reduced slightly, as there is less inlined code (one conditional atomic op instead of two on skb_clone). No differences in throughput in following ipsec performance tests: - transport mode with aes on 10GB link - tunnel mode between two network namespaces with aes and null cipher Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	xfrm: use secpath_exist where applicable	Florian Westphal	1	-1/+1
	Will reduce noise when skb->sp is removed later in this series. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	net: use skb_sec_path helper in more places	Florian Westphal	1	-2/+4
	skb_sec_path gains 'const' qualifier to avoid xt_policy.c: 'skb_sec_path' discards 'const' qualifier from pointer target type same reasoning as previous conversions: Won't need to touch these spots anymore when skb->sp is removed. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	net: move secpath_exist helper to sk_buff.h	Florian Westphal	1	-9/+0
	Future patch will remove skb->sp pointer. To reduce noise in those patches, move existing helper to sk_buff and use it in more places to ease skb->sp replacement later. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	xfrm: change secpath_set to return secpath struct, not error value	Florian Westphal	1	-1/+1
	It can only return 0 (success) or -ENOMEM. Change return value to a pointer to secpath struct. This avoids direct access to skb->sp: err = secpath_set(skb); if (!err) .. skb->sp-> ... Becomes: sp = secpath_set(skb) if (!sp) .. sp-> .. This reduces noise in followup patch which is going to remove skb->sp. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	net: convert bridge_nf to use skb extension infrastructure	Florian Westphal	1	-4/+4
	This converts the bridge netfilter (calling iptables hooks from bridge) facility to use the extension infrastructure. The bridge_nf specific hooks in skb clone and free paths are removed, they have been replaced by the skb_ext hooks that do the same as the bridge nf allocations hooks did. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	netfilter: avoid using skb->nf_bridge directly	Florian Westphal	1	-6/+0
	This pointer is going to be removed soon, so use the existing helpers in more places to avoid noise when the removal happens. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-19	Merge tag 'mac80211-next-for-davem-2018-12-19' of ↵	David S. Miller	2	-5/+302
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next Johannes Berg says: ==================== This time we have too many changes to list, highlights: * virt_wifi - wireless control simulation on top of another network interface * hwsim configurability to test capabilities similar to real hardware * various mesh improvements * various radiotap vendor data fixes in mac80211 * finally the nl_set_extack_cookie_u64() we talked about previously, used for * peer measurement APIs, right now only with FTM (flight time measurement) for location * made nl80211 radio/interface announcements more complete * various new HE (802.11ax) things: updates, TWT support, ... ==================== Signed-off-by: David S. Miller <[email protected]>
2018-12-18	Merge branch 'master' of ↵	David S. Miller	1	-0/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-12-18 1) Fix error return code in xfrm_output_one() when no dst_entry is attached to the skb. From Wei Yongjun. 2) The xfrm state hash bucket count reported to userspace is off by one. Fix from Benjamin Poirier. 3) Fix NULL pointer dereference in xfrm_input when skb_dst_force clears the dst_entry. 4) Fix freeing of xfrm states on acquire. We use a dedicated slab cache for the xfrm states now, so free it properly with kmem_cache_free. From Mathias Krause. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-12-18	Merge branch 'master' of ↵	David S. Miller	2	-1/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-12-18 1) Add xfrm policy selftest scripts. From Florian Westphal. 2) Split inexact policies into four different search list classes and use the rbtree infrastructure to store/lookup the policies. This is to improve the policy lookup performance after the flowcache removal. Patches from Florian Westphal. 3) Various coding style fixes, from Colin Ian King. 4) Fix policy lookup logic after adding the inexact policy search tree infrastructure. From Florian Westphal. 5) Remove a useless remove BUG_ON from xfrm6_dst_ifdown. From Li RongQing. 6) Use the correct policy direction for lookups on hash rebuilding. From Florian Westphal. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-12-18	mac80211: propagate the support for TWT to the driver	Emmanuel Grumbach	1	-0/+3
	TWT is a feature that was added in 11ah and enhanced in 11ax. There are two bits that need to be set if we want to use the feature in 11ax: one in the HE Capability IE and one in the Extended Capability IE. This is because of backward compatibility between 11ah and 11ax. In order to simplify the flow for the low level driver in managed mode, aggregate the two bits and add a boolean that tells whether TWT is supported or not, but only if 11ax is supported. Signed-off-by: Emmanuel Grumbach <[email protected]> Signed-off-by: Luca Coelho <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2018-12-18	mac80211: document RCU requirements for ieee80211_tx_dequeue()	Johannes Berg	1	-0/+8
	In the iwlwifi conversion, we sometimes call this from outside of the wake_tx_queue() method, and in those cases must be in an RCU critical section. Document this requirement. Signed-off-by: Johannes Berg <[email protected]> Signed-off-by: Luca Coelho <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2018-12-18	cfg80211: clarify LCI/civic location documentation	Johannes Berg	1	-2/+4
	The older code and current userspace assumed that this data is the content of the Measurement Report element, starting with the Measurement Token. Clarify this in the documentation. Signed-off-by: Johannes Berg <[email protected]>
2018-12-18	wireless: FTM: fix kernel-doc "cannot understand" warnings	Randy Dunlap	2	-2/+2
	Fix kernel-doc warnings in FTM due to missing "struct" keyword. Fixes 109 warnings from <net/cfg80211.h>: ../include/net/cfg80211.h:2838: warning: cannot understand function prototype: 'struct cfg80211_ftm_responder_stats ' and fixes 88 warnings from <net/mac80211.h>: ../include/net/mac80211.h:477: warning: cannot understand function prototype: 'struct ieee80211_ftm_responder_params ' Fixes: 81e54d08d9d8 ("cfg80211: support FTM responder configuration/statistics") Fixes: bc847970f432 ("mac80211: support FTM responder configuration/statistics") Signed-off-by: Randy Dunlap <[email protected]> Cc: Pradeep Kumar Chitrapu <[email protected]> Cc: Johannes Berg <[email protected]> Cc: David Spinadel <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2018-12-17	net: add missing SOF_TIMESTAMPING_OPT_ID support	Willem de Bruijn	1	-4/+21
	SOF_TIMESTAMPING_OPT_ID is supported on TCP, UDP and RAW sockets. But it was missing on RAW with IPPROTO_IP, PF_PACKET and CAN. Add skb_setup_tx_timestamp that configures both tx_flags and tskey for these paths that do not need corking or use bytestream keys. Fixes: 09c2d251b707 ("net-timestamp: add key to disambiguate concurrent datagrams") Signed-off-by: Willem de Bruijn <[email protected]> Acked-by: Soheil Hassas Yeganeh <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-17	netfilter: nat: remove nf_nat_l4proto struct	Florian Westphal	2	-33/+0
	This removes the (now empty) nf_nat_l4proto struct, all its instances and all the no longer needed runtime (un)register functionality. nf_nat_need_gre() can be axed as well: the module that calls it (to load the no-longer-existing nat_gre module) also calls other nat core functions. GRE nat is now always available if kernel is built with it. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: remove l4proto->manip_pkt	Florian Westphal	1	-8/+7
	This removes the last l4proto indirection, the two callers, the l3proto packet mangling helpers for ipv4 and ipv6, now call the nf_nat_l4proto_manip_pkt() helper. nf_nat_proto_{dccp,tcp,sctp,gre,icmp,icmpv6} are left behind, even though they contain no functionality anymore to not clutter this patch. Next patch will remove the empty files and the nf_nat_l4proto struct. nf_nat_proto_udp.c is renamed to nf_nat_proto.c, as it now contains the other nat manip functionality as well, not just udp and udplite. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: remove l4proto->nlattr_to_range	Florian Westphal	1	-6/+0
	all protocols did set this to nf_nat_l4proto_nlattr_to_range, so just call it directly. The important difference is that we'll now also call it for protocols that we don't support (i.e., nf_nat_proto_unknown did not provide .nlattr_to_range). However, there should be no harm, even icmp provided this callback. If we don't implement a specific l4nat for this, nothing would make use of this information, so adding a big switch/case construct listing all supported l4protocols seems a bit pointless. This change leaves a single function pointer in the l4proto struct. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: remove l4proto->in_range	Florian Westphal	1	-11/+0
	With exception of icmp, all of the l4 nat protocols set this to nf_nat_l4proto_in_range. Get rid of this and just check the l4proto in the caller. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: fold in_range indirection into caller	Florian Westphal	1	-3/+0
	No need for indirections here, we only support ipv4 and ipv6 and the called functions are very small. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: remove l4proto->unique_tuple	Florian Westphal	1	-11/+0
	fold remaining users (icmp, icmpv6, gre) into nf_nat_l4proto_unique_tuple. The static-save of old incarnation of resolved key in gre and icmp is removed as well, just use the prandom based offset like the others. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: nat: un-export nf_nat_l4proto_unique_tuple	Florian Westphal	1	-6/+0
	almost all l4proto->unique_tuple implementations just call this helper, so make ->unique_tuple() optional and call its helper directly if the l4proto doesn't override it. This is an intermediate step to get rid of ->unique_tuple completely. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-17	netfilter: remove NF_NAT_RANGE_PROTO_RANDOM support	Florian Westphal	1	-2/+0
	Historically this was net_random() based, and was then converted to a hash based algorithm (private boot seed + hash of endpoint addresses) due to concerns of leaking net_random() bits. RANDOM_FULLY mode was added later to avoid problems with hash based mode (see commit 34ce324019e76, "netfilter: nf_nat: add full port randomization support" for details). Just make prandom_u32() the default search starting point and get rid of ->secure_port() altogether. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2018-12-16	net: dsa: ksz: Rename NET_DSA_TAG_KSZ to _KSZ9477	Tristram Ha	1	-1/+1
	Rename the tag Kconfig option and related macros in preparation for addition of new KSZ family switches with different tag formats. Signed-off-by: Tristram Ha <[email protected]> Signed-off-by: Marek Vasut <[email protected]> Cc: Vivien Didelot <[email protected]> Cc: Woojung Huh <[email protected]> Cc: David S. Miller <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Reviewed-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-16	neighbor: Add protocol attribute	David Ahern	1	-0/+2
	Similar to routes and rules, add protocol attribute to neighbor entries for easier tracking of how each was created. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-15	net: use indirect call wrappers at GRO transport layer	Paolo Abeni	1	-0/+7
	This avoids an indirect call in the receive path for TCP and UDP packets. TCP takes precedence on UDP, so that we have a single additional conditional in the common case. When IPV6 is build as module, all gro symbols except UDPv6 are builtin, while the latter belong to the ipv6 module, so we need some special care. v1 -> v2: - adapted to INDIRECT_CALL_ changes v2 -> v3: - fix build issue with CONFIG_IPV6=m Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-15	net: use indirect call wrappers at GRO network layer	Paolo Abeni	1	-0/+2
	This avoids an indirect calls for L3 GRO receive path, both for ipv4 and ipv6, if the latter is not compiled as a module. Note that when IPv6 is compiled as builtin, it will be checked first, so we have a single additional compare for the more common path. v1 -> v2: - adapted to INDIRECT_CALL_ changes Signed-off-by: Paolo Abeni <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-15	neighbor: Improve neighbour struct layout	David Ahern	1	-2/+2
	Move arp_queue_len_bytes ahead of arp_queue to remove two 4-byte holes. Ensure ha element is always 8-byte aligned. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-14	neighbor: Move neigh_update_ext_learned to core file	David Ahern	1	-18/+0
	neigh_update_ext_learned has one caller in neighbour.c so does not need to be defined in the header. Move it and in the process remove the intialization of ndm_flags and just set it based on the flags check. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-14	net_sched: fold tcf_block_cb_call() into tc_setup_cb_call()	Cong Wang	1	-2/+2
	After commit 69bd48404f25 ("net/sched: Remove egdev mechanism"), tc_setup_cb_call() is nearly identical to tcf_block_cb_call(), so we can just fold tcf_block_cb_call() into tc_setup_cb_call() and remove its unused parameter 'exts'. Fixes: 69bd48404f25 ("net/sched: Remove egdev mechanism") Cc: Oz Shlomo <[email protected]> Cc: Jiri Pirko <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Oz Shlomo <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-14	net/tls: sleeping function from invalid context	Atul Gupta	1	-0/+6
	HW unhash within mutex for registered tls devices cause sleep when called from tcp_set_state for TCP_CLOSE. Release lock and re-acquire after function call with ref count incr/dec. defined kref and fp release for tls_device to ensure device is not released outside lock. BUG: sleeping function called from invalid context at kernel/locking/mutex.c:748 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/7 INFO: lockdep is turned off. CPU: 7 PID: 0 Comm: swapper/7 Tainted: G W O Call Trace: <IRQ> dump_stack+0x5e/0x8b ___might_sleep+0x222/0x260 __mutex_lock+0x5c/0xa50 ? vprintk_emit+0x1f3/0x440 ? kmem_cache_free+0x22d/0x2a0 ? tls_hw_unhash+0x2f/0x80 ? printk+0x52/0x6e ? tls_hw_unhash+0x2f/0x80 tls_hw_unhash+0x2f/0x80 tcp_set_state+0x5f/0x180 tcp_done+0x2e/0xe0 tcp_rcv_state_process+0x92c/0xdd3 ? lock_acquire+0xf5/0x1f0 ? tcp_v4_rcv+0xa7c/0xbe0 ? tcp_v4_do_rcv+0x70/0x1e0 Signed-off-by: Atul Gupta <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-12	net: switchdev: Add extack to switchdev_handle_port_obj_add() callback	Petr Machata	1	-2/+4
	Drivers use switchdev_handle_port_obj_add() to handle recursive descent through lower devices. Change this function prototype to take add_cb that itself takes an extack argument. Decode extack from switchdev_notifier_port_obj_info and pass it to add_cb. Update mlxsw and ocelot drivers which use this helper. Signed-off-by: Petr Machata <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Ivan Vecera <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-12	net: switchdev: Add extack to struct switchdev_notifier_info	Petr Machata	1	-2/+11
	In order to pass extack to the drivers that need it, add an extack field to struct switchdev_notifier_info, and an extack argument to the function call_switchdev_blocking_notifiers(). Also add a helper function switchdev_notifier_info_to_extack(). Signed-off-by: Petr Machata <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Ivan Vecera <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-12	net: switchdev: Add extack argument to switchdev_port_obj_add()	Petr Machata	1	-2/+4
	After the previous patch, bridge driver has extack argument available to pass to switchdev. Therefore extend switchdev_port_obj_add() with this argument, updating all callers, and passing the argument through to switchdev_port_obj_notify(). Signed-off-by: Petr Machata <[email protected]> Acked-by: Jiri Pirko <[email protected]> Acked-by: Ivan Vecera <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-10	net/sched: Remove egdev mechanism	Oz Shlomo	1	-30/+0
	The egdev mechanism was replaced by the TC indirect block notifications platform. Signed-off-by: Oz Shlomo <[email protected]> Reviewed-by: Eli Britstein <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Cc: John Hurley <[email protected]> Cc: Jakub Kicinski <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-12-10	net: Add netif_is_gretap()/netif_is_ip6gretap()	Oz Shlomo	1	-2/+11
	Changed the is_gretap_dev and is_ip6gretap_dev logic from structure comparison to string comparison of the rtnl_link_ops kind field. This approach aligns with the current identification methods and function names of vxlan and geneve network devices. Convert mlxsw to use these helpers and use them in downstream mlx5 patch. Signed-off-by: Oz Shlomo <[email protected]> Reviewed-by: Eli Britstein <[email protected]> Reviewed-by: Ido Schimmel <[email protected]> Signed-off-by: Saeed Mahameed <[email protected]>
2018-12-10	xfrm: clean an indentation issue, remove a space	Colin Ian King	1	-1/+1
	Trivial fix to clean up indentation issue, remove an extraneous space. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Steffen Klassert <[email protected]>
2018-12-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	David S. Miller	3	-5/+30
	Several conflicts, seemingly all over the place. I used Stephen Rothwell's sample resolutions for many of these, if not just to double check my own work, so definitely the credit largely goes to him. The NFP conflict consisted of a bug fix (moving operations past the rhashtable operation) while chaning the initial argument in the function call in the moved code. The net/dsa/master.c conflict had to do with a bug fix intermixing of making dsa_master_set_mtu() static with the fixing of the tagging attribute location. cls_flower had a conflict because the dup reject fix from Or overlapped with the addition of port range classifiction. __set_phy_supported()'s conflict was relatively easy to resolve because Andrew fixed it in both trees, so it was just a matter of taking the net-next copy. Or at least I think it was :-) Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup() intermixed with changes on how the sdif and caller_net are calculated in these code paths in net-next. The remaining BPF conflicts were largely about the addition of the __bpf_md_ptr stuff in 'net' overlapping with adjustments and additions to the relevant data structure where the MD pointer macros are used. Signed-off-by: David S. Miller <[email protected]>
2018-12-07	neighbour: Avoid writing before skb->head in neigh_hh_output()	Stefano Brivio	1	-5/+23
	While skb_push() makes the kernel panic if the skb headroom is less than the unaligned hardware header size, it will proceed normally in case we copy more than that because of alignment, and we'll silently corrupt adjacent slabs. In the case fixed by the previous patch, "ipv6: Check available headroom in ip6_xmit() even without options", we end up in neigh_hh_output() with 14 bytes headroom, 14 bytes hardware header and write 16 bytes, starting 2 bytes before the allocated buffer. Always check we're not writing before skb->head and, if the headroom is not enough, warn and drop the packet. v2: - instead of panicking with BUG_ON(), WARN_ON_ONCE() and drop the packet (Eric Dumazet) - if we avoid the panic, though, we need to explicitly check the headroom before the memcpy(), otherwise we'll have corrupted slabs on a running kernel, after we warn - use __skb_push() instead of skb_push(), as the headroom check is already implemented here explicitly (Eric Dumazet) Signed-off-by: Stefano Brivio <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-07	neighbor: Improve garbage collection	David Ahern	1	-0/+3
	The existing garbage collection algorithm has a number of problems: 1. The gc algorithm will not evict PERMANENT entries as those entries are managed by userspace, yet the existing algorithm walks the entire hash table which means it always considers PERMANENT entries when looking for entries to evict. In some use cases (e.g., EVPN) there can be tens of thousands of PERMANENT entries leading to wasted CPU cycles when gc kicks in. As an example, with 32k permanent entries, neigh_alloc has been observed taking more than 4 msec per invocation. 2. Currently, when the number of neighbor entries hits gc_thresh2 and the last flush for the table was more than 5 seconds ago gc kicks in walks the entire hash table evicting all entries not in PERMANENT or REACHABLE state and not marked as externally learned. There is no discriminator on when the neigh entry was created or if it just moved from REACHABLE to another NUD_VALID state (e.g., NUD_STALE). It is possible for entries to be created or for established neighbor entries to be moved to STALE (e.g., an external node sends an ARP request) right before the 5 second window lapses: -----\|---------x\|----------\|----- t-5 t t+5 If that happens those entries are evicted during gc causing unnecessary thrashing on neighbor entries and userspace caches trying to track them. Further, this contradicts the description of gc_thresh2 which says "Entries older than 5 seconds will be cleared". One workaround is to make gc_thresh2 == gc_thresh3 but that negates the whole point of having separate thresholds. 3. Clearing all neigh non-PERMANENT/REACHABLE/externally learned entries when gc_thresh2 is exceeded is over kill and contributes to trashing especially during startup. This patch addresses these problems as follows: 1. Use of a separate list_head to track entries that can be garbage collected along with a separate counter. PERMANENT entries are not added to this list. The gc_thresh parameters are only compared to the new counter, not the total entries in the table. The forced_gc function is updated to only walk this new gc_list looking for entries to evict. 2. Entries are added to the list head at the tail and removed from the front. 3. Entries are only evicted if they were last updated more than 5 seconds ago, adhering to the original intent of gc_thresh2. 4. Forced gc is stopped once the number of gc_entries drops below gc_thresh2. 5. Since gc checks do not apply to PERMANENT entries, gc levels are skipped when allocating a new neighbor for a PERMANENT entry. By extension this means there are no explicit limits on the number of PERMANENT entries that can be created, but this is no different than FIB entries or FDB entries. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-07	vxlan: Add vxlan_fdb_clear_offload()	Petr Machata	1	-0/+6
	When a driver unoffloads all FDB entries en bloc, it's inefficient to send the switchdev notification one by one. Add a helper that walks the FDB table, unsetting the offload flag on RDST with a given VNI. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-07	vxlan: Add vxlan_fdb_replay()	Petr Machata	1	-0/+9
	When a VXLAN device becomes relevant to a driver (such as when it is attached to an offloaded bridge), the driver will generally need to walk the existing FDB entries and offload them. Add a function vxlan_fdb_replay() to call a given notifier block for each FDB entry with a given VNI. Signed-off-by: Petr Machata <[email protected]> Signed-off-by: Ido Schimmel <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-06	net: dsa: Add overhead to tag protocol ops.	Andrew Lunn	1	-0/+1
	Each DSA tag protocol needs to add additional headers to the Ethernet frame in order to direct it towards a specific switch egress port. It must also remove the head from a frame received from a switch. Indicate the maximum size of these headers in the tag protocol ops structure, so the core can take these overheads into account. Signed-off-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-12-05	sctp: frag_point sanity check	Jakub Audykowicz	1	-0/+5
	If for some reason an association's fragmentation point is zero, sctp_datamsg_from_user will try to endlessly try to divide a message into zero-sized chunks. This eventually causes kernel panic due to running out of memory. Although this situation is quite unlikely, it has occurred before as reported. I propose to add this simple last-ditch sanity check due to the severity of the potential consequences. Signed-off-by: Jakub Audykowicz <[email protected]> Acked-by: Neil Horman <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>