path: root/net
Age  Commit message  Author  Files  Lines
2016-05-26libceph: schedule tick from ceph_osdc_init()Ilya Dryomov1-28/+9
Both homeless OSD sessions and watch/notify v2, introduced in later commits, require periodic ticks which don't depend on ->num_requests. Schedule the initial tick from ceph_osdc_init() and reschedule from handle_timeout() unconditionally. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: move schedule_delayed_work() in ceph_osdc_init()Ilya Dryomov1-3/+3
ceph_osdc_stop() isn't called if ceph_osdc_init() fails, so we end up with handle_osds_timeout() running on invalid memory if any one of the allocations fails. Call schedule_delayed_work() after everything is setup, just before returning. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: redo callbacks and factor out MOSDOpReply decodingIlya Dryomov1-153/+209
If you specify ACK | ONDISK and set ->r_unsafe_callback, both ->r_callback and ->r_unsafe_callback(true) are called on ack. This is very confusing. Redo this so that only one of them is called:
    ->r_unsafe_callback(true), on ack
    ->r_unsafe_callback(false), on commit
or
    ->r_callback, on ack|commit
Decode everything in decode_MOSDOpReply() to reduce clutter. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: drop msg argument from ceph_osdc_callback_tIlya Dryomov1-2/+2
finish_read(), its only user, uses it to get to hdr.data_len, which is what ->r_result is set to on success. This gains us the ability to safely call callbacks from contexts other than reply, e.g. map check. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: switch to calc_target(), part 2Ilya Dryomov2-200/+216
The crux of this is getting rid of ceph_osdc_build_request(), so that MOSDOp can be encoded not before but after calc_target() calculates the actual target. Encoding now happens within ceph_osdc_start_request(). Also nuked is the accompanying bunch of pointers into the encoded buffer that was used to update fields on each send - instead, the entire front is re-encoded. If we want to support target->name_len != base->name_len in the future, there is no other way, because oid is surrounded by other fields in the encoded buffer. Encoding OSD ops and adding data items to the request message were mixed together in osd_req_encode_op(). While we want to re-encode OSD ops, we don't want to add duplicate data items to the message when resending, so all calls to ceph_osdc_msg_data_add() are factored out into a new setup_request_data(). Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: switch to calc_target(), part 1Ilya Dryomov2-97/+24
Replace __calc_request_pg() and most of __map_request() with calc_target() and start using req->r_t. ceph_osdc_build_request() however still encodes base_oid, because it's called before calc_target() is and target_oid is empty at that point in time; a printf in osdc_show() also shows base_oid. This is fixed in "libceph: switch to calc_target(), part 2". Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: introduce ceph_osd_request_target, calc_target()Ilya Dryomov2-2/+276
Introduce ceph_osd_request_target, containing all mapping-related fields of ceph_osd_request and calc_target() for calculating mappings and populating it. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: pi->min_size, pi->last_force_request_resendIlya Dryomov2-5/+53
Add and decode pi->min_size and pi->last_force_request_resend. These are going to be used by calc_target(). Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: make pgid_cmp() globalIlya Dryomov1-11/+12
calc_target() code is going to need to know how to compare PGs. Take lhs and rhs pgid by const * while at it. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: rename ceph_calc_pg_primary()Ilya Dryomov1-4/+5
Rename ceph_calc_pg_primary() to ceph_pg_to_acting_primary() to emphasise that it returns acting primary. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: ceph_osds, ceph_pg_to_up_acting_osds()Ilya Dryomov2-143/+197
Knowing just the acting set isn't enough; we need to be able to record the up set as well to detect interval changes. This means returning (up[], up_len, up_primary, acting[], acting_len, acting_primary) and passing it around. Introduce and switch to ceph_osds to help with that. Rename ceph_calc_pg_acting() to ceph_pg_to_up_acting_osds() and return both up and acting sets from it. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: rename ceph_oloc_oid_to_pg()Ilya Dryomov2-17/+18
Rename ceph_oloc_oid_to_pg() to ceph_object_locator_to_pg(). Emphasise that returned is raw PG and return -ENOENT instead of -EIO if the pool doesn't exist. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: DEFINE_RB_FUNCS macroIlya Dryomov2-131/+18
Given struct foo { u64 id; struct rb_node bar_node; }; generate insert_bar(), erase_bar() and lookup_bar() functions with DEFINE_RB_FUNCS(bar, struct foo, id, bar_node) The key is assumed to be an integer (u64, int, etc), compared with < and >. nodefld has to be initialized with RB_CLEAR_NODE(). Start using it for MDS, MON and OSD requests and OSD sessions. Signed-off-by: Ilya Dryomov <[email protected]>
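The generator pattern described above can be sketched in userspace C. This is only an illustration under stated assumptions: it swaps the kernel's red-black tree for a plain unbalanced binary search tree, and struct bst_node, node_entry() and DEFINE_BST_FUNCS are hypothetical names, not the libceph macro. Only insert and lookup are generated here; erase is left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Plain BST node; a stand-in for the kernel's struct rb_node (no rebalancing). */
struct bst_node {
    struct bst_node *left, *right;
};

/* Recover the containing structure from a pointer to its embedded node. */
#define node_entry(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generate insert_<name>() and lookup_<name>() for `type`, keyed on `keyfld`. */
#define DEFINE_BST_FUNCS(name, type, keyfld, nodefld)                      \
static void insert_##name(struct bst_node **root, type *t)                 \
{                                                                          \
    struct bst_node **n = root;                                            \
    while (*n) {                                                           \
        type *cur = node_entry(*n, type, nodefld);                         \
        assert(t->keyfld != cur->keyfld); /* duplicate keys are a bug */   \
        n = t->keyfld < cur->keyfld ? &(*n)->left : &(*n)->right;          \
    }                                                                      \
    t->nodefld.left = t->nodefld.right = NULL;                             \
    *n = &t->nodefld;                                                      \
}                                                                          \
static type *lookup_##name(struct bst_node *root, uint64_t key)            \
{                                                                          \
    while (root) {                                                         \
        type *cur = node_entry(root, type, nodefld);                       \
        if (key < cur->keyfld)                                             \
            root = root->left;                                             \
        else if (key > cur->keyfld)                                        \
            root = root->right;                                            \
        else                                                               \
            return cur;                                                    \
    }                                                                      \
    return NULL;                                                           \
}

struct foo {
    uint64_t id;
    struct bst_node bar_node;
};

/* Expands to insert_bar() and lookup_bar() operating on struct foo. */
DEFINE_BST_FUNCS(bar, struct foo, id, bar_node)
```

As in the commit message, the key is compared with < and >, so any integer key type works as long as it converts cleanly to the lookup argument.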
2016-05-26libceph: open-code remove_{all,old}_osds()Ilya Dryomov1-30/+21
They are called only once, from ceph_osdc_stop() and handle_osds_timeout() respectively. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: nuke unused fields and functionsIlya Dryomov3-17/+2
Either unused or useless: osdmap->mkfs_epoch osd->o_marked_for_keepalive monc->num_generic_requests osdc->map_waiters osdc->last_requested_map osdc->timeout_tid osd_req_op_cls_response_data() osdmap_apply_incremental() @msgr arg Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: variable-sized ceph_object_idIlya Dryomov3-8/+103
Currently ceph_object_id can hold object names of up to 100 (CEPH_MAX_OID_NAME_LEN) characters. This is enough for all use cases, except one - long rbd image names: - a format 1 header is named "<imgname>.rbd" - an object that points to a format 2 header is named "rbd_id.<imgname>" We operate on these potentially long-named objects during rbd map, and, for format 1 images, during header refresh. (A format 2 header name is a small system-generated string.) Lift this 100 character limit by making ceph_object_id be able to point to an externally-allocated string. Apart from being able to work with almost arbitrarily-long named objects, this allows us to reduce the size of ceph_object_id from >100 bytes to 64 bytes. Signed-off-by: Ilya Dryomov <[email protected]>
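The idea of a small inline buffer that falls back to an externally allocated string can be sketched as follows. This is a userspace illustration, not the libceph code: struct sketch_oid, its helpers and the INLINE_NAME_LEN value are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_NAME_LEN 51  /* hypothetical inline capacity, not the kernel's value */

struct sketch_oid {
    char *name;                             /* points at inline_name or a heap copy */
    char inline_name[INLINE_NAME_LEN + 1];
    size_t name_len;
};

/* Start out empty, with the name pointing at the inline buffer. */
static void sketch_oid_init(struct sketch_oid *oid)
{
    oid->name = oid->inline_name;
    oid->inline_name[0] = '\0';
    oid->name_len = 0;
}

/* Release a heap-allocated name, if any, and reset to the empty inline state. */
static void sketch_oid_destroy(struct sketch_oid *oid)
{
    if (oid->name != oid->inline_name)
        free(oid->name);
    sketch_oid_init(oid);
}

/* Store a copy of `name`; returns 0 on success, -1 if allocation fails. */
static int sketch_oid_set(struct sketch_oid *oid, const char *name)
{
    size_t len;
    char *buf = oid->inline_name;

    assert(oid != NULL && name != NULL);
    len = strlen(name);
    if (len > INLINE_NAME_LEN) {
        /* Too long for the inline buffer: allocate externally. */
        buf = malloc(len + 1);
        if (!buf)
            return -1;
    }
    sketch_oid_destroy(oid);
    memcpy(buf, name, len + 1);
    oid->name = buf;
    oid->name_len = len;
    return 0;
}
```

Short names never touch the allocator, while long names such as "rbd_id.<imgname>" only cost one allocation; the structure itself stays a fixed, small size.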
2016-05-26libceph: change how osd_op_reply message size is calculatedIlya Dryomov1-10/+4
For a message pool message, preallocate a page, just like we do for osd_op. For a normal message, take ceph_object_id into account and don't bother subtracting CEPH_OSD_SLAB_OPS ceph_osd_ops. Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: move message allocation out of ceph_osdc_alloc_request()Ilya Dryomov1-38/+50
The size of ->r_request and ->r_reply messages depends on the size of the object name (ceph_object_id), while the size of ceph_osd_request is fixed. Move message allocation into a separate function that would have to be called after ceph_object_id and ceph_object_locator (which is also going to become variable in size with RADOS namespaces) have been filled in: req = ceph_osdc_alloc_request(...); <fill in req->r_base_oid> <fill in req->r_base_oloc> ceph_osdc_alloc_messages(req); Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: grab snapc in ceph_osdc_alloc_request()Ilya Dryomov1-2/+4
ceph_osdc_build_request() is going away. Grab snapc and initialize ->r_snapid in ceph_osdc_alloc_request(). Signed-off-by: Ilya Dryomov <[email protected]>
2016-05-26libceph: make ceph_osdc_put_request() accept NULLIlya Dryomov1-3/+5
Signed-off-by: Ilya Dryomov <[email protected]>
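A put function that tolerates NULL lets callers skip a check on their error paths. The following is a minimal userspace sketch of that convention; struct sketch_req and both helpers are hypothetical and use a plain (non-atomic) counter rather than the kernel's kref.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_req {
    int refcount;
};

/* Allocate a request holding one reference; returns NULL on failure. */
static struct sketch_req *sketch_req_alloc(void)
{
    struct sketch_req *req = calloc(1, sizeof(*req));

    if (req)
        req->refcount = 1;
    return req;
}

/* Drop one reference; a NULL request is silently ignored. Returns 1 if freed. */
static int sketch_req_put(struct sketch_req *req)
{
    if (!req)
        return 0;
    assert(req->refcount > 0);
    if (--req->refcount > 0)
        return 0;
    free(req);
    return 1;
}
```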
2016-05-25net: hwbm: Fix unbalanced spinlock in error caseGregory CLEMENT1-0/+3
When hwbm_pool_add exited in error the spinlock was not released. This patch fixes this issue. Fixes: 8cb2d8bf57e6 ("net: add a hardware buffer management helper API") Reported-by: Jean-Jacques Hiblot <[email protected]> Cc: <[email protected]> Signed-off-by: Gregory CLEMENT <[email protected]> Signed-off-by: David S. Miller <[email protected]>
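The usual way to keep lock and unlock balanced is to route every exit, including error exits, through a single unlock label. Below is a userspace sketch of that pattern using a pthread mutex in place of a spinlock; struct sketch_pool and sketch_pool_add() are hypothetical and do not mirror the hwbm code.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct sketch_pool {
    pthread_mutex_t lock;
    int buf_num;    /* buffers currently in the pool */
    int size;       /* capacity of the pool */
};

/* Add `count` buffers; every exit path, including errors, releases the lock. */
static int sketch_pool_add(struct sketch_pool *pool, int count)
{
    int err = 0;

    assert(pool != NULL);
    pthread_mutex_lock(&pool->lock);
    if (count < 0 || count > pool->size - pool->buf_num) {
        err = -1;
        goto out_unlock;    /* the error path must not return with the lock held */
    }
    pool->buf_num += count;
out_unlock:
    pthread_mutex_unlock(&pool->lock);
    return err;
}
```

If the error branch returned directly instead of jumping to out_unlock, the next caller would block forever on the lock.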
2016-05-25tipc: fix potential null pointer dereferences in some compat functionsBaozeng Ding1-18/+93
Before calling the nla_parse_nested function, make sure the pointer to the attribute is not null. This patch fixes several potential null pointer dereference vulnerabilities in the tipc netlink functions. Signed-off-by: Baozeng Ding <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-25netfilter: nf_queue: Make the queue_handler pernetEric W. Biederman2-15/+20
Florian Weber reported:
> Under full load (unshare() in loop -> OOM conditions) we can
> get kernel panic:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
> IP: [<ffffffff81476c85>] nfqnl_nf_hook_drop+0x35/0x70
> [..]
> task: ffff88012dfa3840 ti: ffff88012dffc000 task.ti: ffff88012dffc000
> RIP: 0010:[<ffffffff81476c85>] [<ffffffff81476c85>] nfqnl_nf_hook_drop+0x35/0x70
> RSP: 0000:ffff88012dfffd80 EFLAGS: 00010206
> RAX: 0000000000000008 RBX: ffffffff81add0c0 RCX: ffff88013fd80000
> [..]
> Call Trace:
> [<ffffffff81474d98>] nf_queue_nf_hook_drop+0x18/0x20
> [<ffffffff814738eb>] nf_unregister_net_hook+0xdb/0x150
> [<ffffffff8147398f>] netfilter_net_exit+0x2f/0x60
> [<ffffffff8141b088>] ops_exit_list.isra.4+0x38/0x60
> [<ffffffff8141b652>] setup_net+0xc2/0x120
> [<ffffffff8141bd09>] copy_net_ns+0x79/0x120
> [<ffffffff8106965b>] create_new_namespaces+0x11b/0x1e0
> [<ffffffff810698a7>] unshare_nsproxy_namespaces+0x57/0xa0
> [<ffffffff8104baa2>] SyS_unshare+0x1b2/0x340
> [<ffffffff81608276>] entry_SYSCALL_64_fastpath+0x1e/0xa8
> Code: 65 00 48 89 e5 41 56 41 55 41 54 53 83 e8 01 48 8b 97 70 12 00 00 48 98 49 89 f4 4c 8b 74 c2 18 4d 8d 6e 08 49 81 c6 88 00 00 00 <49> 8b 5d 00 48 85 db 74 1a 48 89 df 4c 89 e2 48 c7 c6 90 68 47
>
The simple fix for this requires a new pernet variable for struct nf_queue that indicates when it is safe to use the dynamically allocated nf_queue state. As we need a variable anyway make nf_register_queue_handler and nf_unregister_queue_handler pernet. This allows the existing logic of when it is safe to use the state from the nfnetlink_queue module to be reused with no changes except for making it per net. The synchronize_rcu from nf_unregister_queue_handler is moved to a new function nfnl_queue_net_exit_batch so that the worst case of having a synchronize_rcu in the pernet exit path is not experienced in batch mode. Reported-by: Florian Westphal <[email protected]> Signed-off-by: "Eric W. Biederman" <[email protected]> Acked-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-05-25netfilter: conntrack: remove leftover binary sysctl defineFlorian Westphal1-2/+0
Users got removed in f8572d8f2a2ba ("sysctl net: Remove unused binary sysctl code"). Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-05-24net sched actions: policer missing timestamp processingJamal Hadi Salim1-0/+11
Policer was not dumping or updating timestamps Signed-off-by: Jamal Hadi Salim <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-24net_sched: avoid too many hrtimer_start() callsEric Dumazet2-10/+7
I found a serious performance bug in packet schedulers using hrtimers. sch_htb and sch_fq are definitely impacted by this problem. We constantly rearm high resolution timers if some packets are throttled in one (or more) class, and other packets are flying through qdisc on another (non-throttled) class. hrtimer_start() does not have the mod_timer() trick of doing nothing if the expires value does not change: if (timer_pending(timer) && timer->expires == expires) return 1; This issue is particularly visible when multiple cpus can queue/dequeue packets on the same qdisc, as hrtimer code has to lock a remote base. I used the following fix: 1) Change htb to use qdisc_watchdog_schedule_ns() instead of open-coding it. 2) Cache watchdog prior expiration. hrtimer might provide this, but I prefer to not rely on some hrtimer internal. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
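Step 2 of the fix, caching the previously programmed expiration, can be sketched in userspace as follows. The names are hypothetical: struct sketch_watchdog and sketch_timer_start() only model the idea, and the counter stands in for the cost of a real hrtimer_start() call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_watchdog {
    uint64_t last_expires;      /* expiry most recently programmed, 0 if none */
    unsigned int rearm_count;   /* how often the expensive timer start ran */
};

/* Stand-in for hrtimer_start(); it only counts invocations. */
static void sketch_timer_start(struct sketch_watchdog *wd, uint64_t expires)
{
    (void)expires;
    wd->rearm_count++;
}

/* Program the timer unless it is already set to the same expiry. */
static void sketch_watchdog_schedule_ns(struct sketch_watchdog *wd, uint64_t expires)
{
    assert(wd != NULL);
    if (wd->last_expires == expires)
        return;                 /* same deadline: skip the costly rearm */
    wd->last_expires = expires;
    sketch_timer_start(wd, expires);
}
```

Repeated requests for the same deadline, which is the common case while one class stays throttled, then cost a single comparison instead of a timer rearm.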
2016-05-24Merge tag 'nfsd-4.7' of git://linux-nfs.org/~bfields/linuxLinus Torvalds6-62/+81
Pull nfsd updates from Bruce Fields: "A very quiet cycle for nfsd, mainly just an RDMA update from Chuck Lever" * tag 'nfsd-4.7' of git://linux-nfs.org/~bfields/linux: sunrpc: fix stripping of padded MIC tokens svcrpc: autoload rdma module svcrdma: Generalize svc_rdma_xdr_decode_req() svcrdma: Eliminate code duplication in svc_rdma_recvfrom() svcrdma: Drain QP before freeing svcrdma_xprt svcrdma: Post Receives only for forward channel requests svcrdma: Remove superfluous line from rdma_read_chunks() svcrdma: svc_rdma_put_context() is invoked twice in Send error path svcrdma: Do not add XDR padding to xdr_buf page vector svcrdma: Support IPv6 with NFS/RDMA nfsd: handle seqid wraparound in nfsd4_preprocess_layout_stateid Remove unnecessary allocation
2016-05-24ip6_gre: Set flowi6_proto as IPPROTO_GRE in xmit path.Haishuang Yan1-0/+1
In gre6 xmit path, we are sending a GRE packet, so set fl6 proto to IPPROTO_GRE properly. Signed-off-by: Haishuang Yan <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-24ip6_gre: Fix MTU setting for ip6gretapHaishuang Yan1-0/+2
When creating an ip6gretap interface with an unreachable route, the MTU is about 14 bytes larger than what was needed. If the remote address is reachable: ping6 2001:0:130::1 -c 2 PING 2001:0:130::1(2001:0:130::1) 56 data bytes 64 bytes from 2001:0:130::1: icmp_seq=1 ttl=64 time=1.46 ms 64 bytes from 2001:0:130::1: icmp_seq=2 ttl=64 time=81.1 ms Signed-off-by: David S. Miller <[email protected]>
2016-05-23ipv4: Fix non-initialized TTL when CONFIG_SYSCTL=nEzequiel Garcia2-4/+8
Commit fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob") moves the default TTL assignment, and as side-effect IPv4 TTL now has a default value only if sysctl support is enabled (CONFIG_SYSCTL=y). The sysctl_ip_default_ttl is fundamental for IP to work properly, as it provides the TTL to be used as default. The default TTL may be used in ip_selected_ttl, through the following flow: ip_select_ttl ip4_dst_hoplimit net->ipv4.sysctl_ip_default_ttl This commit fixes the issue by assigning net->ipv4.sysctl_ip_default_ttl in net_init_net, called during ipv4's initialization. Without this commit, a kernel built without sysctl support will send all IP packets with zero TTL (unless a TTL is explicitly set, e.g. with setsockopt). Given a similar issue might appear on the other knobs that were namespaceified, this commit also moves them. Fixes: fa50d974d104 ("ipv4: Namespaceify ip_default_ttl sysctl knob") Signed-off-by: Ezequiel Garcia <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-23net/atm: sk_err_soft must be positiveStefan Hajnoczi2-3/+3
The sk_err and sk_err_soft fields are positive errno values and userspace applications rely on this when using getsockopt(SO_ERROR). ATM code places an -errno into sk_err_soft in sigd_send() and returns it from svc_addparty()/svc_dropparty(). Although I am not familiar with ATM code I came to this conclusion because: 1. sigd_send() msg->type cases as_okay and as_error both have: sk->sk_err = -msg->reply; while the as_addparty and as_dropparty cases have: sk->sk_err_soft = msg->reply; This is the source of the inconsistency. 2. svc_addparty() returns an -errno and assumes sk_err_soft is also an -errno: if (flags & O_NONBLOCK) { error = -EINPROGRESS; goto out; } ... error = xchg(&sk->sk_err_soft, 0); out: release_sock(sk); return error; This shows that sk_err_soft is indeed being treated as an -errno. This patch ensures that sk_err_soft is always a positive errno. Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: David S. Miller <[email protected]>
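The sign convention argued for above can be sketched as a pair of helpers: the stored field is always a positive errno, and the negation happens at the boundaries. This is a userspace illustration with hypothetical names (struct sketch_sock, sketch_record_reply(), sketch_take_error()), not the ATM code.

```c
#include <assert.h>
#include <errno.h>

struct sketch_sock {
    int sk_err_soft;    /* always stored as a positive errno, 0 if none */
};

/* Record a reply from the signalling daemon; `reply` arrives as -errno. */
static void sketch_record_reply(struct sketch_sock *sk, int reply)
{
    assert(reply <= 0);
    sk->sk_err_soft = -reply;
}

/* Fetch and clear the soft error, handing it to the caller as -errno. */
static int sketch_take_error(struct sketch_sock *sk)
{
    int err = sk->sk_err_soft;

    sk->sk_err_soft = 0;
    return -err;
}
```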
2016-05-23sunrpc: fix stripping of padded MIC tokensTomáš Trnka1-2/+2
The length of the GSS MIC token need not be a multiple of four bytes. It is then padded by XDR to a multiple of 4 B, but unwrap_integ_data() would previously only trim mic.len + 4 B. The remaining up to three bytes would then trigger a check in nfs4svc_decode_compoundargs(), leading to a "garbage args" error and mount failure: nfs4svc_decode_compoundargs: compound not properly padded! nfsd: failed to decode arguments! This would prevent older clients using the pre-RFC 4121 MIC format (37-byte MIC including a 9-byte OID) from mounting exports from v3.9+ servers using krb5i. The trimming was introduced by commit 4c190e2f913f ("sunrpc: trim off trailing checksum before returning decrypted or integrity authenticated buffer"). Fixes: 4c190e2f913f "sunrpc: trim off trailing checksum..." Signed-off-by: Tomáš Trnka <[email protected]> Cc: [email protected] Acked-by: Jeff Layton <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
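The arithmetic behind the fix can be checked with a small sketch. The helper names xdr_padded_len() and mic_trim_len() are hypothetical; they only model rounding the token up to a multiple of four and adding the 4-byte length word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round a length up to the next multiple of four, as XDR pads opaque data. */
static size_t xdr_padded_len(size_t len)
{
    assert(len <= SIZE_MAX - 3);    /* guard against wraparound */
    return (len + 3) & ~(size_t)3;
}

/* Bytes to trim for a MIC: the 4-byte length word plus the padded token. */
static size_t mic_trim_len(size_t mic_len)
{
    return 4 + xdr_padded_len(mic_len);
}
```

For the 37-byte MIC mentioned above, the padded token occupies 40 bytes, so 44 bytes must be trimmed; trimming only mic.len + 4 = 41 leaves the three stray bytes that trip the padding check.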
2016-05-23svcrpc: autoload rdma moduleJ. Bruce Fields1-4/+19
This should fix failures like: # rpc.nfsd --rdma rpc.nfsd: Unable to request RDMA services: Protocol not supported Reported-by: Steve Dickson <[email protected]> Reviewed-by: Chuck Lever <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2016-05-20Merge tag 'tty-4.7-rc1' of ↵Linus Torvalds3-27/+24
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty and serial driver updates from Greg KH: "Here's the large TTY and Serial driver update for 4.7-rc1. A few new serial drivers are added here, and Peter has fixed a bunch of long-standing bugs in the tty layer and serial drivers as normal. Full details in the shortlog. All of these have been in linux-next for a while with no reported issues" * tag 'tty-4.7-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (88 commits) MAINTAINERS: 8250: remove website reference serial: core: Fix port mutex assert if lockdep disabled serial: 8250_dw: fix wrong logic in dw8250_check_lcr() tty: vt, finish looping on duplicate tty: vt, return error when con_startup fails QE-UART: add "fsl,t1040-ucc-uart" to of_device_id serial: mctrl_gpio: Drop support for out1-gpios and out2-gpios serial: 8250dw: Add device HID for future AMD UART controller Fix OpenSSH pty regression on close serial: mctrl_gpio: add IRQ locking serial: 8250: Integrate Fintek into 8250_base serial: mps2-uart: add support for early console serial: mps2-uart: add MPS2 UART driver dt-bindings: document the MPS2 UART bindings serial: sirf: Use generic uart-has-rtscts DT property serial: sirf: Introduce helper variable struct device_node *np serial: mxs-auart: Use generic uart-has-rtscts DT property serial: imx: Use generic uart-has-rtscts DT property doc: DT: Add Generic Serial Device Tree Bindings serial: 8250: of: Make tegra_serial_handle_break() static ...
2016-05-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds33-303/+714
Pull networking fixes and more updates from David Miller: 1) Tunneling fixes from Tom Herbert and Alexander Duyck. 2) AF_UNIX updates some struct sock bit fields without the socket lock, whereas setsockopt() sets overlapping ones with locking. Separate out the synchronized vs. the AF_UNIX unsynchronized ones to avoid corruption. From Andrey Ryabinin. 3) Mount BPF filesystem with mount_nodev rather than mount_ns, from Eric Biederman. 4) A couple of kmemdup conversions, from Muhammad Falak R Wani. 5) BPF verifier fixes from Alexei Starovoitov. 6) Don't let tunneled UDP packets get stuck in socket queues, if something goes wrong during the encapsulation just drop the packet rather than signalling an error up the call stack. From Hannes Frederic Sowa. 7) SKB ref after free in batman-adv, from Florian Westphal. 8) TCP iSCSI, ocfs2, rds, and tipc have to disable BH in their TCP callbacks since the TCP stack runs pre-emptibly now. From Eric Dumazet. 9) Fix crash in fixed_phy_add, from Rabin Vincent. 10) Fix length checks in xen-netback, from Paul Durrant. 11) Fix mixup in KEY vs KEYID macsec attributes, from Sabrina Dubroca. 
12) RDS connection spamming bug fixes from Sowmini Varadhan * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (152 commits) net: suppress warnings on dev_alloc_skb uapi glibc compat: fix compilation when !__USE_MISC in glibc udp: prevent skbs lingering in tunnel socket queues bpf: teach verifier to recognize imm += ptr pattern bpf: support decreasing order in direct packet access net: usb: ch9200: use kmemdup ps3_gelic: use kmemdup net:liquidio: use kmemdup bpf: Use mount_nodev not mount_ns to mount the bpf filesystem net: cdc_ncm: update datagram size after changing mtu tuntap: correctly wake up process during uninit intel: Add support for IPv6 IP-in-IP offload ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUE RDS: TCP: Avoid rds connection churn from rogue SYNs RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcp net: sock: move ->sk_shutdown out of bitfields. ipv6: Don't reset inner headers in ip6_tnl_xmit ip4ip6: Support for GSO/GRO ip6ip6: Support for GSO/GRO ipv6: Set features for IPv6 tunnels ...
2016-05-20udp: prevent skbs lingering in tunnel socket queuesHannes Frederic Sowa2-2/+2
In case we find a socket with encapsulation enabled we should call the encap_recv function even if just a udp header without payload is available. The callbacks are responsible for correctly verifying and dropping the packets. Also, in case the header validation fails for geneve and vxlan we shouldn't put the skb back into the socket queue, no one will pick them up there. Instead we can simply discard them in the respective encap_recv functions. Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip6_gre: Do not allow segmentation offloads GRE_CSUM is enabled with FOU/GUEAlexander Duyck1-4/+8
This patch addresses the same issue we had for IPv4 where enabling GRE with an inner checksum cannot be supported with FOU/GUE due to the fact that they will jump past the GRE header as it is treated like a tunnel header. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20RDS: TCP: Avoid rds connection churn from rogue SYNsSowmini Varadhan1-4/+6
When a rogue SYN is received after the connection arbitration algorithm has converged, the incoming SYN should not needlessly quiesce the transmit path, and it should not result in needless TCP connection resets due to re-execution of the connection arbitration logic. Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20RDS: TCP: rds_tcp_accept_worker() must exit gracefully when terminating rds-tcpSowmini Varadhan1-0/+3
There are two instances where we want to terminate RDS-TCP: when exiting the netns or during module unload. In either case, the termination sequence is to stop the listen socket, mark the rtn->rds_tcp_listen_sock as null, and flush any accept workqs. Thus any workqs that get flushed at this point will encounter a null rds_tcp_listen_sock, and must exit gracefully to allow the RDS-TCP termination to complete successfully. Signed-off-by: Sowmini Varadhan <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ipv6: Don't reset inner headers in ip6_tnl_xmitTom Herbert1-5/+0
Since iptunnel_handle_offloads() is called in all paths we can probably drop the block in ip6_tnl_xmit that was checking for skb->encapsulation and resetting the inner headers. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip4ip6: Support for GSO/GROTom Herbert3-6/+44
Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip6ip6: Support for GSO/GROTom Herbert2-3/+26
Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ipv6: Set features for IPv6 tunnelsTom Herbert1-0/+9
Need to set dev features, use same values that are used in GREv6. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip6_tunnel: Add support for fou/gue encapsulationTom Herbert1-0/+72
Add netlink and setup for encapsulation Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip6_gre: Add support for fou/gue encapsulationTom Herbert1-4/+75
Add netlink and setup for encapsulation Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20fou: Add encap ops for IPv6 tunnelsTom Herbert2-0/+141
This patch adds a new fou6 module that provides encapsulation operations for IPv6. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20ip6_tun: Add infrastructure for doing encapsulationTom Herbert2-13/+86
Add encap_hlen and ip_tunnel_encap structure to ip6_tnl. Add functions for getting encap hlen, setting up encap on a tunnel, performing encapsulation operation. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20fou: Support IPv6 in fouTom Herbert1-12/+35
This patch adds receive path support for IPv6 with fou. - Add address family to fou structure for open sockets. This supports AF_INET and AF_INET6. Lookups for fou ports are performed on both the port number and family. - In fou and gue receive adjust tot_len in IPv4 header or payload_len based on address family. - Allow AF_INET6 in FOU_ATTR_AF netlink attribute. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
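The lookup change described above, matching on both port number and address family, can be sketched as a linear search over a table. struct sketch_fou and sketch_fou_lookup() are hypothetical names used for illustration; the kernel keeps its fou entries in a different structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>

struct sketch_fou {
    uint16_t port;
    sa_family_t family;     /* AF_INET or AF_INET6 */
};

/* Find the entry matching both port and address family, or NULL. */
static const struct sketch_fou *
sketch_fou_lookup(const struct sketch_fou *tbl, size_t n,
                  uint16_t port, sa_family_t family)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].port == port && tbl[i].family == family)
            return &tbl[i];
    return NULL;
}
```

Keying on the pair lets an IPv4 and an IPv6 socket share the same port number without shadowing each other.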
2016-05-20fou: Split out {fou,gue}_build_headerTom Herbert1-10/+37
Create __fou_build_header and __gue_build_header. These implement the protocol generic parts of building the fou and gue header. fou_build_header and gue_build_header implement the IPv4 specific functions and call the __*_build_header functions. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-05-20fou: Call setup_udp_tunnel_sockTom Herbert1-34/+16
Use helper function to set up UDP tunnel related information for a fou socket. Signed-off-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]>