aboutsummaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2015-12-22net/rds: Avoid calling ib_query_deviceOr Gerlitz2-41/+16
Instead, use the cached copy of the attributes present on the device. Signed-off-by: Or Gerlitz <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2015-12-206lowpan: fix debugfs interface entry nameAlexander Aring1-3/+3
This patches moves the debugfs interface related register after netdevice register. The function lowpan_dev_debugfs_init will use "dev->name" which can be before register_netdevice a format string. The function register_netdevice will evaluate the format string if necessary and replace "dev->name" to the real interface name. Reported-by: Lukasz Duda <[email protected]> Signed-off-by: Alexander Aring <[email protected]> Acked-by: Lukasz Duda <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-12-20Bluetooth: use list_for_each_entry*Geliang Tang4-49/+25
Use list_for_each_entry*() instead of list_for_each*() to simplify the code. Signed-off-by: Geliang Tang <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]>
2015-12-18openvswitch: correct encoding of set tunnel action attributesSimon Horman1-1/+4
In a set action tunnel attributes should be encoded in a nested action. I noticed this because ovs-dpctl was reporting an error when dumping flows due to the incorrect encoding of tunnel attributes in a set action. Fixes: fc4099f17240 ("openvswitch: Fix egress tunnel info.") Signed-off-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18ipip: ioctl: Remove superfluous IP-TTL handling.Pravin B Shelar1-3/+0
IP-TTL case is already handled in ip_tunnel_ioctl() API. Signed-off-by: Pravin B Shelar <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18tcp: diag: add support for request sockets to tcp_abort()Eric Dumazet1-0/+9
Adding support for SYN_RECV request sockets to tcp_abort() is quite easy after our tcp listener rewrite. Note that we also need to better handle listeners, or we might leak not yet accepted children, because of a missing inet_csk_listen_stop() call. Signed-off-by: Eric Dumazet <[email protected]> Cc: Lorenzo Colitti <[email protected]> Tested-by: Lorenzo Colitti <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18bpf: fix misleading comment in bpf_convert_filterDaniel Borkmann1-6/+0
Comment says "User BPF's register A is mapped to our BPF register 6", which is actually wrong as the mapping is on register 0. This can already be inferred from the code itself. So just remove it before someone makes assumptions based on that. Only code tells truth. ;) Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18bpf: move clearing of A/X into classic to eBPF migration prologueDaniel Borkmann1-3/+16
Back in the days where eBPF (or back then "internal BPF" ;->) was not exposed to user space, and only the classic BPF programs internally translated into eBPF programs, we missed the fact that for classic BPF A and X needed to be cleared. It was fixed back then via 83d5b7ef99c9 ("net: filter: initialize A and X registers"), and thus classic BPF specifics were added to the eBPF interpreter core to work around it. This added some confusion for JIT developers later on that take the eBPF interpreter code as an example for deriving their JIT. F.e. in f75298f5c3fe ("s390/bpf: clear correct BPF accumulator register"), at least X could leak stack memory. Furthermore, since this is only needed for classic BPF translations and not for eBPF (verifier takes care that read access to regs cannot be done uninitialized), more complexity is added to JITs as they need to determine whether they deal with migrations or native eBPF where they can just omit clearing A/X in their prologue and thus reduce image size a bit, see f.e. cde66c2d88da ("s390/bpf: Only clear A and X for converted BPF programs"). In other cases (x86, arm64), A and X is being cleared in the prologue also for eBPF case, which is unnecessary. Lets move this into the BPF migration in bpf_convert_filter() where it actually belongs as long as the number of eBPF JITs are still few. It can thus be done generically; allowing us to remove the quirk from __bpf_prog_run() and to slightly reduce JIT image size in case of eBPF, while reducing code duplication on this matter in current(/future) eBPF JITs. Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Reviewed-by: Michael Holzheu <[email protected]> Tested-by: Michael Holzheu <[email protected]> Cc: Zi Shen Lim <[email protected]> Cc: Yang Shi <[email protected]> Acked-by: Yang Shi <[email protected]> Acked-by: Zi Shen Lim <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18bpf: add bpf_skb_load_bytes helperDaniel Borkmann1-1/+34
When hacking tc programs with eBPF, one of the issues that come up from time to time is to load addresses from headers. In eBPF as in classic BPF, we have BPF_LD | BPF_ABS | BPF_{B,H,W} instructions that extract a byte, half-word or word out of the skb data though helpers such as bpf_load_pointer() (interpreter case). F.e. extracting a whole IPv6 address could possibly look like ... union v6addr { struct { __u32 p1; __u32 p2; __u32 p3; __u32 p4; }; __u8 addr[16]; }; [...] a.p1 = htonl(load_word(skb, off)); a.p2 = htonl(load_word(skb, off + 4)); a.p3 = htonl(load_word(skb, off + 8)); a.p4 = htonl(load_word(skb, off + 12)); [...] /* access to a.addr[...] */ This work adds a complementary helper bpf_skb_load_bytes() (we also have bpf_skb_store_bytes()) as an alternative where the same call would look like from an eBPF program: ret = bpf_skb_load_bytes(skb, off, addr, sizeof(addr)); Same verifier restrictions apply as in ffeedafbf023 ("bpf: introduce current->pid, tgid, uid, gid, comm accessors") case, where stack memory access needs to be statically verified and thus guaranteed to be initialized in first use (otherwise verifier cannot tell whether a subsequent access to it is valid or not as it's runtime dependent). Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller28-317/+865
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains the first batch of Netfilter updates for the upcoming 4.5 kernel. This batch contains userspace netfilter header compilation fixes, support for packet mangling in nf_tables, the new tracing infrastructure for nf_tables and cgroup2 support for iptables. More specifically, they are: 1) Two patches to include dependencies in our netfilter userspace headers to resolve compilation problems, from Mikko Rapeli. 2) Four comestic cleanup patches for the ebtables codebase, from Ian Morris. 3) Remove duplicate include in the netfilter reject infrastructure, from Stephen Hemminger. 4) Two patches to simplify the netfilter defragmentation code for IPv6, patch from Florian Westphal. 5) Fix root ownership of /proc/net netfilter for unpriviledged net namespaces, from Philip Whineray. 6) Get rid of unused fields in struct nft_pktinfo, from Florian Westphal. 7) Add mangling support to our nf_tables payload expression, from Patrick McHardy. 8) Introduce a new netlink-based tracing infrastructure for nf_tables, from Florian Westphal. 9) Change setter functions in nfnetlink_log to be void, from Rami Rosen. 10) Add netns support to the cttimeout infrastructure. 11) Add cgroup2 support to iptables, from Tejun Heo. 12) Introduce nfnl_dereference_protected() in nfnetlink, from Florian. 13) Add support for mangling pkttype in the nf_tables meta expression, also from Florian. BTW, I need that you pull net into net-next, I have another batch that requires changes that I don't yet see in net. ==================== Signed-off-by: David S. Miller <[email protected]>
2015-12-18xprtrdma: Revert commit e7104a2a9606 ('xprtrdma: Cap req_cqinit').Chuck Lever2-10/+2
The root of the problem was that sends (especially unsignalled FASTREG and LOCAL_INV Work Requests) were not properly flow- controlled, which allowed a send queue overrun. Now that the RPC/RDMA reply handler waits for invalidation to complete, the send queue is properly flow-controlled. Thus this limit is no longer necessary. Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Invalidate in the RPC reply handlerChuck Lever1-0/+16
There is a window between the time the RPC reply handler wakes the waiting RPC task and when xprt_release() invokes ops->buf_free. During this time, memory regions containing the data payload may still be accessed by a broken or malicious server, but the RPC application has already been allowed access to the memory containing the RPC request's data payloads. The server should be fenced from client memory containing RPC data payloads _before_ the RPC application is allowed to continue. This change also more strongly enforces send queue accounting. There is a maximum number of RPC calls allowed to be outstanding. When an RPC/RDMA transport is set up, just enough send queue resources are allocated to handle registration, Send, and invalidation WRs for each those RPCs at the same time. Before, additional RPC calls could be dispatched while invalidation WRs were still consuming send WQEs. When invalidation WRs backed up, dispatching additional RPCs resulted in a send queue overrun. Now, the reply handler prevents RPC dispatch until invalidation is complete. This prevents RPC call dispatch until there are enough send queue resources to proceed. Still to do: If an RPC exits early (say, ^C), the reply handler has no opportunity to perform invalidation. Currently, xprt_rdma_free() still frees remaining RDMA resources, which could deadlock. Additional changes are needed to handle invalidation properly in this case. Reported-by: Jason Gunthorpe <[email protected]> Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Add ro_unmap_sync method for all-physical registrationChuck Lever1-0/+13
physical's ro_unmap is synchronous already. The new ro_unmap_sync method just has to DMA unmap all MRs associated with the RPC request. Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Add ro_unmap_sync method for FMRChuck Lever1-0/+64
FMR's ro_unmap method is already synchronous because ib_unmap_fmr() is a synchronous verb. However, some improvements can be made here. 1. Gather all the MRs for the RPC request onto a list, and invoke ib_unmap_fmr() once with that list. This reduces the number of doorbells when there is more than one MR to invalidate 2. Perform the DMA unmap _after_ the MRs are unmapped, not before. This is critical after invalidating a Write chunk. Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Add ro_unmap_sync method for FRWRChuck Lever2-4/+134
FRWR's ro_unmap is asynchronous. The new ro_unmap_sync posts LOCAL_INV Work Requests and waits for them to complete before returning. Note also, DMA unmapping is now done _after_ invalidation. Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Introduce ro_unmap_sync methodChuck Lever1-0/+2
In the current xprtrdma implementation, some memreg strategies implement ro_unmap synchronously (the MR is knocked down before the method returns) and some asynchonously (the MR will be knocked down and returned to the pool in the background). To guarantee the MR is truly invalid before the RPC consumer is allowed to resume execution, we need an unmap method that is always synchronous, invoked from the RPC/RDMA reply handler. The new method unmaps all MRs for an RPC. The existing ro_unmap method unmaps only one MR at a time. Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Move struct ib_send_wr off the stackChuck Lever2-18/+24
For FRWR FASTREG and LOCAL_INV, move the ib_*_wr structure off the stack. This allows frwr_op_map and frwr_op_unmap to chain WRs together without limit to register or invalidate a set of MRs with a single ib_post_send(). (This will be for chaining LOCAL_INV requests). Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Disable RPC/RDMA backchannel debugging messagesChuck Lever1-7/+9
Clean up. Fixes: 63cae47005af ('xprtrdma: Handle incoming backward direction') Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: xprt_rdma_free() must not release backchannel reqsChuck Lever1-0/+3
Preserve any rpcrdma_req that is attached to rpc_rqst's allocated for the backchannel. Otherwise, after all the pre-allocated backchannel req's are consumed, incoming backward calls start writing on freed memory. Somehow this hunk got lost. Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst') Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: Fix additional uses of spin_lock_irqsave(rb_lock)Chuck Lever2-9/+4
Clean up. rb_lock critical sections added in rpcrdma_ep_post_extra_recv() should have first been converted to use normal spin_lock now that the reply handler is a work queue. The backchannel set up code should use the appropriate helper instead of open-coding a rb_recv_bufs list add. Problem introduced by glib patch re-ordering on my part. Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst') Signed-off-by: Chuck Lever <[email protected]> Tested-by: Devesh Sharma <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: checking for NULL instead of IS_ERR()Dan Carpenter1-2/+2
The rpcrdma_create_req() function returns error pointers or success. It never returns NULL. Fixes: f531a5dbc451 ('xprtrdma: Pre-allocate backward rpc_rqst and send/receive buffers') Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18xprtrdma: clean up some curly bracesDan Carpenter1-1/+2
It doesn't matter either way, but the curly braces were clearly intended here. It causes a Smatch warning. Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Anna Schumaker <[email protected]>
2015-12-18net: Allow accepted sockets to be bound to l3mdev domainDavid Ahern5-5/+17
Allow accepted sockets to derive their sk_bound_dev_if setting from the l3mdev domain in which the packets originated. A sysctl setting is added to control the behavior which is similar to sk_mark and sysctl_tcp_fwmark_accept. This effectively allow a process to have a "VRF-global" listen socket, with child sockets bound to the VRF device in which the packet originated. A similar behavior can be achieved using sk_mark, but a solution using marks is incomplete as it does not handle duplicate addresses in different L3 domains/VRFs. Allowing sockets to inherit the sk_bound_dev_if from l3mdev domain provides a complete solution. Signed-off-by: David Ahern <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18ipv6: addrconf: use stable address generator for ARPHRD_NONEBjørn Mork1-6/+39
Add a new address generator mode, using the stable address generator with an automatically generated secret. This is intended as a default address generator mode for device types with no EUI64 implementation. The new generator is used for ARPHRD_NONE interfaces initially, adding default IPv6 autoconf support to e.g. tun interfaces. If the addrgenmode is set to 'random', either by default or manually, and no stable secret is available, then a random secret is used as input for the stable-privacy address generator. The secret can be read and modified like manually configured secrets, using the proc interface. Modifying the secret will change the addrgen mode to 'stable-privacy' to indicate that it operates on a known secret. Existing behaviour of the 'stable-privacy' mode is kept unchanged. If a known secret is available when the device is created, then the mode will default to 'stable-privacy' as before. The mode can be manually set to 'random' but it will behave exactly like 'stable-privacy' in this case. The secret will not change. Cc: Hannes Frederic Sowa <[email protected]> Cc: 吉藤英明 <[email protected]> Signed-off-by: Bjørn Mork <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-18ila: add NETFILTER dependencyArnd Bergmann1-0/+1
The recently added generic ILA translation facility fails to build when CONFIG_NETFILTER is disabled: net/ipv6/ila/ila_xlat.c:229:20: warning: 'struct nf_hook_state' declared inside parameter list net/ipv6/ila/ila_xlat.c:235:27: error: array type has incomplete element type 'struct nf_hook_ops' static struct nf_hook_ops ila_nf_hook_ops[] __read_mostly = { This adds an explicit Kconfig dependency to avoid that case. Signed-off-by: Arnd Bergmann <[email protected]> Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Signed-off-by: David S. Miller <[email protected]>
2015-12-18netfilter: nft_ct: include direction when dumping NFT_CT_L3PROTOCOL keyFlorian Westphal1-0/+1
one nft userspace test case fails with 'ct l3proto original ipv4' mismatches 'ct l3proto ipv4' ... because NFTA_CT_DIRECTION attr is missing. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-12-18netfilter: nf_tables: use skb->protocol instead of assuming ethernet headerPablo Neira Ayuso1-1/+1
Otherwise we may end up with incorrect network and transport header for other protocols. Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-12-18netfilter: meta: add support for setting skb->pkttypeFlorian Westphal1-0/+38
This allows to redirect bridged packets to local machine: ether type ip ether daddr set aa:53:08:12:34:56 meta pkttype set unicast Without 'set unicast', ip stack discards PACKET_OTHERHOST skbs. It is also useful to add support for a '-m cluster like' nft rule (where switch floods packets to several nodes, and each cluster node node processes a subset of packets for load distribution). Mangling is restricted to HOST/OTHER/BROAD/MULTICAST, i.e. you cannot set skb->pkt_type to PACKET_KERNEL or change PACKET_LOOPBACK to PACKET_HOST. Signed-off-by: Florian Westphal <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2015-12-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller49-240/+371
Conflicts: drivers/net/geneve.c Here we had an overlapping change, where in 'net' the extraneous stats bump was being removed whilst in 'net-next' the final argument to udp_tunnel6_xmit_skb() was being changed. Signed-off-by: David S. Miller <[email protected]>
2015-12-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds44-223/+347
Pull networking fixes from David Miller: 1) Fix uninitialized variable warnings in nfnetlink_queue, a lot of people reported this... From Arnd Bergmann. 2) Don't init mutex twice in i40e driver, from Jesse Brandeburg. 3) Fix spurious EBUSY in rhashtable, from Herbert Xu. 4) Missing DMA unmaps in mvpp2 driver, from Marcin Wojtas. 5) Fix race with work structure access in pppoe driver causing corruptions, from Guillaume Nault. 6) Fix OOPS due to sh_eth_rx() not checking whether netdev_alloc_skb() actually succeeded or not, from Sergei Shtylyov. 7) Don't lose flags when settifn IFA_F_OPTIMISTIC in ipv6 code, from Bjørn Mork. 8) VXLAN_HD_RCO defined incorrectly, fix from Jiri Benc. 9) Fix clock source used for cookies in SCTP, from Marcelo Ricardo Leitner. 10) aurora driver needs HAS_DMA dependency, from Geert Uytterhoeven. 11) ndo_fill_metadata_dst op of vxlan has to handle ipv6 tunneling properly as well, from Jiri Benc. 12) Handle request sockets properly in xfrm layer, from Eric Dumazet. 13) Double stats update in ipv6 geneve transmit path, fix from Pravin B Shelar. 14) sk->sk_policy[] needs RCU protection, and as a result xfrm_policy_destroy() needs to free policies using an RCU grace period, from Eric Dumazet. 15) SCTP needs to clone ipv6 tx options in order to avoid use after free, from Eric Dumazet. 16) Missing kbuild export if ila.h, from Stephen Hemminger. 17) Missing mdiobus_alloc() return value checking in mdio-mux.c, from Tobias Klauser. 18) Validate protocol value range in ->create() methods, from Hannes Frederic Sowa. 19) Fix early socket demux races that result in illegal dst reuse, from Eric Dumazet. 20) Validate socket address length in pptp code, from WANG Cong. 21) skb_reorder_vlan_header() uses incorrect offset and can corrupt packets, from Vlad Yasevich. 22) Fix memory leaks in nl80211 registry code, from Ola Olsson. 23) Timeout loop count handing fixes in mISDN, xgbe, qlge, sfc, and qlcnic. From Dan Carpenter. 24) msg.msg_iocb needs to be cleared in recvfrom() otherwise, for example, AF_ALG will interpret it as an async call. From Tadeusz Struk. 25) inetpeer_set_addr_v4 forgets to initialize the 'vif' field, from Eric Dumazet. 26) rhashtable enforces the minimum table size not early enough, breaking how we calculate the per-cpu lock allocations. From Herbert Xu. 27) Fix FCC port lockup in 82xx driver, from Martin Roth. 28) FOU sockets need to be freed using RCU, from Hannes Frederic Sowa. 29) Fix out-of-bounds access in __skb_complete_tx_timestamp() and sock_setsockopt() wrt. timestamp handling. From WANG Cong. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (117 commits) net: check both type and procotol for tcp sockets drivers: net: xgene: fix Tx flow control tcp: restore fastopen with no data in SYN packet af_unix: Revert 'lock_interruptible' in stream receive code fou: clean up socket with kfree_rcu 82xx: FCC: Fixing a bug causing to FCC port lock-up gianfar: Don't enable RX Filer if not supported net: fix warnings in 'make htmldocs' by moving macro definition out of field declaration rhashtable: Fix walker list corruption rhashtable: Enforce minimum size on initial hash table inet: tcp: fix inetpeer_set_addr_v4() ipv6: automatically enable stable privacy mode if stable_secret set net: fix uninitialized variable issue bluetooth: Validate socket address length in sco_sock_bind(). net_sched: make qdisc_tree_decrease_qlen() work for non mq ser_gigaset: remove unnecessary kfree() calls from release method ser_gigaset: fix deallocation of platform device structure ser_gigaset: turn nonsense checks into WARN_ON ser_gigaset: fix up NULL checks qlcnic: fix a timeout loop ...
2015-12-17net: check both type and procotol for tcp socketsWANG Cong2-2/+4
Dmitry reported the following out-of-bound access: Call Trace: [<ffffffff816cec2e>] __asan_report_load4_noabort+0x3e/0x40 mm/kasan/report.c:294 [<ffffffff84affb14>] sock_setsockopt+0x1284/0x13d0 net/core/sock.c:880 [< inline >] SYSC_setsockopt net/socket.c:1746 [<ffffffff84aed7ee>] SyS_setsockopt+0x1fe/0x240 net/socket.c:1729 [<ffffffff85c18c76>] entry_SYSCALL_64_fastpath+0x16/0x7a arch/x86/entry/entry_64.S:185 This is because we mistake a raw socket as a tcp socket. We should check both sk->sk_type and sk->sk_protocol to ensure it is a tcp socket. Willem points out __skb_complete_tx_timestamp() needs to fix as well. Reported-by: Dmitry Vyukov <[email protected]> Cc: Willem de Bruijn <[email protected]> Cc: Eric Dumazet <[email protected]> Signed-off-by: Cong Wang <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-17tcp: restore fastopen with no data in SYN packetEric Dumazet1-11/+12
Yuchung tracked a regression caused by commit 57be5bdad759 ("ip: convert tcp_sendmsg() to iov_iter primitives") for TCP Fast Open. Some Fast Open users do not actually add any data in the SYN packet. Fixes: 57be5bdad759 ("ip: convert tcp_sendmsg() to iov_iter primitives") Reported-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Cc: Al Viro <[email protected]> Acked-by: Yuchung Cheng <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-17af_unix: Revert 'lock_interruptible' in stream receive codeRainer Weikusat1-10/+3
With b3ca9b02b00704053a38bfe4c31dbbb9c13595d0, the AF_UNIX SOCK_STREAM receive code was changed from using mutex_lock(&u->readlock) to mutex_lock_interruptible(&u->readlock) to prevent signals from being delayed for an indefinite time if a thread sleeping on the mutex happened to be selected for handling the signal. But this was never a problem with the stream receive code (as opposed to its datagram counterpart) as that never went to sleep waiting for new messages with the mutex held and thus, wouldn't cause secondary readers to block on the mutex waiting for the sleeping primary reader. As the interruptible locking makes the code more complicated in exchange for no benefit, change it back to using mutex_lock. Signed-off-by: Rainer Weikusat <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-17ipv6: add IPV6_HDRINCL option for raw socketsHannes Frederic Sowa1-4/+16
Same as in Windows, we miss IPV6_HDRINCL for SOL_IPV6 and SOL_RAW. The SOL_IP/IP_HDRINCL is not available for IPv6 sockets. Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-17ipv6: allow routes to be configured with expire valuesXin Long1-0/+10
Add the support for adding expire value to routes, requested by Tom Gundersen <[email protected]> for systemd-networkd, and NetworkManager wants it too. implement it by adding the new RTNETLINK attribute RTA_EXPIRES. Signed-off-by: Xin Long <[email protected]> Acked-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-16fou: clean up socket with kfree_rcuHannes Frederic Sowa1-1/+2
fou->udp_offloads is managed by RCU. As it is actually included inside the fou sockets, we cannot let the memory go out of scope before a grace period. We either can synchronize_rcu or switch over to kfree_rcu to manage the sockets. kfree_rcu seems appropriate as it is used by vxlan and geneve. Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path") Cc: Tom Herbert <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-16net: Pass ndm_state to route netlink FDB notifications.Hubert Sokolowski1-7/+10
Before this change applications monitoring FDB notifications were not able to determine whether a new FDB entry is permament or not: bridge fdb add f1:f2:f3:f4:f5:f8 dev sw0p1 temp self bridge fdb add f1:f2:f3:f4:f5:f9 dev sw0p1 self bridge monitor fdb f1:f2:f3:f4:f5:f8 dev sw0p1 self permanent f1:f2:f3:f4:f5:f9 dev sw0p1 self permanent With this change ndm_state from the original netlink message is passed to the new netlink message sent as notification. bridge fdb add f1:f2:f3:f4:f5:f6 dev sw0p1 self bridge fdb add f1:f2:f3:f4:f5:f7 dev sw0p1 temp self bridge monitor fdb f1:f2:f3:f4:f5:f6 dev sw0p1 self permanent f1:f2:f3:f4:f5:f7 dev sw0p1 self static Signed-off-by: Hubert Sokolowski <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-16Merge tag 'mac80211-for-davem-2015-12-15' of ↵David S. Miller9-74/+92
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Another set of fixes: * memory leak fixes (from Ola) * operating mode notification spec compliance fix (from Eyal) * copy rfkill names in case pointer becomes invalid (myself) * two hardware restart fixes (myself) * get rid of "limiting TX power" log spam (myself) ==================== Signed-off-by: David S. Miller <[email protected]>
2015-12-16Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-mergeDavid S. Miller11-24/+367
Antonio Quartulli says: ==================== Included changes: - change my email in MAINTAINERS and Doc files - create and export list of single hop neighs per interface - protect CRC in the BLA code by means of its own lock - minor fixes and code cleanups ==================== Signed-off-by: David S. Miller <[email protected]>
2015-12-16net: sctp: dynamically enable or disable pf stateZhu Yanjun3-1/+14
As we all know, the value of pf_retrans >= max_retrans_path can disable pf state. The variables of pf_retrans and max_retrans_path can be changed by the userspace application. Sometimes the user expects to disable pf state while the 2 variables are changed to enable pf state. So it is necessary to introduce a new variable to disable pf state. According to the suggestions from Vlad Yasevich, extra1 and extra2 are removed. The initialization of pf_enable is added. Acked-by: Vlad Yasevich <[email protected]> Signed-off-by: Zhu Yanjun <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-16batman-adv: lock crc access in bridge loop avoidanceSimon Wunderlich2-5/+32
We have found some networks in which nodes were constantly requesting other nodes BLA claim tables to synchronize, just to ask for that again once completed. The reason was that the crc checksum of the asked nodes were out of sync due to missing locking and multiple writes to the same crc checksum when adding/removing entries. Therefore the asked nodes constantly reported the wrong crc, which caused repeating requests. To avoid multiple functions changing a backbone gateways crc entry at the same time, lock it using a spinlock. Signed-off-by: Simon Wunderlich <[email protected]> Tested-by: Alfons Name <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2015-12-16batman-adv: Fix typo 'wether' -> 'whether'Sven Eckelmann1-2/+2
Signed-off-by: Sven Eckelmann <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2015-12-16batman-adv: Use chain pointer when purging fragmentsSven Eckelmann1-4/+4
The chain pointer was already created in batadv_frag_purge_orig to make the checks more readable. Just use the chain pointer everywhere instead of having the same dereference + array access in the most lines of this function. Signed-off-by: Sven Eckelmann <[email protected]> Acked-by: Martin Hundebøll <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2015-12-16batman-adv: unify flags access style in tt global addSimon Wunderlich1-1/+1
This should slightly improve readability Signed-off-by: Simon Wunderlich <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2015-12-16batman-adv: detect local excess vlans in TT requestSimon Wunderlich1-1/+13
If the local representation of the global TT table of one originator has more VLAN entries than the respective TT update, there is some inconsistency present. By detecting and reporting this inconsistency, the global table gets updated and the excess VLAN will get removed in the process. Reported-by: Alessandro Bolletta <[email protected]> Signed-off-by: Simon Wunderlich <[email protected]> Acked-by: Antonio Quartulli <[email protected]> Signed-off-by: Marek Lindner <[email protected]> Signed-off-by: Antonio Quartulli <[email protected]>
2015-12-15ipv6: automatically enable stable privacy mode if stable_secret setHannes Frederic Sowa1-0/+6
Bjørn reported that while we switch all interfaces to privacy stable mode when setting the secret, we don't set this mode for new interfaces. This does not make sense, so change this behaviour. Fixes: 622c81d57b392cc ("ipv6: generation of stable privacy addresses for link-local and autoconf") Reported-by: Bjørn Mork <[email protected]> Cc: Bjørn Mork <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15sctp: use GFP_KERNEL in sctp_init()Eric Dumazet1-2/+2
modules init functions being called from process context, we better use GFP_KERNEL allocations to increase our chances to get these high-order pages we want for SCTP hash tables. This mostly matters if SCTP module is loaded once memory got fragmented. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15net: diag: Support destroying TCP sockets.Lorenzo Colitti5-0/+66
This implements SOCK_DESTROY for TCP sockets. It causes all blocking calls on the socket to fail fast with ECONNABORTED and causes a protocol close of the socket. It informs the other end of the connection by sending a RST, i.e., initiating a TCP ABORT as per RFC 793. ECONNABORTED was chosen for consistency with FreeBSD. Signed-off-by: Lorenzo Colitti <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15net: diag: Support SOCK_DESTROY for inet sockets.Lorenzo Colitti1-8/+15
This passes the SOCK_DESTROY operation to the underlying protocol diag handler, or returns -EOPNOTSUPP if that handler does not define a destroy operation. Most of this patch is just renaming functions. This is not strictly necessary, but it would be fairly counterintuitive to have the code to destroy inet sockets be in a function whose name starts with inet_diag_get. Signed-off-by: Lorenzo Colitti <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2015-12-15net: diag: Add the ability to destroy a socket.Lorenzo Colitti1-3/+20
This patch adds a SOCK_DESTROY operation, a destroy function pointer to sock_diag_handler, and a diag_destroy function pointer. It does not include any implementation code. Signed-off-by: Lorenzo Colitti <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>