blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux

Age	Commit message (Collapse)	Author	Files	Lines
2016-08-08	bpf: fix checksum fixups on bpf_skb_store_bytes	Daniel Borkmann	1	-2/+2
	bpf_skb_store_bytes() invocations above L2 header need BPF_F_RECOMPUTE_CSUM flag for updates, so that CHECKSUM_COMPLETE will be fixed up along the way. Where we ran into an issue with bpf_skb_store_bytes() is when we did a single-byte update on the IPv6 hoplimit despite using BPF_F_RECOMPUTE_CSUM flag; simple ping via ICMPv6 triggered a hw csum failure as a result. The underlying issue has been tracked down to a buffer alignment issue. Meaning, that csum_partial() computations via skb_postpull_rcsum() and skb_postpush_rcsum() pair invoked had a wrong result since they operated on an odd address for the hoplimit, while other computations were done on an even address. This mix doesn't work as-is with skb_postpull_rcsum(), skb_postpush_rcsum() pair as it always expects at least half-word alignment of input buffers, which is normally the case. Thus, instead of these helpers using csum_sub() and (implicitly) csum_add(), we need to use csum_block_sub(), csum_block_add(), respectively. For unaligned offsets, they rotate the sum to align it to a half-word boundary again, otherwise they work the same as csum_sub() and csum_add(). Adding __skb_postpull_rcsum(), __skb_postpush_rcsum() variants that take the offset as an input and adapting bpf_skb_store_bytes() to them fixes the hw csum failures again. The skb_postpull_rcsum(), skb_postpush_rcsum() helpers use a 0 constant for offset so that the compiler optimizes the offset & 1 test away and generates the same code as with csum_sub()/_add(). Fixes: 608cd71a9c7c ("tc: bpf: generalize pedit action") Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-08	bpf: also call skb_postpush_rcsum on xmit occasions	Daniel Borkmann	1	-3/+10
	Follow-up to commit f8ffad69c9f8 ("bpf: add skb_postpush_rcsum and fix dev_forward_skb occasions") to fix an issue for dev_queue_xmit() redirect locations which need CHECKSUM_COMPLETE fixups on ingress. For the same reasons as described in f8ffad69c9f8 already, we of course also need this here, since dev_queue_xmit() on a veth device will let us end up in the dev_forward_skb() helper again to cross namespaces. Latter then calls into skb_postpull_rcsum() to pull out L2 header, so that netif_rx_internal() sees CHECKSUM_COMPLETE as it is expected. That is, CHECKSUM_COMPLETE on ingress covering L2 _payload_, not L2 headers. Also here we have to address bpf_redirect() and bpf_clone_redirect(). Fixes: 3896d655f4d4 ("bpf: introduce bpf_clone_redirect() helper") Fixes: 27b29f63058d ("bpf: add bpf_redirect() helper") Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Alexei Starovoitov <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-08	sctp_diag: Respect ss adding TCPF_CLOSE to idiag_states	Phil Sutter	1	-2/+2
	Since 'ss' always adds TCPF_CLOSE to idiag_states flags, sctp_diag can't rely upon TCPF_LISTEN flag solely being present when listening sockets are requested. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-08	sctp_diag: Fix T3_rtx timer export	Phil Sutter	1	-4/+10
	The asoc's timer value is not kept in asoc->timeouts array but in it's primary transport instead. Furthermore, we must export the timer only if it is pending, otherwise the value will underrun when stored in an unsigned variable and user space will only see a very large timeout value. Signed-off-by: Phil Sutter <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-08	libceph: using kfree_rcu() to simplify the code	Wei Yongjun	1	-7/+1
	The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2016-08-08	libceph: make cancel_generic_request() static	Wei Yongjun	1	-1/+1
	Fixes the following sparse warning: net/ceph/mon_client.c:577:6: warning: symbol 'cancel_generic_request' was not declared. Should it be static? Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2016-08-08	libceph: fix return value check in alloc_msg_with_page_vector()	Wei Yongjun	1	-1/+1
	In case of error, the function ceph_alloc_page_vector() returns ERR_PTR() and never returns NULL. The NULL test in the return value check should be replaced with IS_ERR(). Fixes: 1907920324f1 ('libceph: support for sending notifies') Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: Ilya Dryomov <[email protected]>
2016-08-08	netfilter: nf_conntrack_sip: CSeq 0 is a valid CSeq	Christophe Leroy	1	-2/+2
	Do not drop packet when CSeq is 0 as 0 is also a valid value for CSeq. simple_strtoul() will return 0 either when all digits are 0 or if there are no digits at all. Therefore when simple_strtoul() returns 0 we check if first character is digit 0 or not. Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-08-08	netfilter: nft_rbtree: ignore inactive matching element with no descendants	Pablo Neira Ayuso	1	-4/+6
	If we find a matching element that is inactive with no descendants, we jump to the found label, then crash because of nul-dereference on the left branch. Fix this by checking that the element is active and not an interval end and skipping the logic that only applies to the tree iteration. Signed-off-by: Pablo Neira Ayuso <[email protected]> Tested-by: Anders K. Pedersen <[email protected]>
2016-08-08	netfilter: nf_ct_h323: do not re-activate already expired timer	Liping Zhang	1	-1/+2
	Commit 96d1327ac2e3 ("netfilter: h323: Use mod_timer instead of set_expect_timeout") just simplify the source codes if (!del_timer(&exp->timeout)) return 0; add_timer(&exp->timeout); to mod_timer(&exp->timeout, jiffies + info->timeout * HZ); This is not correct, and introduce a race codition: CPU0 CPU1 - timer expire process_rcf expectation_timed_out lock(exp_lock) - find_exp waiting exp_lock... re-activate timer!! waiting exp_lock... unlock(exp_lock) lock(exp_lock) - unlink expect - free(expect) - unlock(exp_lock) So when the timer expires again, we will access the memory that was already freed. Replace mod_timer with mod_timer_pending here to fix this problem. Fixes: 96d1327ac2e3 ("netfilter: h323: Use mod_timer instead of set_expect_timeout") Cc: Gao Feng <[email protected]> Signed-off-by: Liping Zhang <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]>
2016-08-06	Merge tag 'mac80211-for-davem-2016-08-05' of ↵	David S. Miller	8	-20/+51
	git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== First set of fixes for the current cycle: * fix 80+80 bandwidth warning * fix powersave with mac80211 TXQ implementation * use correct way to free SKBs from multicast buffering * mesh: fix operation ordering to work with all drivers * mesh: end service period even when peer goes away * mesh: correct HT opmode validity checks * pass hw pointer from mac80211 to driver in TPT method, fixing a bug (in a bit the wrong way, but that's what we have right now) ==================== Signed-off-by: David S. Miller <[email protected]>
2016-08-06	Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost	Linus Torvalds	6	-6/+1663
	Pull virtio/vhost updates from Michael Tsirkin: - new vsock device support in host and guest - platform IOMMU support in host and guest, including compatibility quirks for legacy systems. - misc fixes and cleanups. * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: VSOCK: Use kvfree() vhost: split out vringh Kconfig vhost: detect 32 bit integer wrap around vhost: new device IOTLB API vhost: drop vringh dependency vhost: convert pre sorted vhost memory array to interval tree vhost: introduce vhost memory accessors VSOCK: Add Makefile and Kconfig VSOCK: Introduce vhost_vsock.ko VSOCK: Introduce virtio_transport.ko VSOCK: Introduce virtio_vsock_common.ko VSOCK: defer sock removal to transports VSOCK: transport-specific vsock_transport functions vhost: drop vringh dependency vop: pull in vhost Kconfig virtio: new feature to detect IOMMU device quirk balloon: check the number of available pages in leak balloon vhost: lockless enqueuing vhost: simplify work flushing
2016-08-06	ipv4: panic in leaf_walk_rcu due to stale node pointer	David Forster	1	-6/+2
	Panic occurs when issuing "cat /proc/net/route" whilst populating FIB with > 1M routes. Use of cached node pointer in fib_route_get_idx is unsafe. BUG: unable to handle kernel paging request at ffffc90001630024 IP: [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 PGD 11b08d067 PUD 11b08e067 PMD dac4b067 PTE 0 Oops: 0000 [#1] SMP Modules linked in: nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscac snd_hda_codec_generic snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep virti acpi_cpufreq button parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd tio_ring virtio floppy uhci_hcd ehci_hcd usbcore usb_common libata scsi_mod CPU: 1 PID: 785 Comm: cat Not tainted 4.2.0-rc8+ #4 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007 task: ffff8800da1c0bc0 ti: ffff88011a05c000 task.ti: ffff88011a05c000 RIP: 0010:[<ffffffff814cf6a0>] [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP: 0018:ffff88011a05fda0 EFLAGS: 00010202 RAX: ffff8800d8a40c00 RBX: ffff8800da4af940 RCX: ffff88011a05ff20 RDX: ffffc90001630020 RSI: 0000000001013531 RDI: ffff8800da4af950 RBP: 0000000000000000 R08: ffff8800da1f9a00 R09: 0000000000000000 R10: ffff8800db45b7e4 R11: 0000000000000246 R12: ffff8800da4af950 R13: ffff8800d97a74c0 R14: 0000000000000000 R15: ffff8800d97a7480 FS: 00007fd3970e0700(0000) GS:ffff88011fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: ffffc90001630024 CR3: 000000011a7e4000 CR4: 00000000000006e0 Stack: ffffffff814d00d3 0000000000000000 ffff88011a05ff20 ffff8800da1f9a00 ffffffff811dd8b9 0000000000000800 0000000000020000 00007fd396f35000 ffffffff811f8714 0000000000003431 ffffffff8138dce0 0000000000000f80 Call Trace: [<ffffffff814d00d3>] ? fib_route_seq_start+0x93/0xc0 [<ffffffff811dd8b9>] ? seq_read+0x149/0x380 [<ffffffff811f8714>] ? fsnotify+0x3b4/0x500 [<ffffffff8138dce0>] ? process_echoes+0x70/0x70 [<ffffffff8121cfa7>] ? proc_reg_read+0x47/0x70 [<ffffffff811bb823>] ? __vfs_read+0x23/0xd0 [<ffffffff811bbd42>] ? rw_verify_area+0x52/0xf0 [<ffffffff811bbe61>] ? vfs_read+0x81/0x120 [<ffffffff811bcbc2>] ? SyS_read+0x42/0xa0 [<ffffffff81549ab2>] ? entry_SYSCALL_64_fastpath+0x16/0x75 Code: 48 85 c0 75 d8 f3 c3 31 c0 c3 f3 c3 66 66 66 66 66 66 2e 0f 1f 84 00 00 a 04 89 f0 33 02 44 89 c9 48 d3 e8 0f b6 4a 05 49 89 RIP [<ffffffff814cf6a0>] leaf_walk_rcu+0x10/0xe0 RSP <ffff88011a05fda0> CR2: ffffc90001630024 Signed-off-by: Dave Forster <[email protected]> Acked-by: Alexander Duyck <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-06	rxrpc: Fix races between skb free, ACK generation and replying	David Howells	7	-46/+45
	Inside the kafs filesystem it is possible to occasionally have a call processed and terminated before we've had a chance to check whether we need to clean up the rx queue for that call because afs_send_simple_reply() ends the call when it is done, but this is done in a workqueue item that might happen to run to completion before afs_deliver_to_call() completes. Further, it is possible for rxrpc_kernel_send_data() to be called to send a reply before the last request-phase data skb is released. The rxrpc skb destructor is where the ACK processing is done and the call state is advanced upon release of the last skb. ACK generation is also deferred to a work item because it's possible that the skb destructor is not called in a context where kernel_sendmsg() can be invoked. To this end, the following changes are made: (1) kernel_rxrpc_data_consumed() is added. This should be called whenever an skb is emptied so as to crank the ACK and call states. This does not release the skb, however. kernel_rxrpc_free_skb() must now be called to achieve that. These together replace rxrpc_kernel_data_delivered(). (2) kernel_rxrpc_data_consumed() is wrapped by afs_data_consumed(). This makes afs_deliver_to_call() easier to work as the skb can simply be discarded unconditionally here without trying to work out what the return value of the ->deliver() function means. The ->deliver() functions can, via afs_data_complete(), afs_transfer_reply() and afs_extract_data() mark that an skb has been consumed (thereby cranking the state) without the need to conditionally free the skb to make sure the state is correct on an incoming call for when the call processor tries to send the reply. (3) rxrpc_recvmsg() now has to call kernel_rxrpc_data_consumed() when it has finished with a packet and MSG_PEEK isn't set. (4) rxrpc_packet_destructor() no longer calls rxrpc_hard_ACK_data(). Because of this, we no longer need to clear the destructor and put the call before we free the skb in cases where we don't want the ACK/call state to be cranked. (5) The ->deliver() call-type callbacks are made to return -EAGAIN rather than 0 if they expect more data (afs_extract_data() returns -EAGAIN to the delivery function already), and the caller is now responsible for producing an abort if that was the last packet. (6) There are many bits of unmarshalling code where: ret = afs_extract_data(call, skb, last, ...); switch (ret) { case 0: break; case -EAGAIN: return 0; default: return ret; } is to be found. As -EAGAIN can now be passed back to the caller, we now just return if ret < 0: ret = afs_extract_data(call, skb, last, ...); if (ret < 0) return ret; (7) Checks for trailing data and empty final data packets has been consolidated as afs_data_complete(). So: if (skb->len > 0) return -EBADMSG; if (!last) return 0; becomes: ret = afs_data_complete(call, skb, last); if (ret < 0) return ret; (8) afs_transfer_reply() now checks the amount of data it has against the amount of data desired and the amount of data in the skb and returns an error to induce an abort if we don't get exactly what we want. Without these changes, the following oops can occasionally be observed, particularly if some printks are inserted into the delivery path: general protection fault: 0000 [#1] SMP Modules linked in: kafs(E) af_rxrpc(E) [last unloaded: af_rxrpc] CPU: 0 PID: 1305 Comm: kworker/u8:3 Tainted: G E 4.7.0-fsdevel+ #1303 Hardware name: ASUS All Series/H97-PLUS, BIOS 2306 10/09/2014 Workqueue: kafsd afs_async_workfn [kafs] task: ffff88040be041c0 ti: ffff88040c070000 task.ti: ffff88040c070000 RIP: 0010:[<ffffffff8108fd3c>] [<ffffffff8108fd3c>] __lock_acquire+0xcf/0x15a1 RSP: 0018:ffff88040c073bc0 EFLAGS: 00010002 RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000000 RCX: ffff88040d29a710 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88040d29a710 RBP: ffff88040c073c70 R08: 0000000000000001 R09: 0000000000000001 R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: ffff88040be041c0 R15: ffffffff814c928f FS: 0000000000000000(0000) GS:ffff88041fa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fa4595f4750 CR3: 0000000001c14000 CR4: 00000000001406f0 Stack: 0000000000000006 000000000be04930 0000000000000000 ffff880400000000 ffff880400000000 ffffffff8108f847 ffff88040be041c0 ffffffff81050446 ffff8803fc08a920 ffff8803fc08a958 ffff88040be041c0 ffff88040c073c38 Call Trace: [<ffffffff8108f847>] ? mark_held_locks+0x5e/0x74 [<ffffffff81050446>] ? __local_bh_enable_ip+0x9b/0xa1 [<ffffffff8108f9ca>] ? trace_hardirqs_on_caller+0x16d/0x189 [<ffffffff810915f4>] lock_acquire+0x122/0x1b6 [<ffffffff810915f4>] ? lock_acquire+0x122/0x1b6 [<ffffffff814c928f>] ? skb_dequeue+0x18/0x61 [<ffffffff81609dbf>] _raw_spin_lock_irqsave+0x35/0x49 [<ffffffff814c928f>] ? skb_dequeue+0x18/0x61 [<ffffffff814c928f>] skb_dequeue+0x18/0x61 [<ffffffffa009aa92>] afs_deliver_to_call+0x344/0x39d [kafs] [<ffffffffa009ab37>] afs_process_async_call+0x4c/0xd5 [kafs] [<ffffffffa0099e9c>] afs_async_workfn+0xe/0x10 [kafs] [<ffffffff81063a3a>] process_one_work+0x29d/0x57c [<ffffffff81064ac2>] worker_thread+0x24a/0x385 [<ffffffff81064878>] ? rescuer_thread+0x2d0/0x2d0 [<ffffffff810696f5>] kthread+0xf3/0xfb [<ffffffff8160a6ff>] ret_from_fork+0x1f/0x40 [<ffffffff81069602>] ? kthread_create_on_node+0x1cf/0x1cf Signed-off-by: David Howells <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-06	OVS: Ignore negative headroom value	Ian Wienand	1	-1/+1
	net_device->ndo_set_rx_headroom (introduced in 871b642adebe300be2e50aa5f65a418510f636ec) says "Setting a negtaive value reset the rx headroom to the default value". It seems that the OVS implementation in 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 overlooked this and sets dev->needed_headroom unconditionally. This doesn't have an immediate effect, but can mess up later LL_RESERVED_SPACE calculations, such as done in net/ipv6/mcast.c:mld_newpack. For reference, this issue was found from a skb_panic raised there after the length calculations had given the wrong result. Note the other current users of this interface (drivers/net/tun.c:tun_set_headroom and drivers/net/veth.c:veth_set_rx_headroom) are both checking this correctly thus need no modification. Thanks to Ben for some pointers from the crash dumps! Cc: Benjamin Poirier <[email protected]> Cc: Paolo Abeni <[email protected]> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1361414 Signed-off-by: Ian Wienand <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-05	NFSv4: Cap the transport reconnection timer at 1/2 lease period	Trond Myklebust	1	-0/+21
	We don't want to miss a lease period renewal due to the TCP connection failing to reconnect in a timely fashion. To ensure this doesn't happen, cap the reconnection timer so that we retry the connection attempt at least every 1/2 lease period. Signed-off-by: Trond Myklebust <[email protected]>
2016-08-05	SUNRPC: Limit the reconnect backoff timer to the max RPC message timeout	Trond Myklebust	2	-6/+15
	...and ensure that we propagate it to new transports on the same client. Signed-off-by: Trond Myklebust <[email protected]>
2016-08-05	SUNRPC: Fix reconnection timeouts	Trond Myklebust	1	-7/+20
	When the connect attempt fails and backs off, we should start the clock at the last connection attempt, not time at which we queue up the reconnect job. Signed-off-by: Trond Myklebust <[email protected]>
2016-08-05	SUNRPC: disable the use of IPv6 temporary addresses.	NeilBrown	1	-0/+11
	If the net.ipv6.conf.*.use_temp_addr sysctl is set to '2', then TCP connections over IPv6 will prefer a 'private' source address. These eventually expire and become invalid, typically after a week, but the time is configurable. When the local address becomes invalid the client will not be able to receive replies from the server. Eventually the connection will timeout or break and a new connection will be established, but this can take half an hour (typically TCP connection break time). RFC 4941, which describes private IPv6 addresses, acknowledges that some applications might not work well with them and that the application may explicitly a request non-temporary (i.e. "public") address. I believe this is correct for SUNRPC clients. Without this change, a client will occasionally experience a long delay if private addresses have been enabled. The privacy offered by private addresses is of little value for an NFS server which requires client authentication. For NFSv3 this will often not be a problem because idle connections are closed after 5 minutes. For NFSv4 connections never go idle due to the period RENEW (or equivalent) request. Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Trond Myklebust <[email protected]>
2016-08-05	SUNRPC: allow for upcalls for same uid but different gss service	Olga Kornievskaia	1	-3/+5
	It's possible to have simultaneous upcalls for the same UIDs but different GSS service. In that case, we need to allow for the upcall to gssd to proceed so that not the same context is used by two different GSS services. Some servers lock the use of context to the GSS service. Signed-off-by: Olga Kornievskaia <[email protected]> Cc: [email protected] # v3.9+ Signed-off-by: Trond Myklebust <[email protected]>
2016-08-05	mac80211: Add ieee80211_hw pointer to get_expected_throughput	Maxim Altshul	1	-1/+1
	The variable is added to allow the driver an easy access to it's own hw->priv when the op is invoked. This fixes a crash in wlcore because it was relying on a station pointer that wasn't initialized yet. It's the wrong way to fix the crash, but it solves the problem for now and it does make sense to have the hw pointer here. Signed-off-by: Maxim Altshul <[email protected]> [rewrite commit message, fix indentation] Signed-off-by: Johannes Berg <[email protected]>
2016-08-05	nl80211: correct checks for NL80211_MESHCONF_HT_OPMODE value	Masashi Honma	1	-3/+31
	Previously, NL80211_MESHCONF_HT_OPMODE validation rejected correct flag combinations, e.g. IEEE80211_HT_OP_MODE_PROTECTION_NONHT_MIXED \| IEEE80211_HT_OP_MODE_NON_HT_STA_PRSNT. Doing just a range-check allows setting flags that don't exist (0x8) and invalid flag combinations. Implements some checks based on IEEE 802.11 2012 8.4.2.59 "HT Operation element". Signed-off-by: Masashi Honma <[email protected]> [reword commit message, simplify a bit] Signed-off-by: Johannes Berg <[email protected]>
2016-08-05	mac80211: End the MPSP even if EOSP frame was not acked	Masashi Honma	1	-7/+7
	If QoS frame with EOSP (end of service period) subfield=1 sent by local peer was not acked by remote peer, local peer did not end the MPSP. This prevents local peer from going to DOZE state. And if the remote peer goes away without closing connection, local peer continues AWAKE state and wastes battery. Signed-off-by: Masashi Honma <[email protected]> Acked-by: Bob Copeland <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-08-05	mac80211: fix purging multicast PS buffer queue	Felix Fietkau	2	-4/+4
	The code currently assumes that buffered multicast PS frames don't have a pending ACK frame for tx status reporting. However, hostapd sends a broadcast deauth frame on teardown for which tx status is requested. This can lead to the "Have pending ack frames" warning on module reload. Fix this by using ieee80211_free_txskb/ieee80211_purge_tx_queue. Cc: [email protected] Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-08-04	Merge tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux	Linus Torvalds	4	-122/+86
	Pull nfsd updates from Bruce Fields: "Highlights: - Trond made a change to the server's tcp logic that allows a fast client to better take advantage of high bandwidth networks, but may increase the risk that a single client could starve other clients; a new sunrpc.svc_rpc_per_connection_limit parameter should help mitigate this in the (hopefully unlikely) event this becomes a problem in practice. - Tom Haynes added a minimal flex-layout pnfs server, which is of no use in production for now--don't build it unless you're doing client testing or further server development" * tag 'nfsd-4.8' of git://linux-nfs.org/~bfields/linux: (32 commits) nfsd: remove some dead code in nfsd_create_locked() nfsd: drop unnecessary MAY_EXEC check from create nfsd: clean up bad-type check in nfsd_create_locked nfsd: remove unnecessary positive-dentry check nfsd: reorganize nfsd_create nfsd: check d_can_lookup in fh_verify of directories nfsd: remove redundant zero-length check from create nfsd: Make creates return EEXIST instead of EACCES SUNRPC: Detect immediate closure of accepted sockets SUNRPC: accept() may return sockets that are still in SYN_RECV nfsd: allow nfsd to advertise multiple layout types nfsd: Close race between nfsd4_release_lockowner and nfsd4_lock nfsd/blocklayout: Make sure calculate signature/designator length aligned xfs: abstract block export operations from nfsd layouts SUNRPC: Remove unused callback xpo_adjust_wspace() SUNRPC: Change TCP socket space reservation SUNRPC: Add a server side per-connection limit SUNRPC: Micro optimisation for svc_data_ready SUNRPC: Call the default socket callbacks instead of open coding SUNRPC: lock the socket while detaching it ...
2016-08-04	tree-wide: replace config_enabled() with IS_ENABLED()	Masahiro Yamada	1	-1/+1
	The use of config_enabled() against config options is ambiguous. In practical terms, config_enabled() is equivalent to IS_BUILTIN(), but the author might have used it for the meaning of IS_ENABLED(). Using IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc. makes the intention clearer. This commit replaces config_enabled() with IS_ENABLED() where possible. This commit is only touching bool config options. I noticed two cases where config_enabled() is used against a tristate option: - config_enabled(CONFIG_HWMON) [ drivers/net/wireless/ath/ath10k/thermal.c ] - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE) [ drivers/gpu/drm/gma500/opregion.c ] I did not touch them because they should be converted to IS_BUILTIN() in order to keep the logic, but I was not sure it was the authors' intention. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Stas Sergeev <[email protected]> Cc: Matt Redfearn <[email protected]> Cc: Joshua Kinard <[email protected]> Cc: Jiri Slaby <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Markos Chandras <[email protected]> Cc: "Dmitry V. Levin" <[email protected]> Cc: yu-cheng yu <[email protected]> Cc: James Hogan <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Johannes Berg <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Al Viro <[email protected]> Cc: Will Drewry <[email protected]> Cc: Nikolay Martynov <[email protected]> Cc: Huacai Chen <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Leonid Yegoshin <[email protected]> Cc: Rafal Milecki <[email protected]> Cc: James Cowgill <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ralf Baechle <[email protected]> Cc: Alex Smith <[email protected]> Cc: Adam Buchbinder <[email protected]> Cc: Qais Yousef <[email protected]> Cc: Jiang Liu <[email protected]> Cc: Mikko Rapeli <[email protected]> Cc: Paul Gortmaker <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: Brian Norris <[email protected]> Cc: Hidehiro Kawai <[email protected]> Cc: "Luis R. Rodriguez" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Roland McGrath <[email protected]> Cc: Paul Burton <[email protected]> Cc: Kalle Valo <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Tony Wu <[email protected]> Cc: Huaitong Han <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Jason Cooper <[email protected]> Cc: "David S. Miller" <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: Andrea Gelmini <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Marc Zyngier <[email protected]> Cc: Rabin Vincent <[email protected]> Cc: "Maciej W. Rozycki" <[email protected]> Cc: David Daney <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2016-08-03	openvswitch: Remove incorrect WARN_ONCE().	Jarno Rajahalme	1	-7/+1
	ovs_ct_find_existing() issues a warning if an existing conntrack entry classified as IP_CT_NEW is found, with the premise that this should not happen. However, a newly confirmed, non-expected conntrack entry remains IP_CT_NEW as long as no reply direction traffic is seen. This has resulted into somewhat confusing kernel log messages. This patch removes this check and warning. Fixes: 289f2253 ("openvswitch: Find existing conntrack entry after upcall.") Suggested-by: Joe Stringer <[email protected]> Signed-off-by: Jarno Rajahalme <[email protected]> Acked-by: Joe Stringer <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-08-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	Linus Torvalds	6	-7/+10
	Pull networking fixes from David Miller: 1) Fix several cases of missing of_node_put() calls in various networking drivers. From Peter Chen. 2) Don't try to remove unconfigured VLANs in qed driver, from Yuval Mintz. 3) Unbalanced locking in TIPC error handling, from Wei Yongjun. 4) Fix lockups in CPDMA driver, from Grygorii Strashko. 5) More MACSEC refcount et al fixes, from Sabrina Dubroca. 6) Fix MAC address setting in r8169 during runtime suspend, from Chun-Hao Lin. 7) Various printf format specifier fixes, from Heinrich Schuchardt. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits) qed: Fail driver load in 100g MSI mode. ethernet: ti: davinci_emac: add missing of_node_put after calling of_parse_phandle ethernet: stmicro: stmmac: add missing of_node_put after calling of_parse_phandle ethernet: stmicro: stmmac: dwmac-socfpga: add missing of_node_put after calling of_parse_phandle ethernet: renesas: sh_eth: add missing of_node_put after calling of_parse_phandle ethernet: renesas: ravb_main: add missing of_node_put after calling of_parse_phandle ethernet: marvell: pxa168_eth: add missing of_node_put after calling of_parse_phandle ethernet: marvell: mvpp2: add missing of_node_put after calling of_parse_phandle ethernet: marvell: mvneta: add missing of_node_put after calling of_parse_phandle ethernet: hisilicon: hns: hns_dsaf_main: add missing of_node_put after calling of_parse_phandle ethernet: hisilicon: hns: hns_dsaf_mac: add missing of_node_put after calling of_parse_phandle ethernet: cavium: octeon: add missing of_node_put after calling of_parse_phandle ethernet: aurora: nb8800: add missing of_node_put after calling of_parse_phandle ethernet: arc: emac_main: add missing of_node_put after calling of_parse_phandle ethernet: apm: xgene: add missing of_node_put after calling of_parse_phandle ethernet: altera: add missing of_node_put 8139too: fix system hang when there is a tx timeout event. qed: Fix error return code in qed_resc_alloc() net: qlcnic: avoid superfluous assignement dsa: b53: remove redundant if ...
2016-08-03	mac80211: mesh: flush stations before beacons are stopped	Maital Hahn	1	-4/+6
	Some drivers (e.g. wl18xx) expect that the last stage in the de-initialization process will be stopping the beacons, similar to AP flow. Update ieee80211_stop_mesh() flow accordingly. As peers can be removed dynamically, this would not impact other drivers. Tested also on Ralink RT3572 chipset. Signed-off-by: Maital Hahn <[email protected]> Signed-off-by: Yaniv Machani <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-08-02	Merge tag 'ceph-for-4.8-rc1' of git://github.com/ceph/ceph-client	Linus Torvalds	9	-24/+245
	Pull Ceph updates from Ilya Dryomov: "The highlights are: - RADOS namespace support in libceph and CephFS (Zheng Yan and myself). The stopgaps added in 4.5 to deny access to inodes in namespaces are removed and CEPH_FEATURE_FS_FILE_LAYOUT_V2 feature bit is now fully supported - A large rework of the MDS cap flushing code (Zheng Yan) - Handle some of ->d_revalidate() in RCU mode (Jeff Layton). We were overly pessimistic before, bailing at the first sight of LOOKUP_RCU On top of that we've got a few CephFS bug fixes, a couple of cleanups and Arnd's workaround for a weird genksyms issue" * tag 'ceph-for-4.8-rc1' of git://github.com/ceph/ceph-client: (34 commits) ceph: fix symbol versioning for ceph_monc_do_statfs ceph: Correctly return NXIO errors from ceph_llseek ceph: Mark the file cache as unreclaimable ceph: optimize cap flush waiting ceph: cleanup ceph_flush_snaps() ceph: kick cap flushes before sending other cap message ceph: introduce an inode flag to indicates if snapflush is needed ceph: avoid sending duplicated cap flush message ceph: unify cap flush and snapcap flush ceph: use list instead of rbtree to track cap flushes ceph: update types of some local varibles ceph: include 'follows' of pending snapflush in cap reconnect message ceph: update cap reconnect message to version 3 ceph: mount non-default filesystem by name libceph: fsmap.user subscription support ceph: handle LOOKUP_RCU in ceph_d_revalidate ceph: allow dentry_lease_is_valid to work under RCU walk ceph: clear d_fsinfo pointer under d_lock ceph: remove ceph_mdsc_lease_release ceph: don't use ->d_time ...
2016-08-02	SUNRPC: Fix up socket autodisconnect	Trond Myklebust	1	-8/+18
	Ensure that we don't forget to set up the disconnection timer for the case when a connect request is fulfilled after the RPC request that initiated it has timed out or been interrupted. Signed-off-by: Trond Myklebust <[email protected]>
2016-08-02	mac80211: fix check for buffered powersave frames with txq	Felix Fietkau	1	-1/+1
	The logic was inverted here, set the bit if frames are pending. Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-08-02	cfg80211: fix missing break in NL8211_CHAN_WIDTH_80P80 case	Colin Ian King	1	-0/+1
	The switch on chandef->width is missing a break on the NL8211_CHAN_WIDTH_80P80 case; currently we get a WARN_ON when center_freq2 is non-zero because of the missing break. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Johannes Berg <[email protected]>
2016-08-02	VSOCK: Add Makefile and Kconfig	Asias He	2	-0/+26
	Enable virtio-vsock and vhost-vsock. Signed-off-by: Asias He <[email protected]> Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2016-08-02	VSOCK: Introduce virtio_transport.ko	Asias He	1	-0/+624
	VM sockets virtio transport implementation. This driver runs in the guest. Signed-off-by: Asias He <[email protected]> Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2016-08-02	VSOCK: Introduce virtio_vsock_common.ko	Asias He	1	-0/+992
	This module contains the common code and header files for the following virtio_transporto and vhost_vsock kernel modules. Signed-off-by: Asias He <[email protected]> Signed-off-by: Claudio Imbrenda <[email protected]> Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2016-08-02	VSOCK: defer sock removal to transports	Stefan Hajnoczi	2	-6/+12
	The virtio transport will implement graceful shutdown and the related SO_LINGER socket option. This requires orphaning the sock but keeping it in the table of connections after .release(). This patch adds the vsock_remove_sock() function and leaves it up to the transport when to remove the sock. Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2016-08-02	VSOCK: transport-specific vsock_transport functions	Stefan Hajnoczi	1	-0/+9
	struct vsock_transport contains function pointers called by AF_VSOCK core code. The transport may want its own transport-specific function pointers and they can be added after struct vsock_transport. Allow the transport to fetch vsock_transport. It can downcast it to access transport-specific function pointers. The virtio transport will use this. Signed-off-by: Stefan Hajnoczi <[email protected]> Signed-off-by: Michael S. Tsirkin <[email protected]>
2016-08-01	SUNRPC: Detect immediate closure of accepted sockets	Trond Myklebust	1	-2/+5
	This modification is useful for debugging issues that happen while the socket is being initialised. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2016-08-01	SUNRPC: accept() may return sockets that are still in SYN_RECV	Trond Myklebust	1	-3/+10
	We're seeing traces of the following form: [10952.396347] svc: transport ffff88042ba4a 000 dequeued, inuse=2 [10952.396351] svc: tcp_accept ffff88042ba4 a000 sock ffff88042a6e4c80 [10952.396362] nfsd: connect from 10.2.6.1, port=187 [10952.396364] svc: svc_setup_socket ffff8800b99bcf00 [10952.396368] setting up TCP socket for reading [10952.396370] svc: svc_setup_socket created ffff8803eb10a000 (inet ffff88042b75b800) [10952.396373] svc: transport ffff8803eb10a000 put into queue [10952.396375] svc: transport ffff88042ba4a000 put into queue [10952.396377] svc: server ffff8800bb0ec000 waiting for data (to = 3600000) [10952.396380] svc: transport ffff8803eb10a000 dequeued, inuse=2 [10952.396381] svc_recv: found XPT_CLOSE [10952.396397] svc: svc_delete_xprt(ffff8803eb10a000) [10952.396398] svc: svc_tcp_sock_detach(ffff8803eb10a000) [10952.396399] svc: svc_sock_detach(ffff8803eb10a000) [10952.396412] svc: svc_sock_free(ffff8803eb10a000) i.e. an immediate close of the socket after initialisation. The culprit appears to be the test at the end of svc_tcp_init, which checks if the newly created socket is in the TCP_ESTABLISHED state, and immediately closes it if not. The evidence appears to suggest that the socket might still be in the SYN_RECV state at this time. The fix is to check for both states, and then to add a check in svc_tcp_state_change() to ensure we don't close the socket when it transitions into TCP_ESTABLISHED. Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: J. Bruce Fields <[email protected]>
2016-08-01	SUNRPC: Handle EADDRNOTAVAIL on connection failures	Trond Myklebust	1	-0/+4
	If the connect attempt immediately fails with an EADDRNOTAVAIL error, then that means our choice of source port number was bad. This error is expected when we set the SO_REUSEPORT socket option and we have 2 sockets sharing the same source and destination address and port combinations. Signed-off-by: Trond Myklebust <[email protected]> Fixes: 402e23b4ed9ed ("SUNRPC: Fix stupid typo in xs_sock_set_reuseport") Cc: [email protected] # v4.0+
2016-07-30	sctp: allow receiving msg when TCP-style sk is in CLOSED state	Xin Long	1	-1/+1
	Commit 141ddefce7c8 ("sctp: change sk state to CLOSED instead of CLOSING in sctp_sock_migrate") changed sk state to CLOSED if the assoc is closed when sctp_accept clones a new sk. If there is still data in sk receive queue, users will not be able to read it any more, as sctp_recvmsg returns directly if sk state is CLOSED. This patch is to add CLOSED state check in sctp_recvmsg to allow reading data from TCP-style sk with CLOSED state as what TCP does. Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	sctp: allow delivering notifications after receiving SHUTDOWN	Xin Long	1	-1/+3
	Prior to this patch, once sctp received SHUTDOWN or shutdown with RD, sk->sk_shutdown would be set with RCV_SHUTDOWN, and all events would be dropped in sctp_ulpq_tail_event(). It would cause: 1. some notifications couldn't be received by users. like SCTP_SHUTDOWN_COMP generated by sctp_sf_do_4_C(). 2. sctp would also never trigger sk_data_ready when the association was closed, making it harder to identify the end of the association by calling recvmsg() and getting an EOF. It was not convenient for kernel users. The check here should be stopping delivering DATA chunks after receiving SHUTDOWN, and stopping delivering ANY chunks after sctp_close(). So this patch is to allow notifications to enqueue into receive queue even if sk->sk_shutdown is set to RCV_SHUTDOWN in sctp_ulpq_tail_event, but if sk->sk_shutdown == RCV_SHUTDOWN \| SEND_SHUTDOWN, it drops all events. Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	sctp: fix the issue sctp requeue auth chunk incorrectly	Xin Long	1	-1/+2
	sctp needs to queue auth chunk back when we know that we are going to generate another segment. But commit f1533cce60d1 ("sctp: fix panic when sending auth chunks") requeues the last chunk processed which is probably not the auth chunk. It causes panic when calculating the MAC in sctp_auth_calculate_hmac(), as the incorrect offset of the auth chunk in skb->data. This fix is to requeue it by using packet->auth. Fixes: f1533cce60d1 ("sctp: fix panic when sending auth chunks") Signed-off-by: Xin Long <[email protected]> Acked-by: Marcelo Ricardo Leitner <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	tcp: consider recv buf for the initial window scale	Soheil Hassas Yeganeh	1	-1/+2
	tcp_select_initial_window() intends to advertise a window scaling for the maximum possible window size. To do so, it considers the maximum of net.ipv4.tcp_rmem[2] and net.core.rmem_max as the only possible upper-bounds. However, users with CAP_NET_ADMIN can use SO_RCVBUFFORCE to set the socket's receive buffer size to values larger than net.ipv4.tcp_rmem[2] and net.core.rmem_max. Thus, SO_RCVBUFFORCE is effectively ignored by tcp_select_initial_window(). To fix this, consider the maximum of net.ipv4.tcp_rmem[2], net.core.rmem_max and socket's initial buffer space. Fixes: b0573dea1fb3 ("[NET]: Introduce SO_{SND,RCV}BUFFORCE socket options") Signed-off-by: Soheil Hassas Yeganeh <[email protected]> Suggested-by: Neal Cardwell <[email protected]> Acked-by: Neal Cardwell <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	net: ipv6: use list_move instead of list_del/list_add	Wei Yongjun	1	-2/+1
	Using list_move() instead of list_del() + list_add(). Signed-off-by: Wei Yongjun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	tipc: fix imbalance read_unlock_bh in __tipc_nl_add_monitor()	Wei Yongjun	1	-1/+1
	In the error handling case of nla_nest_start() failed read_unlock_bh() is called to unlock a lock that had not been taken yet. sparse warns about the context imbalance as the following: net/tipc/monitor.c:799:23: warning: context imbalance in '__tipc_nl_add_monitor' - different lock contexts for basic block Fixes: cf6f7e1d5109 ('tipc: dump monitor attributes') Signed-off-by: Wei Yongjun <[email protected]> Acked-by: Ying Xue <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2016-07-30	Merge tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs	Linus Torvalds	21	-941/+864
	Pull NFS client updates from Trond Myklebust: "Highlights include: Stable bugfixes: - nfs: don't create zero-length requests - several LAYOUTGET bugfixes Features: - several performance related features - more aggressive caching when we can rely on close-to-open cache consistency - remove serialisation of O_DIRECT reads and writes - optimise several code paths to not flush to disk unnecessarily. However allow for the idiosyncracies of pNFS for those layout types that need to issue a LAYOUTCOMMIT before the metadata can be updated on the server. - SUNRPC updates to the client data receive path - pNFS/SCSI support RH/Fedora dm-mpath device nodes - pNFS files/flexfiles can now use unprivileged ports when the generic NFS mount options allow it. Bugfixes: - Don't use RDMA direct data placement together with data integrity or privacy security flavours - Remove the RDMA ALLPHYSICAL memory registration mode as it has potential security holes. - Several layout recall fixes to improve NFSv4.1 protocol compliance. - Fix an Oops in the pNFS files and flexfiles connection setup to the DS - Allow retry of operations that used a returned delegation stateid - Don't mark the inode as revalidated if a LAYOUTCOMMIT is outstanding - Fix writeback races in nfs4_copy_range() and nfs42_proc_deallocate()" * tag 'nfs-for-4.8-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (104 commits) pNFS: Actively set attributes as invalid if LAYOUTCOMMIT is outstanding NFSv4: Clean up lookup of SECINFO_NO_NAME NFSv4.2: Fix warning "variable ‘stateids’ set but not used" NFSv4: Fix warning "no previous prototype for ‘nfs4_listxattr’" SUNRPC: Fix a compiler warning in fs/nfs/clnt.c pNFS: Remove redundant smp_mb() from pnfs_init_lseg() pNFS: Cleanup - do layout segment initialisation in one place pNFS: Remove redundant stateid invalidation pNFS: Remove redundant pnfs_mark_layout_returned_if_empty() pNFS: Clear the layout metadata if the server changed the layout stateid pNFS: Cleanup - don't open code pnfs_mark_layout_stateid_invalid() NFS: pnfs_mark_matching_lsegs_return() should match the layout sequence id pNFS: Do not set plh_return_seq for non-callback related layoutreturns pNFS: Ensure layoutreturn acts as a completion for layout callbacks pNFS: Fix CB_LAYOUTRECALL stateid verification pNFS: Always update the layout barrier seqid on LAYOUTGET pNFS: Always update the layout stateid if NFS_LAYOUT_INVALID_STID is set pNFS: Clear the layout return tracking on layout reinitialisation pNFS: LAYOUTRETURN should only update the stateid if the layout is valid nfs: don't create zero-length requests ...
2016-07-29	Merge branch 'next' of ↵	Linus Torvalds	24	-191/+3232
	git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "Highlights: - TPM core and driver updates/fixes - IPv6 security labeling (CALIPSO) - Lots of Apparmor fixes - Seccomp: remove 2-phase API, close hole where ptrace can change syscall #" * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits) apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family) tpm: Factor out common startup code tpm: use devm_add_action_or_reset tpm2_i2c_nuvoton: add irq validity check tpm: read burstcount from TPM_STS in one 32-bit transaction tpm: fix byte-order for the value read by tpm2_get_tpm_pt tpm_tis_core: convert max timeouts from msec to jiffies apparmor: fix arg_size computation for when setprocattr is null terminated apparmor: fix oops, validate buffer size in apparmor_setprocattr() apparmor: do not expose kernel stack apparmor: fix module parameters can be changed after policy is locked apparmor: fix oops in profile_unpack() when policy_db is not present apparmor: don't check for vmalloc_addr if kvzalloc() failed apparmor: add missing id bounds check on dfa verification apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task apparmor: use list_next_entry instead of list_entry_next apparmor: fix refcount race when finding a child profile apparmor: fix ref count leak when profile sha1 hash is read apparmor: check that xindex is in trans_table bounds ...
2016-07-29	Merge branch 'for-linus' of ↵	Linus Torvalds	1	-4/+4
	git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull userns vfs updates from Eric Biederman: "This tree contains some very long awaited work on generalizing the user namespace support for mounting filesystems to include filesystems with a backing store. The real world target is fuse but the goal is to update the vfs to allow any filesystem to be supported. This patchset is based on a lot of code review and testing to approach that goal. While looking at what is needed to support the fuse filesystem it became clear that there were things like xattrs for security modules that needed special treatment. That the resolution of those concerns would not be fuse specific. That sorting out these general issues made most sense at the generic level, where the right people could be drawn into the conversation, and the issues could be solved for everyone. At a high level what this patchset does a couple of simple things: - Add a user namespace owner (s_user_ns) to struct super_block. - Teach the vfs to handle filesystem uids and gids not mapping into to kuids and kgids and being reported as INVALID_UID and INVALID_GID in vfs data structures. By assigning a user namespace owner filesystems that are mounted with only user namespace privilege can be detected. This allows security modules and the like to know which mounts may not be trusted. This also allows the set of uids and gids that are communicated to the filesystem to be capped at the set of kuids and kgids that are in the owning user namespace of the filesystem. One of the crazier corner casees this handles is the case of inodes whose i_uid or i_gid are not mapped into the vfs. Most of the code simply doesn't care but it is easy to confuse the inode writeback path so no operation that could cause an inode write-back is permitted for such inodes (aka only reads are allowed). This set of changes starts out by cleaning up the code paths involved in user namespace permirted mounts. Then when things are clean enough adds code that cleanly sets s_user_ns. Then additional restrictions are added that are possible now that the filesystem superblock contains owner information. These changes should not affect anyone in practice, but there are some parts of these restrictions that are changes in behavior. - Andy's restriction on suid executables that does not honor the suid bit when the path is from another mount namespace (think /proc/[pid]/fd/) or when the filesystem was mounted by a less privileged user. - The replacement of the user namespace implicit setting of MNT_NODEV with implicitly setting SB_I_NODEV on the filesystem superblock instead. Using SB_I_NODEV is a stronger form that happens to make this state user invisible. The user visibility can be managed but it caused problems when it was introduced from applications reasonably expecting mount flags to be what they were set to. There is a little bit of work remaining before it is safe to support mounting filesystems with backing store in user namespaces, beyond what is in this set of changes. - Verifying the mounter has permission to read/write the block device during mount. - Teaching the integrity modules IMA and EVM to handle filesystems mounted with only user namespace root and to reduce trust in their security xattrs accordingly. - Capturing the mounters credentials and using that for permission checks in d_automount and the like. (Given that overlayfs already does this, and we need the work in d_automount it make sense to generalize this case). Furthermore there are a few changes that are on the wishlist: - Get all filesystems supporting posix acls using the generic posix acls so that posix_acl_fix_xattr_from_user and posix_acl_fix_xattr_to_user may be removed. [Maintainability] - Reducing the permission checks in places such as remount to allow the superblock owner to perform them. - Allowing the superblock owner to chown files with unmapped uids and gids to something that is mapped so the files may be treated normally. I am not considering even obvious relaxations of permission checks until it is clear there are no more corner cases that need to be locked down and handled generically. Many thanks to Seth Forshee who kept this code alive, and putting up with me rewriting substantial portions of what he did to handle more corner cases, and for his diligent testing and reviewing of my changes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (30 commits) fs: Call d_automount with the filesystems creds fs: Update i_[ug]id_(read\|write) to translate relative to s_user_ns evm: Translate user/group ids relative to s_user_ns when computing HMAC dquot: For now explicitly don't support filesystems outside of init_user_ns quota: Handle quota data stored in s_user_ns in quota_setxquota quota: Ensure qids map to the filesystem vfs: Don't create inodes with a uid or gid unknown to the vfs vfs: Don't modify inodes with a uid or gid unknown to the vfs cred: Reject inodes with invalid ids in set_create_file_as() fs: Check for invalid i_uid in may_follow_link() vfs: Verify acls are valid within superblock's s_user_ns. userns: Handle -1 in k[ug]id_has_mapping when !CONFIG_USER_NS fs: Refuse uid/gid changes which don't map into s_user_ns selinux: Add support for unprivileged mounts from user namespaces Smack: Handle labels consistently in untrusted mounts Smack: Add support for unprivileged mounts from user namespaces fs: Treat foreign mounts as nosuid fs: Limit file caps to the user namespace of the super block userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag userns: Remove implicit MNT_NODEV fragility. ...