aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-08-10spi: davinci: fix a NULL pointer dereferenceBartosz Golaszewski1-1/+1
On non-OF systems spi->controlled_data may be NULL. This causes a NULL pointer derefence on dm365-evm. Signed-off-by: Bartosz Golaszewski <[email protected]> Signed-off-by: Mark Brown <[email protected]> Cc: [email protected]
2018-08-10x86/microcode: Allow late microcode loading with SMT disabledJosh Poimboeuf1-4/+12
The kernel unnecessarily prevents late microcode loading when SMT is disabled. It should be safe to allow it if all the primary threads are online. Signed-off-by: Josh Poimboeuf <[email protected]> Acked-by: Borislav Petkov <[email protected]> Signed-off-by: David Woodhouse <[email protected]>
2018-08-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller8-20/+30
Daniel Borkmann says: ==================== pull-request: bpf 2018-08-10 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Fix cpumap and devmap on teardown as they're under RCU context and won't have same assumption as running under NAPI protection, from Jesper. 2) Fix various sockmap bugs in bpf_tcp_sendmsg() code, e.g. we had a bug where socket error was not propagated correctly, from Daniel. 3) Fix incompatible libbpf header license for BTF code and match it before it gets officially released with the rest of libbpf which is LGPL-2.1, from Martin. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-08-09smb3: enumerating snapshots was leaving part of the data off endSteve French1-7/+27
When enumerating snapshots, the last few bytes of the final snapshot could be left off since we were miscalculating the length returned (leaving off the sizeof struct SRV_SNAPSHOT_ARRAY) See MS-SMB2 section 2.2.32.2. In addition fixup the length used to allow smaller buffer to be passed in, in order to allow returning the size of the whole snapshot array more easily. Sample userspace output with a kernel patched with this (mounted to a Windows volume with two snapshots). Before this patch, the second snapshot would be missing a few bytes at the end. ~/cifs-2.6# ~/enum-snapshots /mnt/file press enter to issue the ioctl to retrieve snapshot information ... size of snapshot array = 102 Num snapshots: 2 Num returned: 2 Array Size: 102 Snapshot 0:@GMT-2018.06.30-19.34.17 Snapshot 1:@GMT-2018.06.30-19.33.37 CC: Stable <[email protected]> Signed-off-by: Steve French <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2018-08-09cifs: update smb2_queryfs() to use compoundingRonnie Sahlberg4-26/+131
Change smb2_queryfs() to use a Create/QueryInfo/Close compound request. Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2018-08-09cifs: update receive_encrypted_standard to handle compounded responsesRonnie Sahlberg4-43/+107
Signed-off-by: Ronnie Sahlberg <[email protected]> Signed-off-by: Steve French <[email protected]> Reviewed-by: Paulo Alcantara <[email protected]> Reviewed-by: Pavel Shilovsky <[email protected]>
2018-08-09make sure that __dentry_kill() always invalidates d_seq, unhashed or notAl Viro1-5/+2
RCU pathwalk relies upon the assumption that anything that changes ->d_inode of a dentry will invalidate its ->d_seq. That's almost true - the one exception is that the final dput() of already unhashed dentry does *not* touch ->d_seq at all. Unhashing does, though, so for anything we'd found by RCU dcache lookup we are fine. Unfortunately, we can *start* with an unhashed dentry or jump into it. We could try and be careful in the (few) places where that could happen. Or we could just make the final dput() invalidate the damn thing, unhashed or not. The latter is much simpler and easier to backport, so let's do it that way. Reported-by: "Dae R. Jeong" <[email protected]> Cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2018-08-09fix __legitimize_mnt()/mntput() raceAl Viro1-0/+14
__legitimize_mnt() has two problems - one is that in case of success the check of mount_lock is not ordered wrt preceding increment of refcount, making it possible to have successful __legitimize_mnt() on one CPU just before the otherwise final mntpu() on another, with __legitimize_mnt() not seeing mntput() taking the lock and mntput() not seeing the increment done by __legitimize_mnt(). Solved by a pair of barriers. Another is that failure of __legitimize_mnt() on the second read_seqretry() leaves us with reference that'll need to be dropped by caller; however, if that races with final mntput() we can end up with caller dropping rcu_read_lock() and doing mntput() to release that reference - with the first mntput() having freed the damn thing just as rcu_read_lock() had been dropped. Solution: in "do mntput() yourself" failure case grab mount_lock, check if MNT_DOOMED has been set by racing final mntput() that has missed our increment and if it has - undo the increment and treat that as "failure, caller doesn't need to drop anything" case. It's not easy to hit - the final mntput() has to come right after the first read_seqretry() in __legitimize_mnt() *and* manage to miss the increment done by __legitimize_mnt() before the second read_seqretry() in there. The things that are almost impossible to hit on bare hardware are not impossible on SMP KVM, though... Reported-by: Oleg Nesterov <[email protected]> Fixes: 48a066e72d97 ("RCU'd vsfmounts") Cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2018-08-09MIPS: Remove remnants of UASM_ISAPaul Burton2-2/+0
Commit 33679a50370d ("MIPS: uasm: Remove needless ISA abstraction") removed use of the MIPS_ISA preprocessor macro, but left a couple of unused definitions of it behind. Remove the dead code. Signed-off-by: Paul Burton <[email protected]>
2018-08-09fix mntput/mntput raceAl Viro1-2/+12
mntput_no_expire() does the calculation of total refcount under mount_lock; unfortunately, the decrement (as well as all increments) are done outside of it, leading to false positives in the "are we dropping the last reference" test. Consider the following situation: * mnt is a lazy-umounted mount, kept alive by two opened files. One of those files gets closed. Total refcount of mnt is 2. On CPU 42 mntput(mnt) (called from __fput()) drops one reference, decrementing component * After it has looked at component #0, the process on CPU 0 does mntget(), incrementing component #0, gets preempted and gets to run again - on CPU 69. There it does mntput(), which drops the reference (component #69) and proceeds to spin on mount_lock. * On CPU 42 our first mntput() finishes counting. It observes the decrement of component #69, but not the increment of component #0. As the result, the total it gets is not 1 as it should've been - it's 0. At which point we decide that vfsmount needs to be killed and proceed to free it and shut the filesystem down. However, there's still another opened file on that filesystem, with reference to (now freed) vfsmount, etc. and we are screwed. It's not a wide race, but it can be reproduced with artificial slowdown of the mnt_get_count() loop, and it should be easier to hit on SMP KVM setups. Fix consists of moving the refcount decrement under mount_lock; the tricky part is that we want (and can) keep the fast case (i.e. mount that still has non-NULL ->mnt_ns) entirely out of mount_lock. All places that zero mnt->mnt_ns are dropping some reference to mnt and they call synchronize_rcu() before that mntput(). IOW, if mntput() observes (under rcu_read_lock()) a non-NULL ->mnt_ns, it is guaranteed that there is another reference yet to be dropped. Reported-by: Jann Horn <[email protected]> Tested-by: Jann Horn <[email protected]> Fixes: 48a066e72d97 ("RCU'd vsfmounts") Cc: [email protected] Signed-off-by: Al Viro <[email protected]>
2018-08-09null_blk: add lock drop/acquire annotationJens Axboe1-1/+3
sparse complains: drivers/block/null_blk_main.c:816:24: sparse: context imbalance in 'null_insert_page' - unexpected unlock Fix it by adding the necessary annotations to the function. Signed-off-by: Jens Axboe <[email protected]>
2018-08-09hwmon: k10temp: Support Threadripper 2920X, 2970WX; simplify offset tableGuenter Roeck1-8/+2
All announced Threadripper 29xx models have a temperature offset of 27 degrees C. Simplify temperature offset table to match all 29xx Threadripper models with a single entry. Also simplify the table to match all 19xx Threadripper models with a single entry. This effectively drops entries for Threadripper 1910/1920/1950 which never saw the light of day. Cc: Michael Larabel <[email protected]> Cc: Clemens Ladisch <[email protected]> Signed-off-by: Guenter Roeck <[email protected]>
2018-08-09Merge branch 'bpf-fix-cpu-and-devmap-teardown'Daniel Borkmann4-14/+21
Jesper Dangaard Brouer says: ==================== Removing entries from cpumap and devmap, goes through a number of syncronization steps to make sure no new xdp_frames can be enqueued. But there is a small chance, that xdp_frames remains which have not been flushed/processed yet. Flushing these during teardown, happens from RCU context and not as usual under RX NAPI context. The optimization introduced in commt 389ab7f01af9 ("xdp: introduce xdp_return_frame_rx_napi"), missed that the flush operation can also be called from RCU context. Thus, we cannot always use the xdp_return_frame_rx_napi call, which take advantage of the protection provided by XDP RX running under NAPI protection. The samples/bpf xdp_redirect_cpu have a --stress-mode, that is adjusted to easier reproduce (verified by Red Hat QA). ==================== Signed-off-by: Daniel Borkmann <[email protected]>
2018-08-09xdp: fix bug in devmap teardown code pathJesper Dangaard Brouer1-5/+9
Like cpumap teardown, the devmap teardown code also flush remaining xdp_frames, via bq_xmit_all() in case map entry is removed. The code can call xdp_return_frame_rx_napi, from the the wrong context, in-case ndo_xdp_xmit() fails. Fixes: 389ab7f01af9 ("xdp: introduce xdp_return_frame_rx_napi") Fixes: 735fc4054b3a ("xdp: change ndo_xdp_xmit API to support bulking") Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-08-09samples/bpf: xdp_redirect_cpu adjustment to reproduce teardown race easierJesper Dangaard Brouer2-3/+3
The teardown race in cpumap is really hard to reproduce. These changes makes it easier to reproduce, for QA. The --stress-mode now have a case of a very small queue size of 8, that helps to trigger teardown flush to encounter a full queue, which results in calling xdp_return_frame API, in a non-NAPI protect context. Also increase MAX_CPUS, as my QA department have larger machines than me. Tested-by: Jean-Tsung Hsiao <[email protected]> Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-08-09xdp: fix bug in cpumap teardown code pathJesper Dangaard Brouer1-6/+9
When removing a cpumap entry, a number of syncronization steps happen. Eventually the teardown code __cpu_map_entry_free is invoked from/via call_rcu. The teardown code __cpu_map_entry_free() flushes remaining xdp_frames, by invoking bq_flush_to_queue, which calls xdp_return_frame_rx_napi(). The issues is that the teardown code is not running in the RX NAPI code path. Thus, it is not allowed to invoke the NAPI variant of xdp_return_frame. This bug was found and triggered by using the --stress-mode option to the samples/bpf program xdp_redirect_cpu. It is hard to trigger, because the ptr_ring have to be full and cpumap bulk queue max contains 8 packets, and a remote CPU is racing to empty the ptr_ring queue. Fixes: 389ab7f01af9 ("xdp: introduce xdp_return_frame_rx_napi") Tested-by: Jean-Tsung Hsiao <[email protected]> Signed-off-by: Jesper Dangaard Brouer <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]>
2018-08-09Blk-throttle: reduce tail io latency when iops limit is enforcedLiu Bo1-6/+1
When an application's iops has exceeded its cgroup's iops limit, surely it is throttled and kernel will set a timer for dispatching, thus IO latency includes the delay. However, the dispatch delay which is calculated by the limit and the elapsed jiffies is suboptimal. As the dispatch delay is only calculated once the application's iops is (iops limit + 1), it doesn't need to wait any longer than the remaining time of the current slice. The difference can be proved by the following fio job and cgroup iops setting, ----- $ echo 4 > /mnt/config/nullb/disk1/mbps # limit nullb's bandwidth to 4MB/s for testing. $ echo "253:1 riops=100 rbps=max" > /sys/fs/cgroup/unified/cg1/io.max $ cat r2.job [global] name=fio-rand-read filename=/dev/nullb1 rw=randread bs=4k direct=1 numjobs=1 time_based=1 runtime=60 group_reporting=1 [file1] size=4G ioengine=libaio iodepth=1 rate_iops=50000 norandommap=1 thinktime=4ms ----- wo patch: file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.7-66-gedfc Starting 1 process read: IOPS=99, BW=400KiB/s (410kB/s)(23.4MiB/60001msec) slat (usec): min=10, max=336, avg=27.71, stdev=17.82 clat (usec): min=2, max=28887, avg=5929.81, stdev=7374.29 lat (usec): min=24, max=28901, avg=5958.73, stdev=7366.22 clat percentiles (usec): | 1.00th=[ 4], 5.00th=[ 4], 10.00th=[ 4], 20.00th=[ 4], | 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 6], 60.00th=[11731], | 70.00th=[11863], 80.00th=[11994], 90.00th=[12911], 95.00th=[22676], | 99.00th=[23725], 99.50th=[23987], 99.90th=[23987], 99.95th=[25035], | 99.99th=[28967] w/ patch: file1: (g=0): rw=randread, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1 fio-3.7-66-gedfc Starting 1 process read: IOPS=100, BW=400KiB/s (410kB/s)(23.4MiB/60005msec) slat (usec): min=10, max=155, avg=23.24, stdev=16.79 clat (usec): min=2, max=12393, avg=5961.58, stdev=5959.25 lat (usec): min=23, max=12412, avg=5985.91, stdev=5951.92 clat percentiles (usec): | 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 4], 20.00th=[ 4], | 30.00th=[ 4], 40.00th=[ 5], 50.00th=[ 47], 60.00th=[11863], | 70.00th=[11994], 80.00th=[11994], 90.00th=[11994], 95.00th=[11994], | 99.00th=[11994], 99.50th=[11994], 99.90th=[12125], 99.95th=[12125], | 99.99th=[12387] Signed-off-by: Liu Bo <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09x86/relocs: Add __end_rodata_aligned to S_RELJoerg Roedel1-0/+1
This new symbol needs to be in the workaround-list for buggy binutils, otherwise the build with gcc-4.6 fails. Fixes: 39d668e04eda ('x86/mm/pti: Make pti_clone_kernel_text() compile on 32 bit') Reported-by: Stephen Rothwell <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Thomas Gleixner <[email protected]> Tested-by: Sedat Dilek <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Linux-Next Mailing List <[email protected]> Link: https://lkml.kernel.org/r/[email protected]
2018-08-09Merge branch 'linus' of ↵Linus Torvalds8-191/+101
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fix from Herbert Xu: "This fixes a performance regression in arm64 NEON crypto as well as a crash in x86 aegis/morus on unsupported CPUs" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: x86/aegis,morus - Fix and simplify CPUID checks crypto: arm64 - revert NEON yield for fast AEAD implementations
2018-08-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds26-161/+178
Pull networking fixes from David Miller: 1) The real fix for the ipv6 route metric leak Sabrina was seeing, from Cong Wang. 2) Fix syzbot triggers AF_PACKET v3 ring buffer insufficient room conditions, from Willem de Bruijn. 3) vsock can reinitialize active work struct, fix from Cong Wang. 4) RXRPC keepalive generator can wedge a cpu, fix from David Howells. 5) Fix locking in AF_SMC ioctl, from Ursula Braun. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: dsa: slave: eee: Allow ports to use phylink net/smc: move sock lock in smc_ioctl() net/smc: allow sysctl rmem and wmem defaults for servers net/smc: no shutdown in state SMC_LISTEN net: aquantia: Fix IFF_ALLMULTI flag functionality rxrpc: Fix the keepalive generator [ver #2] net/mlx5e: Cleanup of dcbnl related fields net/mlx5e: Properly check if hairpin is possible between two functions vhost: reset metadata cache when initializing new IOTLB llc: use refcount_inc_not_zero() for llc_sap_find() dccp: fix undefined behavior with 'cwnd' shift in ccid2_cwnd_restart() tipc: fix an interrupt unsafe locking scenario vsock: split dwork to avoid reinitializations net: thunderx: check for failed allocation lmac->dmacs cxgb4: mk_act_open_req() buggers ->{local, peer}_ip on big-endian hosts packet: refine ring v3 block size test to hold one frame ip6_tunnel: use the right value for ipv4 min mtu check in ip6_tnl_xmit ipv6: fix double refcount of fib6_metrics
2018-08-09block: paride: pd: mark expected switch fall-throughsGustavo A. R. Silva1-0/+2
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1056543 ("Missing break in switch") Addresses-Coverity-ID: 1056544 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09i2c: xlp9xx: Fix case where SSIF read transaction completes earlyGeorge Cherian1-13/+28
During ipmi stress tests we see occasional failure of transactions at the boot time. This happens in the case of a I2C_M_RECV_LEN transactions, when the read transfer completes (with the initial read length of 34) before the driver gets a chance to handle interrupts. The current driver code expects at least 2 interrupts for I2C_M_RECV_LEN transactions. The length is updated during the first interrupt, and the buffer contents are only copied during subsequent interrupts. In case of just one interrupt, we will complete the transaction without copying out the bytes from RX fifo. Update the code to drain the RX fifo after the length update, so that the transaction completes correctly in all cases. Signed-off-by: George Cherian <[email protected]> Signed-off-by: Wolfram Sang <[email protected]> Cc: [email protected]
2018-08-09block: Ensure that a request queue is dissociated from the cgroup controllerBart Van Assche1-0/+15
Several block drivers call alloc_disk() followed by put_disk() if something fails before device_add_disk() is called without calling blk_cleanup_queue(). Make sure that also for this scenario a request queue is dissociated from the cgroup controller. This patch avoids that loading the parport_pc, paride and pf drivers triggers the following kernel crash: BUG: KASAN: null-ptr-deref in pi_init+0x42e/0x580 [paride] Read of size 4 at addr 0000000000000008 by task modprobe/744 Call Trace: dump_stack+0x9a/0xeb kasan_report+0x139/0x350 pi_init+0x42e/0x580 [paride] pf_init+0x2bb/0x1000 [pf] do_one_initcall+0x8e/0x405 do_init_module+0xd9/0x2f2 load_module+0x3ab4/0x4700 SYSC_finit_module+0x176/0x1a0 do_syscall_64+0xee/0x2b0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Reported-by: Alexandru Moise <[email protected]> Fixes: a063057d7c73 ("block: Fix a race between request queue removal and the block cgroup controller") # v4.17 Signed-off-by: Bart Van Assche <[email protected]> Tested-by: Alexandru Moise <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Alexandru Moise <[email protected]> Cc: Joseph Qi <[email protected]> Cc: <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09block: Introduce blk_exit_queue()Bart Van Assche2-24/+31
This patch does not change any functionality. Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Alexandru Moise <[email protected]> Cc: Joseph Qi <[email protected]> Cc: <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09blkcg: Introduce blkg_root_lookup()Bart Van Assche1-0/+18
This new function will be used in a later patch to verify whether a queue has been dissociated from the cgroup controller before being released. Signed-off-by: Bart Van Assche <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ming Lei <[email protected]> Cc: Omar Sandoval <[email protected]> Cc: Johannes Thumshirn <[email protected]> Cc: Alexandru Moise <[email protected]> Cc: Joseph Qi <[email protected]> Cc: <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09block: Remove two superfluous #include directivesBart Van Assche1-2/+0
Commit 12f5b9314545 ("blk-mq: Remove generation seqeunce") removed the only seqcount_t and u64_stats_sync instances from <linux/blkdev.h> but did not remove the corresponding #include directives. Since these include directives are no longer needed, remove them. Signed-off-by: Bart Van Assche <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Keith Busch <[email protected]> Cc: Ming Lei <[email protected]> Cc: Jianchao Wang <[email protected]> Cc: Hannes Reinecke <[email protected]>, Cc: Johannes Thumshirn <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09blk-mq: count the hctx as active before allocating tagJianchao Wang2-2/+9
Currently, we count the hctx as active after allocate driver tag successfully. If a previously inactive hctx try to get tag first time, it may fails and need to wait. However, due to the stale tag ->active_queues, the other shared-tags users are still able to occupy all driver tags while there is someone waiting for tag. Consequently, even if the previously inactive hctx is waked up, it still may not be able to get a tag and could be starved. To fix it, we count the hctx as active before try to allocate driver tag, then when it is waiting the tag, the other shared-tag users will reserve budget for it. Reviewed-by: Ming Lei <[email protected]> Signed-off-by: Jianchao Wang <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09block: bvec_nr_vecs() returns value for wrong slabGreg Edwards1-1/+1
In commit ed996a52c868 ("block: simplify and cleanup bvec pool handling"), the value of the slab index is incremented by one in bvec_alloc() after the allocation is done to indicate an index value of 0 does not need to be later freed. bvec_nr_vecs() was not updated accordingly, and thus returns the wrong value. Decrement idx before performing the lookup. Fixes: ed996a52c868 ("block: simplify and cleanup bvec pool handling") Signed-off-by: Greg Edwards <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09Merge branch 'nvme-4.19' of git://git.infradead.org/nvme into for-4.19/blockJens Axboe9-10/+138
Pull NVMe updates from Christoph: "This should be the last round of NVMe updates before the 4.19 merge window opens. It conatins support for write protected (aka read-only) namespaces from Chaitanya, two ANA fixes from Hannes and a fabrics fix from Tal Shorer." * 'nvme-4.19' of git://git.infradead.org/nvme: nvme-fabrics: fix ctrl_loss_tmo < 0 to reconnect forever nvmet: add ns write protect support nvme: set gendisk read only based on nsattr nvme.h: add support for ns write protect definitions nvme.h: fixup ANA group descriptor format nvme: fixup crash on failed discovery
2018-08-09bcache: trivial - remove tailing backslash in macro BTREE_FLAGShenghui Wang1-1/+1
Remove the tailing backslash in macro BTREE_FLAG in btree.h Signed-off-by: Shenghui Wang <[email protected]> Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: make the pr_err statement used for ENOENT only in sysfs_attatch sectionShenghui Wang1-2/+2
The pr_err statement in the code for sysfs_attatch section would run for various error codes, which maybe confusing. E.g, Run the command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach [the backing dev got attached on the first run] echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg, after the command run twice, we can get: bcache: bch_cached_dev_attach() Can't attach sda6: already attached bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be891 : cache set not found The first statement in the message was right, but the second was confusing. bch_cached_dev_attach has various pr_ statements for various error codes, except ENOENT. After the change, rerun above command twice: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be891 > \ /sys/block/bcache0/bcache/attach In dmesg we only got: bcache: bch_cached_dev_attach() Can't attach sda6: already attached No confusing "cache set not found" message anymore. And for some not exist SET-UUID: echo 796b5c05-b03c-4bc7-9cbd-a8df5e8be898 > \ /sys/block/bcache0/bcache/attach In dmesg we can get: bcache: __cached_dev_store() Can't attach 796b5c05-b03c-4bc7-9cbd-\ a8df5e8be898 : cache set not found Signed-off-by: Shenghui Wang <[email protected]> Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: set max writeback rate when I/O request is idleColy Li7-45/+138
Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") allows the writeback rate to be faster if there is no I/O request on a bcache device. It works well if there is only one bcache device attached to the cache set. If there are many bcache devices attached to a cache set, it may introduce performance regression because multiple faster writeback threads of the idle bcache devices will compete the btree level locks with the bcache device who have I/O requests coming. This patch fixes the above issue by only permitting fast writebac when all bcache devices attached on the cache set are idle. And if one of the bcache devices has new I/O request coming, minimized all writeback throughput immediately and let PI controller __update_writeback_rate() to decide the upcoming writeback rate for each bcache device. Also when all bcache devices are idle, limited wrieback rate to a small number is wast of thoughput, especially when backing devices are slower non-rotation devices (e.g. SATA SSD). This patch sets a max writeback rate for each backing device if the whole cache set is idle. A faster writeback rate in idle time means new I/Os may have more available space for dirty data, and people may observe a better write performance then. Please note bcache may change its cache mode in run time, and this patch still works if the cache mode is switched from writeback mode and there is still dirty data on cache. Fixes: Commit b1092c9af9ed ("bcache: allow quick writeback when backing idle") Cc: [email protected] #4.16+ Signed-off-by: Coly Li <[email protected]> Tested-by: Kai Krakow <[email protected]> Tested-by: Stefan Priebe <[email protected]> Cc: Michael Lyle <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: add code comments for bset.cColy Li1-0/+63
This patch tries to add code comments in bset.c, to make some tricky code and designment to be more comprehensible. Most information of this patch comes from the discussion between Kent and I, he offers very informative details. If there is any mistake of the idea behind the code, no doubt that's from me misrepresentation. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: fix mistaken comments in request.cColy Li1-1/+1
This patch updates code comment in bch_keylist_realloc() by fixing incorrected function names, to make the code to be more comprehennsible. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: fix mistaken code comments in bcache.hColy Li1-3/+3
This patch updates the code comment in struct cache with correct array names, to make the code to be more comprehensible. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: add a comment in super.cColy Li1-0/+1
This patch adds a line of code comment in super.c:register_bdev(), to make code to be more comprehensible. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: avoid unncessary cache prefetch bch_btree_node_get()Coly Li1-7/+7
In bch_btree_node_get() the read-in btree node will be partially prefetched into L1 cache for following bset iteration (if there is). But if the btree node read is failed, the perfetch operations will waste L1 cache space. This patch checkes whether read operation and only does cache prefetch when read I/O succeeded. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: display rate debug parameters to 0 when writeback is not runningColy Li1-10/+16
When writeback is not running, writeback rate should be 0, other value is misleading. And the following dyanmic writeback rate debug parameters should be 0 too, rate, proportional, integral, change otherwise they are misleading when writeback is not running. Signed-off-by: Coly Li <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09bcache: do not check return value of debugfs_create_dir()Coly Li5-13/+21
Greg KH suggests that normal code should not care about debugfs. Therefore no matter successful or failed of debugfs_create_dir() execution, it is unncessary to check its return value. There are two functions called debugfs_create_dir() and check the return value, which are bch_debug_init() and closure_debug_init(). This patch changes these two functions from int to void type, and ignore return values of debugfs_create_dir(). This patch does not fix exact bug, just makes things work as they should. Signed-off-by: Coly Li <[email protected]> Suggested-by: Greg Kroah-Hartman <[email protected]> Cc: [email protected] Cc: Kai Krakow <[email protected]> Cc: Kent Overstreet <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2018-08-09Merge branch 'asoc-4.19' into asoc-nextMark Brown312-4288/+15488
2018-08-09Merge branch 'asoc-4.18' into asoc-linusMark Brown18-33/+111
2018-08-09ASoC: adav80x: mark expected switch fall-throughGustavo A. R. Silva1-0/+1
In preparation to enabling -Wimplicit-fallthrough, mark switch cases where we are expecting to fall through. Addresses-Coverity-ID: 1056531 ("Missing break in switch") Signed-off-by: Gustavo A. R. Silva <[email protected]> Acked-by: Lars-Peter Clausen <[email protected]> Signed-off-by: Mark Brown <[email protected]>
2018-08-09kbuild: remove deprecated host-progs variableMasahiro Yamada3-10/+1
The host-progs has been kept as an alias of hostprogs-y for a long time (at least since the beginning of Git era), with the clear prompt: Usage of host-progs is deprecated. Please replace with hostprogs-y! Enough time for the migration has passed. Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Max Filippov <[email protected]>
2018-08-09init/Kconfig: Use short unix-style option instead of --longnameRob Landley1-2/+2
Avoids warning messages with the latest release of toybox, which never bothered to implement the --longopts nothing was using. Signed-off-by: Rob Landley <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]>
2018-08-09kbuild: make samples really depend on headers_installMasahiro Yamada1-1/+2
Kernel headers must be installed into $(objtree)/usr/include to avoid the build failure of samples. Commit ddea05fa148b ("kbuild: make samples depend on headers_install") addressed this, but "samples/" is only used for the single target build. "make samples/" properly installs kernel headers, but it does not work for general building because a phony target "sample" (no trailing slash) is used. Reported-by: David Howells <[email protected]> Signed-off-by: Masahiro Yamada <[email protected]> Tested-by: David Howells <[email protected]>
2018-08-09platform/x86: Add ACPI i2c-multi-instantiate pseudo driverHans de Goede4-0/+150
On systems with ACPI instantiated i2c-clients, normally there is 1 fw_node per i2c-device and that fw-node contains 1 I2cSerialBus resource for that 1 i2c-device. But in some rare cases the manufacturer has decided to describe multiple i2c-devices in a single ACPI fwnode with multiple I2cSerialBus resources. An earlier attempt to fix this in the i2c-core resulted in a lot of extra code to support this corner-case. This commit introduces a new i2c-multi-instantiate driver which fixes this in a different way. This new driver can be built as a module which will only loaded on affected systems. This driver will instantiate a new i2c-client per I2cSerialBus resource, using the driver_data from the acpi_device_id it is binding to to tell it which chip-type (and optional irq-resource) to use when instantiating. Note this driver depends on a platform device being instantiated for the ACPI fwnode, see the i2c_multi_instantiate_ids list of ACPI device-ids in drivers/acpi/scan.c: acpi_device_enumeration_by_parent(). Acked-by: Andy Shevchenko <[email protected]> Acked-by: Wolfram Sang <[email protected]> Signed-off-by: Hans de Goede <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>
2018-08-09s390/dasd: fix hanging offline processing due to canceled workerStefan Haberland1-2/+5
During offline processing two worker threads are canceled without freeing the device reference which leads to a hanging offline process. Reviewed-by: Jan Hoeppner <[email protected]> Signed-off-by: Stefan Haberland <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2018-08-09s390/dasd: fix panic for failed online processingStefan Haberland1-0/+3
Fix a panic that occurs for a device that got an error in dasd_eckd_check_characteristics() during online processing. For example the read configuration data command may have failed. If this error occurs the device is not being set online and the earlier invoked steps during online processing are rolled back. Therefore dasd_eckd_uncheck_device() is called which needs a valid private structure. But this pointer is not valid if dasd_eckd_check_characteristics() has failed. Check for a valid device->private pointer to prevent a panic. Reviewed-by: Jan Hoeppner <[email protected]> Signed-off-by: Stefan Haberland <[email protected]> Signed-off-by: Martin Schwidefsky <[email protected]>
2018-08-09Merge branch 'regmap-4.19' into regmap-nextMark Brown9-28/+303
2018-08-09Merge tag 'regmap-noinc-read' into regmap-4.19Mark Brown3-1/+100
regmap: Support non-incrementing registers Some devices have individual registers that don't autoincrement the register address during bulk reads but instead repeatedly read the same value, for example for monitoring GPIOs or ADCs. Add support for these.