aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-09-22dm array: introduce cursor apiJoe Thornber2-0/+119
More efficient way to iterate an array due to prefetching (makes use of the new dm_btree_cursor_* api). Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-22dm btree: introduce cursor apiJoe Thornber2-0/+197
This uses prefetching to speed up iteration through a btree. Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-22dm cache policy smq: distribute entries to random levels when switching to smqJoe Thornber1-1/+6
For smq the 32 bit 'hint' stores the multiqueue level that the entry should be stored in. If a different policy has been used previously, and then switched to smq, the hints will be invalid. In which case we used to put all entries in the bottom level of the multiqueue, and then redistribute. Redistribution is faster if we put entries with invalid hints in random levels initially. Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-22dm cache: speed up writing of the hint arrayJoe Thornber5-94/+42
It's far quicker to always delete the hint array and recreate with dm_array_new() because we avoid the copying caused by mutation. Also simplifies the policy interface, replacing the walk_hints() with the simpler get_hint(). Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-22dm array: add dm_array_new()Joe Thornber2-31/+130
dm_array_new() creates a new, populated array more efficiently than starting with an empty one and resizing. Signed-off-by: Joe Thornber <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-15dm mpath: delay the requeue of blk-mq requests while all paths downMike Snitzer1-6/+9
Return DM_MAPIO_DELAY_REQUEUE from .clone_and_map_rq. Also, return false from .busy, if all paths are down, so that blk-mq requests get mapped via .clone_and_map_rq -- which results in DM_MAPIO_DELAY_REQUEUE being returned to dm-rq. This change allows for a noticeable reduction in cpu utilization (reduced kworker load) while all paths are down, e.g.: system CPU idleness (as measured by fio's --idle-prof=system): before: system: 86.58% after: system: 98.60% Signed-off-by: Mike Snitzer <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]>
2016-09-15dm mpath: use dm_mq_kick_requeue_list()Mike Snitzer1-6/+8
When reinstating a path the blk-mq request_queue's requeue_list should get kicked. It makes sense to kick the requeue_list as part of the existing hook (previously only used by bio-based support). Rename process_queued_bios_list to process_queued_io_list. Signed-off-by: Mike Snitzer <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]>
2016-09-15dm rq: introduce dm_mq_kick_requeue_list()Mike Snitzer2-4/+15
Make it possible for a request-based target to kick the DM device's blk-mq request_queue's requeue_list. Signed-off-by: Mike Snitzer <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]>
2016-09-15dm rq: reduce arguments passed to map_request() and ↵Mike Snitzer1-11/+11
dm_requeue_original_request() Signed-off-by: Mike Snitzer <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]>
2016-09-14dm rq: add DM_MAPIO_DELAY_REQUEUE to delay requeue of blk-mq requestsMike Snitzer2-14/+19
Otherwise blk-mq will immediately dispatch requests that are requeued via a BLK_MQ_RQ_QUEUE_BUSY return from blk_mq_ops .queue_rq. Delayed requeue is implemented using blk_mq_delay_kick_requeue_list() with a delay of 5 secs. In the context of DM multipath (all paths down) it doesn't make any sense to requeue more quickly. Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm: convert wait loops to use autoremove_wake_function()Bart Van Assche2-14/+6
Use autoremove_wake_function() instead of default_wake_function() to make the dm wait loops more similar to other wait loops in the kernel. This patch does not change any functionality. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm: use signal_pending_state() in dm_wait_for_completion()Bart Van Assche1-2/+1
Use signal_pending_state() instead of open-coding it. This patch does not change any functionality but makes it possible to pass TASK_KILLABLE as the second argument of dm_wait_for_completion(). See also commit 16882c1e962b ("sched: fix TASK_WAKEKILL vs SIGKILL race"). Signed-off-by: Bart Van Assche <[email protected]>. Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm: rename task state function argumentsBart Van Assche1-5/+9
Rename 'interruptible' into 'task_state' to make it clear that this argument is a task state instead of a boolean. Also, change type from int to long. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm: add two lockdep_assert_held() statementsBart Van Assche1-0/+4
Document the locking assumptions for the __bind() and __dm_suspend() functions. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm rq: simplify dm_old_stop_queue()Bart Van Assche1-6/+2
This patch does not change any functionality. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm mpath: check if path's request_queue is dying in activate_path()Mike Snitzer1-3/+3
If pg_init_retries is set and a request is queued against a multipath device with all underlying block device request_queues in the "dying" state then an infinite loop is triggered because activate_path() never succeeds and hence never calls pg_init_done(). This change avoids that device removal triggers an infinite loop by failing the activate_path() which causes the "dying" path to be failed. Reported-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected]
2016-09-14dm rq: take request_queue lock while clearing QUEUE_FLAG_STOPPEDMike Snitzer1-5/+14
Every call of queue_flag_clear_unlocked() after block device initialization has finished is wrong if blk_cleanup_queue() can be called concurrently. Convert queue_flag_clear_unlocked() into queue_flag_clear() and protect it by the block layer queue lock. Also, factor out dm_mq_start_queue(). Reported-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected]
2016-09-14dm rq: factor out dm_mq_stop_queue()Bart Van Assche1-8/+20
Also, check that the blk-mq request_queue isn't already stopped. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14dm: mark request_queue dead before destroying the DM deviceBart Van Assche1-0/+5
This avoids that new requests are queued while __dm_destroy() is in progress. Signed-off-by: Bart Van Assche <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected]
2016-09-14dm: return correct error code in dm_resume()'s retry loopMinfei Huang1-3/+2
dm_resume() will return success (0) rather than -EINVAL if !dm_suspended_md() upon retry within dm_resume(). Reset the error code at the start of dm_resume()'s retry loop. Also, remove a useless assignment at the end of dm_resume(). Fixes: ffcc393641 ("dm: enhance internal suspend and resume interface") Cc: [email protected] # 3.19+ Signed-off-by: Minfei Huang <[email protected]> Signed-off-by: Mike Snitzer <[email protected]>
2016-09-14blk-mq: introduce blk_mq_delay_kick_requeue_list()Mike Snitzer3-5/+14
blk_mq_delay_kick_requeue_list() provides the ability to kick the q->requeue_list after a specified time. To do this the request_queue's 'requeue_work' member was changed to a delayed_work. blk_mq_delay_kick_requeue_list() allows DM to defer processing requeued requests while it doesn't make sense to immediately requeue them (e.g. when all paths in a DM multipath have failed). Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: remove IOPRIO_BITSChristoph Hellwig1-1/+0
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14bio.h: remove a very outdated commentChristoph Hellwig1-2/+0
Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: remove bio_destructor_tChristoph Hellwig1-1/+0
Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: Improve bio_set_op_attrs() robustnessBart Van Assche1-5/+12
Since REQ_OP_BITS == 3 and __REQ_NR_BITS == 30 it is not that hard to pass an op_flags argument to bio_set_op_attrs() that is larger than the number of bits reserved for the op_flags argument. Complain if this happens. Additionally, ensure that negative arguments trigger a complaint (1 << ... is signed while 1U << ... is unsigned; adding 0U to an integer expression causes it to be promoted to an unsigned type). Signed-off-by: Bart Van Assche <[email protected]> Cc: Mike Christie <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block, dm-crypt, btrfs: Introduce bio_flags()Bart Van Assche3-4/+6
Introduce the bio_flags() macro. Ensure that the second argument of bio_set_op_attrs() only contains flags and no operation. This patch does not change any functionality. Signed-off-by: Bart Van Assche <[email protected]> Cc: Mike Christie <[email protected]> Cc: Chris Mason <[email protected]> (maintainer:BTRFS FILE SYSTEM) Cc: Josef Bacik <[email protected]> (maintainer:BTRFS FILE SYSTEM) Cc: Mike Snitzer <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: Document that bio_op() uses the data type of bio.bi_opfBart Van Assche1-1/+1
Make it clear that the sizeof(unsigned int) expression in BIO_OP_SHIFT refers to the bi_opf member of struct bio. Signed-off-by: Bart Van Assche <[email protected]> Cc: Mike Christie <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Hannes Reinecke <[email protected]> Cc: Damien Le Moal <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: remove remnant refs to hardsectLinus Walleij3-4/+4
commit e1defc4ff0cf57aca6c5e3ff99fa503f5943c1f1 "block: Do away with the notion of hardsect_size" removed the notion of "hardware sector size" from the kernel in favor of logical block size, but references remain in comments and documentation. Update the remaining sites mentioning hardsect. Signed-off-by: Linus Walleij <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: remove blk_mq_alloc_single_hw_queue() prototypeLinus Walleij1-1/+0
The blk_mq_alloc_single_hw_queue() is a prototype artifact that should have been removed with commit cdef54dd85ad66e77262ea57796a3e81683dd5d6 "blk-mq: remove alloc_hctx and free_hctx methods" where the last users of it were deleted. Fixes: cdef54dd85ad ("blk-mq: remove alloc_hctx and free_hctx methods") Signed-off-by: Linus Walleij <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block_dev: remove DAX leftoversChristoph Hellwig1-10/+1
DAX support for block devices was removed in commits 03cdad ("block: disable block device DAX by default") and 99a01cd ("block: remove BLK_DEV_DAX config option"), but we still kept a call to dax_do_io and some uneeded i_flags manipulations introduced in commit bbab37 ("block: Add support for DAX reads/writes to block devices"). Remove those leftovers. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Johannes Thumshirn <[email protected]> Acked-by: Dan Williams <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: enable zeroing of io_poll statisticsStephen Bates1-1/+10
Allow the io_poll statistics to be zeroed to make for easier logging of polling event. Signed-off-by: Stephen Bates <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-14block: add poll_considered statisticStephen Bates3-3/+10
In order to help determine the effectiveness of polling in a running system it is usful to determine the ratio of how often the poll function is called vs how often the completion is checked. For this reason we add a poll_considered variable and add it to the sysfs entry for io_poll. Signed-off-by: Stephen Bates <[email protected]> Acked-by: Christoph Hellwig <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-08nbd: allow block mq to deal with timeoutsJosef Bacik1-37/+14
Instead of rolling our own timer, just utilize the blk mq req timeout and do the disconnect if any of our commands timeout. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-08nbd: use flags instead of boolJosef Bacik1-8/+10
In preparation for some future changes, change a few of the state bools over to normal bits to set/clear properly. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-08nbd: don't shutdown sock with irq's disabledJosef Bacik1-4/+15
We hit a warning when shutting down the nbd connection because we have irq's disabled. We don't really need to do the shutdown under the lock, just clear the nbd->sock. So do the shutdown outside of the irq. This gets rid of the warning. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-09-08nbd: convert to blkmqJosef Bacik1-208/+129
This moves NBD over to using blkmq, which allows us to get rid of the NBD wide queue lock and the async submit kthread. We will start with 1 hw queue for now, but I plan to add multiple tcp connection support in the future and we'll fix how we set the hwqueue's. Signed-off-by: Josef Bacik <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-08-29mtip32xx: mark symbols static where possibleBaoyou Xie1-1/+1
We get 1 warning when biuld kernel with W=1: drivers/block/mtip32xx/mtip32xx.c:3689:6: warning: no previous prototype for 'mtip_block_release' [-Wmissing-prototypes] In fact, this function is only used in the file in which it is declared and don't need a declaration, but can be made static. so this patch marks it 'static'. Signed-off-by: Baoyou Xie <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2016-08-29blk-mq: prefetch request in blk_mq_tag_to_rq()Jens Axboe1-1/+4
When drivers or the core calls this function, they usually dereference the request shortly there after. Prefetch the first cache line. Profiling IO workloads shows that this is the most common cache miss on the block side of things. Signed-off-by: Jens Axboe <[email protected]>
2016-08-29blk-mq: improve layout of blk_mq_hw_ctxJens Axboe1-4/+5
Various cache line optimizations: - Move delay_work towards the end. It's huge, and we don't use it a lot (only SCSI). - Move the atomic state into the same cacheline as the the dispatch list and lock. - Rearrange a few members to pack it better. - Shrink the max-order for dispatch accounting from 10 to 7. This means that ->dispatched[] and ->run now take up their own cacheline. This shrinks struct blk_mq_hw_ctx down to 8 cachelines. Signed-off-by: Jens Axboe <[email protected]>
2016-08-29blk-mq: turn hctx->run_work into a regular work structJens Axboe3-7/+6
We don't need the larger delayed work struct, since we always run it immediately. Signed-off-by: Jens Axboe <[email protected]>
2016-08-29block: add kblockd_schedule_work_on()Jens Axboe2-1/+7
Add a helper to schedule a regular struct work on a particular CPU. Signed-off-by: Jens Axboe <[email protected]>
2016-08-29workqueue: add cancel_work()Jens Axboe2-14/+27
Like cancel_delayed_work(), but for regular work. Signed-off-by: Jens Axboe <[email protected]> Mehed-by: Tejun Heo <[email protected]> Acked-by: Tejun Heo <[email protected]>
2016-08-28Linux 4.8-rc4Linus Torvalds1-1/+1
2016-08-28Merge tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds21-71/+424
Pull drm fixes from Dave Airlie: "A bunch of fixes covering i915, amdgpu, one tegra and some core DRM ones. Nothing too strange at this point" * tag 'drm-fixes-for-4.8-rc4' of git://people.freedesktop.org/~airlied/linux: (21 commits) drm/atomic: Don't potentially reset color_mgmt_changed on successive property updates. drm: Protect fb_defio in drivers with CONFIG_KMS_FBDEV_EMULATION drm/amdgpu: skip TV/CV in display parsing drm/amdgpu: avoid a possible array overflow drm/amdgpu: fix lru size grouping v2 drm/tegra: dsi: Enhance runtime power management drm/i915: Fix botched merge that downgrades CSR versions. drm/i915/skl: Ensure pipes with changed wms get added to the state drm/i915/gen9: Only copy WM results for changed pipes to skl_hw drm/i915/skl: Add support for the SAGV, fix underrun hangs drm/i915/gen6+: Interpret mailbox error flags drm/i915: Reattach comment, complete type specification drm/i915: Unconditionally flush any chipset buffers before execbuf drm/i915/gen9: Drop invalid WARN() during data rate calculation drm/i915/gen9: Initialize intel_state->active_crtcs during WM sanitization (v2) drm: Reject page_flip for !DRIVER_MODESET drm/amdgpu: fix timeout value check in amd_sched_job_recovery drm/amdgpu: fix sdma_v2_4_ring_test_ib drm/amdgpu: fix amdgpu_move_blit on 32bit systems drm/radeon: fix radeon_move_blit on 32bit systems ...
2016-08-29drm/atomic: Don't potentially reset color_mgmt_changed on successive ↵Mario Kleiner1-3/+3
property updates. Due to assigning the 'replaced' value instead of or'ing it, if drm_atomic_crtc_set_property() gets called multiple times, the last call will define the color_mgmt_changed flag, so a non-updating call to a property can reset the flag and prevent actual hw state updates required by preceding property updates. Signed-off-by: Mario Kleiner <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: <[email protected]> # v4.6+ Reviewed-by: Daniel Vetter <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
2016-08-28Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds5-7/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Thomas Gleixner: "A few fixes from the perf departement - prevent a imbalanced preemption disable in the events teardown code - prevent out of bound acces in perf userspace - make perf tools compile with UCLIBC again - a fix for the userspace unwinder utility" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Use this_cpu_ptr() when stopping AUX events perf evsel: Do not access outside hw cache name arrays tools lib: Reinstate strlcpy() header guard with __UCLIBC__ perf unwind: Use addr_location::addr instead of ip for entries
2016-08-28Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single bugfix to prevent irq remapping when the ioapic is disabled" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/apic: Do not init irq remapping if ioapic is disabled
2016-08-28Merge branch 'irq-urgent-for-linus' of ↵Linus Torvalds7-9/+55
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "This lot provides: - plug a hotplug race in the new affinity infrastructure - a fix for the trigger type of chained interrupts - plug a potential memory leak in the core code - a few fixes for ARM and MIPS GICs" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/mips-gic: Implement activate op for device domain irqchip/mips-gic: Cleanup chip and handler setup genirq/affinity: Use get/put_online_cpus around cpumask operations genirq: Fix potential memleak when failing to get irq pm irqchip/gicv3-its: Disable the ITS before initializing it irqchip/gicv3: Remove disabling redistributor and group1 non-secure interrupts irqchip/gic: Allow self-SGIs for SMP on UP configurations genirq: Correctly configure the trigger on chained interrupts
2016-08-28Merge branch 'timers-urgent-for-linus' of ↵Linus Torvalds7-8/+40
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "A few updates for timers & co: - prevent a livelock in the timekeeping code when debugging is enabled - prevent out of bounds access in the timekeeping debug code - various fixes in clocksource drivers - a new maintainers entry" * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/sun4i: Clear interrupts after stopping timer in probe function drivers/clocksource/pistachio: Fix memory corruption in init clocksource/drivers/timer-atmel-pit: Enable mck clock clocksource/drivers/pxa: Fix include files for compilation MAINTAINERS: Add ARM ARCHITECTED TIMER entry timekeeping: Cap array access in timekeeping_debug timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING
2016-08-27Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds13-139/+234
Pull KVM fixes from Paolo Bonzini: "ARM: - fixes for ITS init issues, error handling, IRQ leakage, race conditions - an erratum workaround for timers - some removal of misleading use of errors and comments - a fix for GICv3 on 32-bit guests MIPS: - fix for where the guest could wrongly map the first page of physical memory x86: - nested virtualization fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: MIPS: KVM: Check for pfn noslot case kvm: nVMX: fix nested tsc scaling KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC arm64: KVM: report configured SRE value to 32-bit world arm64: KVM: remove misleading comment on pmu status KVM: arm/arm64: timer: Workaround misconfigured timer interrupt arm64: Document workaround for Cortex-A72 erratum #853709 KVM: arm/arm64: Change misleading use of is_error_pfn KVM: arm64: ITS: avoid re-mapping LPIs KVM: arm64: check for ITS device on MSI injection KVM: arm64: ITS: move ITS registration into first VCPU run KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic KVM: arm64: vgic-its: Plug race in vgic_put_irq KVM: arm64: vgic-its: Handle errors from vgic_add_lpi KVM: arm64: ITS: return 1 on successful MSI injection