aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2017-01-14aio: fix lock dep warningShaohua Li1-2/+4
lockdep reports a warnning. file_start_write/file_end_write only acquire/release the lock for regular files. So checking the files in aio side too. [ 453.532141] ------------[ cut here ]------------ [ 453.533011] WARNING: CPU: 1 PID: 1298 at ../kernel/locking/lockdep.c:3514 lock_release+0x434/0x670 [ 453.533011] DEBUG_LOCKS_WARN_ON(depth <= 0) [ 453.533011] Modules linked in: [ 453.533011] CPU: 1 PID: 1298 Comm: fio Not tainted 4.9.0+ #964 [ 453.533011] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.0-1.fc24 04/01/2014 [ 453.533011] ffff8803a24b7a70 ffffffff8196cffb ffff8803a24b7ae8 0000000000000000 [ 453.533011] ffff8803a24b7ab8 ffffffff81091ee1 ffff8803a5dba700 00000dba00000008 [ 453.533011] ffffed0074496f59 ffff8803a5dbaf54 ffff8803ae0f8488 fffffffffffffdef [ 453.533011] Call Trace: [ 453.533011] [<ffffffff8196cffb>] dump_stack+0x67/0x9c [ 453.533011] [<ffffffff81091ee1>] __warn+0x111/0x130 [ 453.533011] [<ffffffff81091f97>] warn_slowpath_fmt+0x97/0xb0 [ 453.533011] [<ffffffff81091f00>] ? __warn+0x130/0x130 [ 453.533011] [<ffffffff8191b789>] ? blk_finish_plug+0x29/0x60 [ 453.533011] [<ffffffff811205d4>] lock_release+0x434/0x670 [ 453.533011] [<ffffffff8198af94>] ? import_single_range+0xd4/0x110 [ 453.533011] [<ffffffff81322195>] ? rw_verify_area+0x65/0x140 [ 453.533011] [<ffffffff813aa696>] ? aio_write+0x1f6/0x280 [ 453.533011] [<ffffffff813aa6c9>] aio_write+0x229/0x280 [ 453.533011] [<ffffffff813aa4a0>] ? aio_complete+0x640/0x640 [ 453.533011] [<ffffffff8111df20>] ? debug_check_no_locks_freed+0x1a0/0x1a0 [ 453.533011] [<ffffffff8114793a>] ? debug_lockdep_rcu_enabled.part.2+0x1a/0x30 [ 453.533011] [<ffffffff81147985>] ? debug_lockdep_rcu_enabled+0x35/0x40 [ 453.533011] [<ffffffff812a92be>] ? __might_fault+0x7e/0xf0 [ 453.533011] [<ffffffff813ac9bc>] do_io_submit+0x94c/0xb10 [ 453.533011] [<ffffffff813ac2ae>] ? do_io_submit+0x23e/0xb10 [ 453.533011] [<ffffffff813ac070>] ? SyS_io_destroy+0x270/0x270 [ 453.533011] [<ffffffff8111d7b3>] ? mark_held_locks+0x23/0xc0 [ 453.533011] [<ffffffff8100201a>] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 453.533011] [<ffffffff813acb90>] SyS_io_submit+0x10/0x20 [ 453.533011] [<ffffffff824f96aa>] entry_SYSCALL_64_fastpath+0x18/0xad [ 453.533011] [<ffffffff81119190>] ? trace_hardirqs_off_caller+0xc0/0x110 [ 453.533011] ---[ end trace b2fbe664d1cc0082 ]--- Cc: Dmitry Monakhov <[email protected]> Cc: Jan Kara <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Al Viro <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Shaohua Li <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-01-14Merge tag 'dmaengine-fix-4.10-rc4' of ↵Linus Torvalds9-48/+87
git://git.infradead.org/users/vkoul/slave-dma Pull dmaengine fixes from Vinod Koul: "The fixes this time around are spread over drivers, pretty normal update: - PCI ID for SKL ioatdma, workaround for SKX and ioat_alloc_chan_resources sleepy allocation fix - dw kconfig typo fix - null pointer deref for stm32 - MAINTAINERS Update for at_hdmac - pl330 runtime pm fixes - omap-dma port window fix - rcar-dmac unmap slave resource fix" * tag 'dmaengine-fix-4.10-rc4' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: rcar-dmac: unmap slave resource when channel is freed dmaengine: omap-dma: Fix the port_window support dmaengine: iota: ioat_alloc_chan_resources should not perform sleeping allocations. dmaengine: pl330: Fix runtime PM support for terminated transfers MAINTAINERS: dmaengine: Update + Hand over the at_hdmac driver to Ludovic dmaengine: omap-dma: Fix dynamic lch_map allocation dmaengine: ti-dma-crossbar: Add some 'of_node_put()' in error path. dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status dmaengine: stm32-dma: Set correct args number for DMA request from DT dmaengine: dw: fix typo in Kconfig dmaengine: ioatdma: workaround SKX ioatdma version dmaengine: ioatdma: Add Skylake PCI Dev ID
2017-01-14drm/i915: Eliminate superfluous i915_ggtt_view_normalChris Wilson4-9/+2
Since commit 058d88c4330f ("drm/i915: Track pinned VMA"), there is only one user of i915_ggtt_view_normal rodate. Just treat NULL as no special view in pin_to_display() like everywhere else. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14drm/i915: Eliminate superfluous i915_ggtt_view_rotatedChris Wilson3-7/+2
It is only being used to clear a struct and set the type, after which it is overwritten. Since we no longer check the unset bits of the union, skipping the clear is permissible. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14drm/i915: Convert i915_ggtt_view to use an anonymous unionChris Wilson7-27/+32
Reading the ggtt_views is much more pleasant without the extra characters from specifying the union (i.e. ggtt_view.partial rather than ggtt_view.params.partial). To make this work inside i915_vma_compare() with only a single memcmp requires us to ensure that there are no uninitialised bytes within each branch of the union (we make sure the structs are packed) and we need to store the size of each branch. v4: Rewrite changelog and add comments explaining the assert. Signed-off-by: Chris Wilson <[email protected]> Cc: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Joonas Lahtinen <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]>
2017-01-14drm/i915: Stop clearing i915_ggtt_viewChris Wilson1-1/+0
As we now use a compact memcmp in i915_vma_compare(), we can forgo clearing the entire view and only set the precise parameters used in this view. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14drm/i915: Compact memcmp in i915_vma_compare()Chris Wilson2-13/+35
In preparation for the next patch to convert to using an anonymous union and leaving the excess bytes in the union uninitialised, we first need to make sure we do not compare using those uninitialised bytes. We also want to preserve the compactness of the code, avoiding a second call to memcmp or introducing a switch, so we take advantage of using the type as an encoded size (as well as a unique identifier for each type of view). v2: Add the rationale for why we encode size into ggtt_view.type as a comment before the memcmp() v3: Use a switch to also assert that no two i915_ggtt_view_type have the same value. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14drm/i915: Mark the ggtt_view structs as packedChris Wilson1-2/+12
In the next few patches, we will depend upon there being no uninitialised bits inside the ggtt_view. To ensure this we add the __packed attribute and double check with a build bug that gcc hasn't expanded the struct to include some padding bytes. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14drm/i915: Name the anonymous structs inside i915_ggtt_viewChris Wilson1-5/+7
Naming this pair will become useful shortly... Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Tvrtko Ursulin <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-14efi/x86: Prune invalid memory map entries and fix boot regressionPeter Jones2-0/+67
Some machines, such as the Lenovo ThinkPad W541 with firmware GNET80WW (2.28), include memory map entries with phys_addr=0x0 and num_pages=0. These machines fail to boot after the following commit, commit 8e80632fb23f ("efi/esrt: Use efi_mem_reserve() and avoid a kmalloc()") Fix this by removing such bogus entries from the memory map. Furthermore, currently the log output for this case (with efi=debug) looks like: [ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0xffffffffffffffff] (0MB) This is clearly wrong, and also not as informative as it could be. This patch changes it so that if we find obviously invalid memory map entries, we print an error and skip those entries. It also detects the display of the address range calculation overflow, so the new output is: [ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries: [ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[0x0000000000000000-0x0000000000000000] (invalid) It also detects memory map sizes that would overflow the physical address, for example phys_addr=0xfffffffffffff000 and num_pages=0x0200000000000001, and prints: [ 0.000000] efi: [Firmware Bug]: Invalid EFI memory map entries: [ 0.000000] efi: mem45: [Reserved | | | | | | | | | | | | ] range=[phys_addr=0xfffffffffffff000-0x20ffffffffffffffff] (invalid) It then removes these entries from the memory map. Signed-off-by: Peter Jones <[email protected]> Signed-off-by: Ard Biesheuvel <[email protected]> [ardb: refactor for clarity with no functional changes, avoid PAGE_SHIFT] Signed-off-by: Matt Fleming <[email protected]> [Matt: Include bugzilla info in commit log] Cc: <[email protected]> # v4.9+ Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: https://bugzilla.kernel.org/show_bug.cgi?id=191121 Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14Revert "driver core: Add deferred_probe attribute to devices in sysfs"Greg Kroah-Hartman4-34/+0
This reverts commit 6751667a29d6fd64afb9ce30567ad616b68ed789. Rob Herring objected to it, and a replacement for it will be added using debugfs in the future. Cc: Ben Hutchings <[email protected]> Reported-by: Rob Herring <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-14perf/x86: Reject non sampling events with precise_ipJiri Olsa1-0/+4
As Peter suggested [1] rejecting non sampling PEBS events, because they dont make any sense and could cause bugs in the NMI handler [2]. [1] http://lkml.kernel.org/r/20170103094059.GC3093@worktop [2] http://lkml.kernel.org/r/[email protected] Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/20170103142454.GA26251@krava Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14perf/x86/intel: Account interrupts for PEBS errorsJiri Olsa3-17/+37
It's possible to set up PEBS events to get only errors and not any data, like on SNB-X (model 45) and IVB-EP (model 62) via 2 perf commands running simultaneously: taskset -c 1 ./perf record -c 4 -e branches:pp -j any -C 10 This leads to a soft lock up, because the error path of the intel_pmu_drain_pebs_nhm() does not account event->hw.interrupt for error PEBS interrupts, so in case you're getting ONLY errors you don't have a way to stop the event when it's over the max_samples_per_tick limit: NMI watchdog: BUG: soft lockup - CPU#22 stuck for 22s! [perf_fuzzer:5816] ... RIP: 0010:[<ffffffff81159232>] [<ffffffff81159232>] smp_call_function_single+0xe2/0x140 ... Call Trace: ? trace_hardirqs_on_caller+0xf5/0x1b0 ? perf_cgroup_attach+0x70/0x70 perf_install_in_context+0x199/0x1b0 ? ctx_resched+0x90/0x90 SYSC_perf_event_open+0x641/0xf90 SyS_perf_event_open+0x9/0x10 do_syscall_64+0x6c/0x1f0 entry_SYSCALL64_slow_path+0x25/0x25 Add perf_event_account_interrupt() which does the interrupt and frequency checks and call it from intel_pmu_drain_pebs_nhm()'s error path. We keep the pending_kill and pending_wakeup logic only in the __perf_event_overflow() path, because they make sense only if there's any data to deliver. Signed-off-by: Jiri Olsa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Vince Weaver <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' racePeter Zijlstra1-4/+54
Di Shen reported a race between two concurrent sys_perf_event_open() calls where both try and move the same pre-existing software group into a hardware context. The problem is exactly that described in commit: f63a8daa5812 ("perf: Fix event->ctx locking") ... where, while we wait for a ctx->mutex acquisition, the event->ctx relation can have changed under us. That very same commit failed to recognise sys_perf_event_context() as an external access vector to the events and thereby didn't apply the established locking rules correctly. So while one sys_perf_event_open() call is stuck waiting on mutex_lock_double(), the other (which owns said locks) moves the group about. So by the time the former sys_perf_event_open() acquires the locks, the context we've acquired is stale (and possibly dead). Apply the established locking rules as per perf_event_ctx_lock_nested() to the mutex_lock_double() for the 'move_group' case. This obviously means we need to validate state after we acquire the locks. Reported-by: Di Shen (Keen Lab) Tested-by: John Dias <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kees Cook <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Min Chong <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Fixes: f63a8daa5812 ("perf: Fix event->ctx locking") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14perf/core: Fix sys_perf_event_open() vs. hotplugPeter Zijlstra1-22/+48
There is problem with installing an event in a task that is 'stuck' on an offline CPU. Blocked tasks are not dis-assosciated from offlined CPUs, after all, a blocked task doesn't run and doesn't require a CPU etc.. Only on wakeup do we ammend the situation and place the task on a available CPU. If we hit such a task with perf_install_in_context() we'll loop until either that task wakes up or the CPU comes back online, if the task waking depends on the event being installed, we're stuck. While looking into this issue, I also spotted another problem, if we hit a task with perf_install_in_context() that is in the middle of being migrated, that is we observe the old CPU before sending the IPI, but run the IPI (on the old CPU) while the task is already running on the new CPU, things also go sideways. Rework things to rely on task_curr() -- outside of rq->lock -- which is rather tricky. Imagine the following scenario where we're trying to install the first event into our task 't': CPU0 CPU1 CPU2 (current == t) t->perf_event_ctxp[] = ctx; smp_mb(); cpu = task_cpu(t); switch(t, n); migrate(t, 2); switch(p, t); ctx = t->perf_event_ctxp[]; // must not be NULL smp_function_call(cpu, ..); generic_exec_single() func(); spin_lock(ctx->lock); if (task_curr(t)) // false add_event_to_ctx(); spin_unlock(ctx->lock); perf_event_context_sched_in(); spin_lock(ctx->lock); // sees event So its CPU0's store of t->perf_event_ctxp[] that must not go 'missing'. Because if CPU2's load of that variable were to observe NULL, it would not try to schedule the ctx and we'd have a task running without its counter, which would be 'bad'. As long as we observe !NULL, we'll acquire ctx->lock. If we acquire it first and not see the event yet, then CPU0 must observe task_curr() and retry. If the install happens first, then we must see the event on sched-in and all is well. I think we can translate the first part (until the 'must not be NULL') of the scenario to a litmus test like: C C-peterz { } P0(int *x, int *y) { int r1; WRITE_ONCE(*x, 1); smp_mb(); r1 = READ_ONCE(*y); } P1(int *y, int *z) { WRITE_ONCE(*y, 1); smp_store_release(z, 1); } P2(int *x, int *z) { int r1; int r2; r1 = smp_load_acquire(z); smp_mb(); r2 = READ_ONCE(*x); } exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0) Where: x is perf_event_ctxp[], y is our tasks's CPU, and z is our task being placed on the rq of CPU2. The P0 smp_mb() is the one added by this patch, ordering the store to perf_event_ctxp[] from find_get_context() and the load of task_cpu() in task_function_call(). The smp_store_release/smp_load_acquire model the RCpc locking of the rq->lock and the smp_mb() of P2 is the context switch switching from whatever CPU2 was running to our task 't'. This litmus test evaluates into: Test C-peterz Allowed States 7 0:r1=0; 2:r1=0; 2:r2=0; 0:r1=0; 2:r1=0; 2:r2=1; 0:r1=0; 2:r1=1; 2:r2=1; 0:r1=1; 2:r1=0; 2:r2=0; 0:r1=1; 2:r1=0; 2:r2=1; 0:r1=1; 2:r1=1; 2:r2=0; 0:r1=1; 2:r1=1; 2:r2=1; No Witnesses Positive: 0 Negative: 7 Condition exists (0:r1=0 /\ 2:r1=1 /\ 2:r2=0) Observation C-peterz Never 0 7 Hash=e427f41d9146b2a5445101d3e2fcaa34 And the strong and weak model agree. Reported-by: Mark Rutland <[email protected]> Tested-by: Mark Rutland <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Arnaldo Carvalho de Melo <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sebastian Andrzej Siewior <[email protected]> Cc: Stephane Eranian <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vince Weaver <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14x86/mpx: Use compatible types in comparison to fix sparse errorTobias Klauser1-1/+1
info->si_addr is of type void __user *, so it should be compared against something from the same address space. This fixes the following sparse error: arch/x86/mm/mpx.c:296:27: error: incompatible types in comparison expression (different address spaces) Signed-off-by: Tobias Klauser <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-14x86/tsc: Add the Intel Denverton Processor to native_calibrate_tsc()Len Brown1-0/+1
The Intel Denverton microserver uses a 25 MHz TSC crystal, so we can derive its exact [*] TSC frequency using CPUID and some arithmetic, eg.: TSC: 1800 MHz (25000000 Hz * 216 / 3 / 1000000) [*] 'exact' is only as good as the crystal, which should be +/- 20ppm Signed-off-by: Len Brown <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/306899f94804aece6d8fa8b4223ede3b48dbb59c.1484287748.git.len.brown@intel.com Signed-off-by: Ingo Molnar <[email protected]>
2017-01-13Merge branch 'for-linus-4.10' of ↵Linus Torvalds6-82/+117
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "These are all over the place. The tracepoint part of the pull fixes a crash and adds a little more information to two tracepoints, while the rest are good old fashioned fixes" * 'for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: make tracepoint format strings more compact Btrfs: add truncated_len for ordered extent tracepoints Btrfs: add 'inode' for extent map tracepoint btrfs: fix crash when tracepoint arguments are freed by wq callbacks Btrfs: adjust outstanding_extents counter properly when dio write is split Btrfs: fix lockdep warning about log_mutex Btrfs: use down_read_nested to make lockdep silent btrfs: fix locking when we put back a delayed ref that's too new btrfs: fix error handling when run_delayed_extent_op fails btrfs: return the actual error value from from btrfs_uuid_tree_iterate
2017-01-13Merge tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-clientLinus Torvalds2-2/+7
Pull ceph fixes from Ilya Dryomov: "Two small fixups for the filesystem changes that went into this merge window" * tag 'ceph-for-4.10-rc4' of git://github.com/ceph/ceph-client: ceph: fix get_oldest_context() ceph: fix mds cluster availability check
2017-01-13Merge tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfioLinus Torvalds3-10/+18
Pull VFIO fixes from Alex Williamson: - Cleanups and bug fixes for the mtty sample driver (Dan Carpenter) - Export and make use of has_capability() to fix incorrect use of ns_capable() for testing task capabilities (Jike Song) * tag 'vfio-v4.10-rc4' of git://github.com/awilliam/linux-vfio: vfio/type1: Remove pid_namespace.h include vfio iommu type1: fix the testing of capability for remote task capability: export has_capability vfio-mdev: remove some dead code vfio-mdev: buffer overflow in ioctl() vfio-mdev: return -EFAULT if copy_to_user() fails
2017-01-13Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds7-16/+80
Pull KVM fixes from Paolo Bonzini: - fix for module unload vs deferred jump labels (note: there might be other buggy modules!) - two NULL pointer dereferences from syzkaller - also syzkaller: fix emulation of fxsave/fxrstor/sgdt/sidt, problem made worse during this merge window, "just" kernel memory leak on releases - fix emulation of "mov ss" - somewhat serious on AMD, less so on Intel * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: fix emulation of "MOV SS, null selector" KVM: x86: fix NULL deref in vcpu_scan_ioapic KVM: eventfd: fix NULL deref irqbypass consumer KVM: x86: Introduce segmented_write_std KVM: x86: flush pending lapic jump label updates on module unload jump_labels: API for flushing deferred jump label updates
2017-01-13Merge tag 'arm64-fixes' of ↵Linus Torvalds2-10/+28
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Fix huge_ptep_set_access_flags() to return "changed" when any of the ptes in the contiguous range is changed, not just the last one - Fix the adr_l assembly macro to work in modules under KASLR * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: assembler: make adr_l work in modules under KASLR arm64: hugetlb: fix the wrong return value for huge_ptep_set_access_flags
2017-01-13block: don't try to discard from __blkdev_issue_zerooutChristoph Hellwig1-7/+6
Discard can return -EIO asynchronously if the alignment for the request isn't suitable for the driver, which makes a proper fallback to other methods in __blkdev_issue_zeroout impossible. Thus only issue a sync discard from blkdev_issue_zeroout an don't try discard at all from __blkdev_issue_zeroout as a non-invasive workaround. One more reason why abusing discard for zeroing must die.. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Eryu Guan <[email protected]> Fixes: e73c23ff ("block: add async variant of blkdev_issue_zeroout") Signed-off-by: Jens Axboe <[email protected]>
2017-01-13sd: remove __data_len hack for WRITE SAMEChristoph Hellwig1-16/+1
Now that we have the blk_rq_payload_bytes helper available to determine the actual I/O size we don't need to mess around with __data_len for WRITE SAME. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-13nvme: use blk_rq_payload_bytesChristoph Hellwig4-30/+15
The new blk_rq_payload_bytes generalizes the payload length hacks that nvme_map_len did before. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-13scsi: use blk_rq_payload_bytesChristoph Hellwig1-1/+1
Without that we'll pass a wrong payload size in cmd->sdb, which can lead to hangs with drivers that need the total transfer size. Signed-off-by: Christoph Hellwig <[email protected]> Reported-by: Chris Valean <[email protected]> Reported-by: Dexuan Cui <[email protected]> Fixes: f9d03f96 ("block: improve handling of the magic discard payload") Reviewed-by: Hannes Reinecke <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-13block: add blk_rq_payload_bytesChristoph Hellwig1-0/+13
Add a helper to calculate the actual data transfer size for special payload requests. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-13Merge tag 'scsi-fixes' of ↵Linus Torvalds7-6/+26
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "The major fix is the bfa firmware, since the latest 10Gb cards fail probing with the current firmware. The rest is a set of minor fixes: one missed Kconfig dependency causing randconfig failures, a missed error return on an error leg, a change for how multiqueue waits on a blocked device and a don't reset while in reset fix" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: bfa: Increase requested firmware version to 3.2.5.1 scsi: snic: Return error code on memory allocation failure scsi: fnic: Avoid sending reset to firmware when another reset is in progress scsi: qedi: fix build, depends on UIO scsi: scsi-mq: Wait for .queue_rq() if necessary
2017-01-13Merge branch 'for-linus' of ↵Linus Torvalds8-10/+20
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input updates from Dmitry Torokhov: "Small driver fixups" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: elants_i2c - avoid divide by 0 errors on bad touchscreen data Input: adxl34x - make it enumerable in ACPI environment Input: ALPS - fix TrackStick Y axis handling for SS5 hardware Input: synaptics-rmi4 - fix F03 build error when serio is module Input: xpad - use correct product id for x360w controllers Input: synaptics_i2c - change msleep to usleep_range for small msecs Input: i8042 - add Pegatron touchpad to noloop table Input: joydev - remove unused linux/miscdevice.h include
2017-01-13drm/i915/psr: report live PSR2 StateNagaraju, Vathsala2-0/+25
Reports live state of PSR2 form PSR2_STATUS register. bit field 31:28 gives the live state of PSR2. It can be used to check if system is in deep sleep, selective update or selective update standby. During video play back, we can use this to check if system is entering SU mode or not. when system is in idle state, DEEP_SLEEP(8) must be entered. When video playback is happening, system must be in SLEEP(3 / selective update) or SU_STANDBY( 6 / selective update standby) v2: (Rodrigo) - Remove EDP_PSR2_STATUS_TG_ON=a ,instead use ARRAY_SIZE Cc: Rodrigo Vivi <[email protected]> Cc: Jim Bride <[email protected]> Signed-off-by: Vathsala Nagaraju <[email protected]> Signed-off-by: Patil Deepti <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-13drm/i915/psr: enable psr2 for y cordinate panelsNagaraju, Vathsala1-0/+9
Psr2 is enabled only for y cordinate panels.Once GTC (global time code) is implemented,this restriction is removed so that psr2 can work on panels without y cordinate support. v2: (Rodrigo) - Move the check to intel_psr_match_conditions v3: (Rodrigo) - add return false v4: rebase Cc: Rodrigo Vivi <[email protected]> Cc: Jim Bride <[email protected]> Signed-off-by: Vathsala Nagaraju <[email protected]> Signed-off-by: Patil Deepti <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-13drm/i915/psr: set PSR_MASK bits for deep sleepNagaraju, Vathsala2-13/+27
Program EDP_PSR_DEBUG_CTL (PSR_MASK) to enable system to go to deep sleep while in psr2.PSR2_STATUS bit 31:28 should report value 8 , if system enters deep sleep state. Also, EDP_FRAMES_BEFORE_SU_ENTRY is set 1 , if not set, flickering is observed on psr2 panel. v2: (Ilia Mirkin) - Remove duplicate bit definition 25:27 v3: rebase v4: rebase v5: rebase Cc: Rodrigo Vivi <[email protected]> Cc: Jim Bride <[email protected]> Signed-off-by: Vathsala Nagaraju <[email protected]> Signed-off-by: Patil Deepti <[email protected]> Reviewed-by: Jim Bride <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-13drm/i915/psr: set CHICKEN_TRANS for psr2Nagaraju, Vathsala2-0/+13
As per bpsec, CHICKEN_TRANS_EDP bit 12 ,15 must be programmed in psr2 enable sequence. bit 12 : Program Transcoder EDP VSC DIP header with a valid setting for PSR2 and Set CHICKEN_TRANS_EDP(0x420cc) bit 12 for programmable header packet. bit 15 : Set CHICKEN_TRANS_EDP(0x420cc) bit 15 if Y coordinate is supported v2: (Rodrigo) - move CHICKEN_TRANS_EDP bit set logic right after setup_vsc v3:(Rodrigo) - initialize chicken_trans to CHICKEN_TRANS_BIT12 instead of 0 v4:(chris wilson) - use BIT(12), remove CHICKEN_TRANS_BIT12 - remove unnecessary comments - update commit message v5: - rename bit 12 PSR2_VSC_ENABLE_PROG_HEADER - rename bit 15 PSR2_ADD_VERTICAL_LINE_COUNT v6:(Rodrigo) - remove TRANS_EDP=3, use cpu_transcoder Cc: Rodrigo Vivi <[email protected]> Cc: Jim Bride <[email protected]> Signed-off-by: vathsala nagaraju <[email protected]> Signed-off-by: Patil Deepti <[email protected]> Reviewed-by: Rodrigo Vivi <[email protected]> Signed-off-by: Rodrigo Vivi <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-13NFSv4: Fix client recovery when server reboots multiple timesTrond Myklebust1-1/+0
If the server reboots multiple times, the client should rely on the server to tell it that it cannot reclaim state as per section 9.6.3.4 in RFC7530 and section 8.4.2.1 in RFC5661. Currently, the client is being to conservative, and is assuming that if the server reboots while state recovery is in progress, then it must ignore state that was not recovered before the reboot. Signed-off-by: Trond Myklebust <[email protected]>
2017-01-13partially revert "xen: Remove event channel notification through Xen PCI ↵Stefano Stabellini1-0/+71
platform device" Commit 72a9b186292d ("xen: Remove event channel notification through Xen PCI platform device") broke Linux when booting as Dom0 on Xen in a nested Xen environment (Xen installed inside a Xen VM). In this scenario, Linux is a PV guest, but at the same time it uses the platform-pci driver to receive notifications from L0 Xen. vector callbacks are not available because L1 Xen doesn't allow them. Partially revert the offending commit, by restoring IRQ based notifications for PV guests only. I restored only the code which is strictly needed and replaced the xen_have_vector_callback checks within it with xen_pv_domain() checks. Signed-off-by: Stefano Stabellini <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]>
2017-01-13libnvdimm, namespace: fix pmem namespace leak, delete when size set to zeroDan Williams1-13/+10
Commit 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region") added support for establishing additional pmem namespace beyond the seed device, similar to blk namespaces. However, it neglected to delete the namespace when the size is set to zero. Fixes: 98a29c39dc68 ("libnvdimm, namespace: allow creation of multiple pmem-namespaces per region") Cc: <[email protected]> Signed-off-by: Dan Williams <[email protected]>
2017-01-13tcp: fix tcp_fastopen unaligned access complaints on sparcShannon Nelson2-2/+7
Fix up a data alignment issue on sparc by swapping the order of the cookie byte array field with the length field in struct tcp_fastopen_cookie, and making it a proper union to clean up the typecasting. This addresses log complaints like these: log_unaligned: 113 callbacks suppressed Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360 Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360 Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360 Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360 Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360 Cc: Eric Dumazet <[email protected]> Signed-off-by: Shannon Nelson <[email protected]> Acked-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-13ipv6: sr: fix several BUGs when preemption is enabledDavid Lebrun2-1/+5
When CONFIG_PREEMPT=y, CONFIG_IPV6=m and CONFIG_SEG6_HMAC=y, seg6_hmac_init() is called during the initialization of the ipv6 module. This causes a subsequent call to smp_processor_id() with preemption enabled, resulting in the following trace. [ 20.451460] BUG: using smp_processor_id() in preemptible [00000000] code: systemd/1 [ 20.452556] caller is debug_smp_processor_id+0x17/0x19 [ 20.453304] CPU: 0 PID: 1 Comm: systemd Not tainted 4.9.0-rc5-00973-g46738b1 #1 [ 20.454406] ffffc9000062fc18 ffffffff813607b2 0000000000000000 ffffffff81a7f782 [ 20.455528] ffffc9000062fc48 ffffffff813778dc 0000000000000000 00000000001dcf98 [ 20.456539] ffffffffa003bd08 ffffffff81af93e0 ffffc9000062fc58 ffffffff81377905 [ 20.456539] Call Trace: [ 20.456539] [<ffffffff813607b2>] dump_stack+0x63/0x7f [ 20.456539] [<ffffffff813778dc>] check_preemption_disabled+0xd1/0xe3 [ 20.456539] [<ffffffff81377905>] debug_smp_processor_id+0x17/0x19 [ 20.460260] [<ffffffffa0061f3b>] seg6_hmac_init+0xfa/0x192 [ipv6] [ 20.460260] [<ffffffffa0061ccc>] seg6_init+0x39/0x6f [ipv6] [ 20.460260] [<ffffffffa006121a>] inet6_init+0x21a/0x321 [ipv6] [ 20.460260] [<ffffffffa0061000>] ? 0xffffffffa0061000 [ 20.460260] [<ffffffff81000457>] do_one_initcall+0x8b/0x115 [ 20.460260] [<ffffffff811328a3>] do_init_module+0x53/0x1c4 [ 20.460260] [<ffffffff8110650a>] load_module+0x1153/0x14ec [ 20.460260] [<ffffffff81106a7b>] SYSC_finit_module+0x8c/0xb9 [ 20.460260] [<ffffffff81106a7b>] ? SYSC_finit_module+0x8c/0xb9 [ 20.460260] [<ffffffff81106abc>] SyS_finit_module+0x9/0xb [ 20.460260] [<ffffffff810014d1>] do_syscall_64+0x62/0x75 [ 20.460260] [<ffffffff816834f0>] entry_SYSCALL64_slow_path+0x25/0x25 Moreover, dst_cache_* functions also call smp_processor_id(), generating a similar trace. This patch uses raw_cpu_ptr() in seg6_hmac_init() rather than this_cpu_ptr() and disable preemption when using dst_cache_* functions. Signed-off-by: David Lebrun <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-13net: systemport: Decouple flow control from __bcm_sysport_tx_reclaimFlorian Fainelli1-7/+18
The __bcm_sysport_tx_reclaim() function is used to reclaim transmit resources in different places within the driver. Most of them should not affect the state of the transit flow control. Introduce bcm_sysport_tx_clean() which cleans the ring, but does not re-enable flow control towards the networking stack, and make bcm_sysport_tx_reclaim() do the actual transmit queue flow control. Fixes: 80105befdb4b ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-01-13ARM: dts: OMAP5 / DRA7: indicate that SATA port 0 is available.Jean-Jacques Hiblot2-0/+2
AHCI provides the register PORTS_IMPL to let the software know which port is supported. The register must be initialized by the bootloader. However in some cases u-boot doesn't properly initialize this value (if it is not compiled with SATA support for example or if the SATA initialization fails). The DTS entry "ports-implemented" can be used to override the value in PORTS_IMPL. Without this patch the SATA will not work in the following two cases: * if there has been a failure to initialize SATA in u-boot. * if ahci_platform module has been removed and re-inserted. The reason is that the content of PORTS_IMPL is lost after the module is removed. I suspect that it's because the controller is reset by the hwmod. Cc: <[email protected]> # v4.6+ Signed-off-by: Jean-Jacques Hiblot <[email protected]> Acked-by: Roger Quadros <[email protected]> [[email protected]: updated comments with what goes wrong] Signed-off-by: Tony Lindgren <[email protected]>
2017-01-13Revert "drm/amdgpu: Only update the CUR_SIZE register when necessary"Michel Dänzer4-60/+30
This reverts commits 7c83d7abc9997cf1efac2c0ce384b5e8453ee870 and a1f49cc179ce6b7b7758ae3ff5cdb138d0ee0f56. They caused the HW cursor to disappear under various circumstances in the wild. I wasn't able to reproduce any of them, and I'm not sure what's going on. But those changes aren't a big deal anyway, so let's just revert for now. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=191291 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99143 Acked-by: Alex Deucher <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
2017-01-13ARM: put types.h in uapiNicolas Dichtel1-3/+3
Due to the way kbuild works, this header was unintentionally exported back in 2013 when it was created, despite it not being in a uapi/ directory. This is very non-intuitive behaviour by Kbuild. However, we've had this include exported to userland for almost four years, and searching google for "ARM types.h __UINTPTR_TYPE__" gives no hint that anyone has complained about it. So, let's make it officially exported in this state. Signed-off-by: Nicolas Dichtel <[email protected]> Signed-off-by: Russell King <[email protected]>
2017-01-13drm/i915: Assert that we have allocated the drm_mm_node upon pinningChris Wilson2-1/+5
We currently check after the slow path that the vma is bound correctly, but we don't currently check after the fast path. This is important in case we accidentally take the fast path and leave the vma misplaced. Signed-off-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Joonas Lahtinen <[email protected]>
2017-01-13drm/i915: Move i915_ppgtt_close() into i915_gem_gtt.cChris Wilson3-21/+22
Move it alongside its ppgtt counterparts, in order to make it available for the ppgtt selftests. Signed-off-by: Chris Wilson <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected] Reviewed-by: Joonas Lahtinen <[email protected]>
2017-01-13fuse: fix time_to_jiffies nsec sanity checkDavid Sheets1-1/+1
Commit bcb6f6d2b9c2 ("fuse: use timespec64") introduced clamped nsec values in time_to_jiffies but used the max of nsec and NSEC_PER_SEC - 1 instead of the min. Because of this, dentries would stay in the cache longer than requested and go stale in scenarios that relied on their timely eviction. Fixes: bcb6f6d2b9c2 ("fuse: use timespec64") Signed-off-by: David Sheets <[email protected]> Signed-off-by: Miklos Szeredi <[email protected]> Cc: <[email protected]> # 4.9
2017-01-13vfio/type1: Remove pid_namespace.h includeAlex Williamson1-1/+0
Using has_capability() rather than ns_capable(), we're no longer using this header. Cc: Jike Song <[email protected]> Cc: Kirti Wankhede <[email protected]> Signed-off-by: Alex Williamson <[email protected]>
2017-01-13drm/msm: fix potential null ptr issue in non-iommu caseRob Clark2-3/+4
Fixes: 9cb07b099fb ("drm/msm: support multiple address spaces") Reported-by: Riku Voipio <[email protected]> Signed-off-by: Rob Clark <[email protected]>
2017-01-13drm/msm/mdp5: rip out plane->pending trackingRob Clark3-32/+0
It would race between userspace thread and commit worker. Ie. vblank irq would trigger event and userspace could begin the next atomic update, before the commit worker had a chance to clear the pending flag. If we do end up needing something to prevent userspace from trying another pageflip before getting vblank event, it should probably be implemented as a pending_planes bitmask, similar to pending_crtcs. See start_atomic() and end_atomic(). Signed-off-by: Rob Clark <[email protected]>
2017-01-13Merge remote-tracking branch 'mkp-scsi/4.10/scsi-fixes' into fixesJames Bottomley5-85/+33
2017-01-13mac80211: prevent skb/txq mismatchMichal Kazior1-10/+7
Station structure is considered as not uploaded (to driver) until drv_sta_state() finishes. This call is however done after the structure is attached to mac80211 internal lists and hashes. This means mac80211 can lookup (and use) station structure before it is uploaded to a driver. If this happens (structure exists, but sta->uploaded is false) fast_tx path can still be taken. Deep in the fastpath call the sta->uploaded is checked against to derive "pubsta" argument for ieee80211_get_txq(). If sta->uploaded is false (and sta is actually non-NULL) ieee80211_get_txq() effectively downgraded to vif->txq. At first glance this may look innocent but coerces mac80211 into a state that is almost guaranteed (codel may drop offending skb) to crash because a station-oriented skb gets queued up on vif-oriented txq. The ieee80211_tx_dequeue() ends up looking at info->control.flags and tries to use txq->sta which in the fail case is NULL. It's probably pointless to pretend one can downgrade skb from sta-txq to vif-txq. Since downgrading unicast traffic to vif->txq must not be done there's no txq to put a frame on if sta->uploaded is false. Therefore the code is made to fall back to regular tx() op path if the described condition is hit. Only drivers using wake_tx_queue were affected. Example crash dump before fix: Unable to handle kernel paging request at virtual address ffffe26c PC is at ieee80211_tx_dequeue+0x204/0x690 [mac80211] [<bf4252a4>] (ieee80211_tx_dequeue [mac80211]) from [<bf4b1388>] (ath10k_mac_tx_push_txq+0x54/0x1c0 [ath10k_core]) [<bf4b1388>] (ath10k_mac_tx_push_txq [ath10k_core]) from [<bf4bdfbc>] (ath10k_htt_txrx_compl_task+0xd78/0x11d0 [ath10k_core]) [<bf4bdfbc>] (ath10k_htt_txrx_compl_task [ath10k_core]) [<bf51c5a4>] (ath10k_pci_napi_poll+0x54/0xe8 [ath10k_pci]) [<bf51c5a4>] (ath10k_pci_napi_poll [ath10k_pci]) from [<c0572e90>] (net_rx_action+0xac/0x160) Reported-by: Mohammed Shafi Shajakhan <[email protected]> Signed-off-by: Michal Kazior <[email protected]> Signed-off-by: Johannes Berg <[email protected]>