aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2018-05-15tools/memory-model: Rename link and rcu-path to rcu-link and rbAlan Stern2-54/+55
This patch makes a simple non-functional change to the RCU portion of the Linux Kernel Memory Consistency Model by renaming the "link" and "rcu-path" relations to "rcu-link" and "rb", respectively. The name "link" was an unfortunate choice, because it was too generic and subject to confusion with other meanings of the same word, which occur quite often in LKMM documentation. The name "rcu-path" is not very appropriate, because the relation is analogous to the happens-before (hb) and propagates-before (pb) relations -- although that fact won't become apparent until the second patch in this series. Signed-off-by: Alan Stern <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Andrea Parri <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/spinlocks: Clean up comment and #ifndef for {,queued_}spin_is_locked()Andrea Parri2-5/+0
Removes "#ifndef queued_spin_is_locked" from the generic code: this is unused and it's reasonable to conclude that it will continue to be unused. Also removes the comment about spin_is_locked() from mutex_is_locked(): the comment remains valid but not particularly useful. Suggested-by: Will Deacon <[email protected]> Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/spinlocks/arm64: Remove smp_mb() from arch_spin_is_locked()Andrea Parri1-5/+0
The following commit: 38b850a73034f ("arm64: spinlock: order spin_{is_locked,unlock_wait} against local locks") ... added an smp_mb() to arch_spin_is_locked(), in order "to ensure that the lock value is always loaded after any other locks have been taken by the current CPU", and reported one example (the "insane case" in ipc/sem.c) relying on such guarantee. It is however understood that spin_is_locked() is not required to provide such an ordering guarantee (a guarantee that is currently not provided by all the implementations/archs), and that callers relying on such ordering should instead insert suitable memory barriers before acting on the result of spin_is_locked(). Following a recent auditing [1] of the callers of {,raw_}spin_is_locked(), revealing that none of them are relying on the ordering guarantee anymore, this commit removes the leading smp_mb() from the primitive thus reverting 38b850a73034f. [1] https://marc.info/?l=linux-kernel&m=151981440005264&w=2 https://marc.info/?l=linux-kernel&m=152042843808540&w=2 https://marc.info/?l=linux-kernel&m=152043346110262&w=2 Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Will Deacon <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/spinlocks: Document the semantics of spin_is_locked()Andrea Parri1-0/+18
There appeared to be a certain, recurrent uncertainty concerning the semantics of spin_is_locked(), likely a consequence of the fact that this semantics remains undocumented or that it has been historically linked to the (likewise unclear) semantics of spin_unlock_wait(). A recent auditing [1] of the callers of the primitive confirmed that none of them are relying on particular ordering guarantees; document this semantics by adding a docbook header to spin_is_locked(). Also, describe behaviors specific to certain CONFIG_SMP=n builds. [1] https://marc.info/?l=linux-kernel&m=151981440005264&w=2 https://marc.info/?l=linux-kernel&m=152042843808540&w=2 https://marc.info/?l=linux-kernel&m=152043346110262&w=2 Co-Developed-by: Andrea Parri <[email protected]> Co-Developed-by: Alan Stern <[email protected]> Co-Developed-by: David Howells <[email protected]> Signed-off-by: Andrea Parri <[email protected]> Signed-off-by: Alan Stern <[email protected]> Signed-off-by: David Howells <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Randy Dunlap <[email protected]> Cc: Akira Yokosawa <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Boqun Feng <[email protected]> Cc: Jade Alglave <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Luc Maranget <[email protected]> Cc: Nicholas Piggin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/Documentation: Use `warning` RST directiveSeongJae Park1-4/+6
Use the proper RST directive, pointed out by a build warning. Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/Documentation: Fix incorrect example codeSeongJae Park1-2/+1
- Remove a stale line of code - Fix the condition of the READ_ONCE() example Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Acked-by: Alan Stern <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt/kokr: Update Korean translation to de-emphasize ↵SeongJae Park1-9/+18
smp_read_barrier_depends() some more Translate this commit to Korean: f28f0868feb1 ("locking/memory-barriers: De-emphasize smp_read_barrier_depends() some more") Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt/kokr: Update Korean translation to fix ↵SeongJae Park1-2/+2
description of data dependency barriers Translate this commit to Korean: 51de78892b12 ("memory-barriers: Fix description of data dependency barriers") Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt/kokr: Update Korean translation to ↵SeongJae Park1-0/+3
cross-reference "tools/memory-model/" Translate this commit to Korean: 621df431b0ac ("Documentation/memory-barriers.txt: Cross-reference "tools/memory-model/"") Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt/kokr: Update Korean translation to de-emphasize ↵SeongJae Park1-2/+5
smp_read_barrier_depends() Translate this commit to Korean: 9ad3c143d7d6 ("doc: De-emphasize smp_read_barrier_depends") Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt/kokr: Update Korean translation to indicate that ↵SeongJae Park1-6/+9
READ_ONCE() now implies smp_barrier_depends() Translate this commit to Korean: 40555946447a ("doc: READ_ONCE() now implies smp_barrier_depends()") Signed-off-by: SeongJae Park <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Will Deacon <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15locking/memory-barriers.txt: Fix broken DMA vs. MMIO ordering exampleWill Deacon1-8/+9
The section of memory-barriers.txt that describes the dma_Xmb() barriers has an incorrect example claiming that a wmb() is required after writing to coherent memory in order for those writes to be visible to a device before a subsequent MMIO access using writel() can reach the device. In fact, this ordering guarantee is provided (at significant cost on some architectures such as arm and power) by writel, so the wmb() is not necessary. writel_relaxed exists for cases where this ordering is not required. Fix the example and update the text to make this clearer. Reported-by: Sinan Kaya <[email protected]> Signed-off-by: Will Deacon <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Arnd Bergmann <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-15Merge tag 'v4.17-rc5' into locking/core, to pick up fixesIngo Molnar492-1928/+3627
Signed-off-by: Ingo Molnar <[email protected]>
2018-05-14locking/lockdep: Move sanity check to inside lockdep_print_held_locks()Tetsuo Handa1-16/+13
Calling lockdep_print_held_locks() on a running thread is considered unsafe. Since all callers should follow that rule and the sanity check is not heavy, this patch moves the sanity check to inside lockdep_print_held_locks(). As a side effect of this patch, the number of locks held by running threads will be printed as well. This change will be preferable when we want to know which threads might be relevant to a problem but are unable to print any clues because that thread is running. Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/1523011279-8206-2-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Ingo Molnar <[email protected]>
2018-05-14locking/lockdep: Use for_each_process_thread() for debug_show_all_locks()Tetsuo Handa1-35/+8
debug_show_all_locks() tries to grab the tasklist_lock for two seconds, but calling while_each_thread() without tasklist_lock held is not safe. See the following commit for more information: 4449a51a7c281602 ("vm_is_stack: use for_each_thread() rather then buggy while_each_thread()") Change debug_show_all_locks() from "do_each_thread()/while_each_thread() with possibility of missing tasklist_lock" to "for_each_process_thread() with RCU", and add a call to touch_all_softlockup_watchdogs() like show_state_filter() does. Signed-off-by: Tetsuo Handa <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/1523011279-8206-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp Signed-off-by: Ingo Molnar <[email protected]>
2018-05-13Linux 4.17-rc5Linus Torvalds1-1/+1
2018-05-13Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds14-78/+153
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "A mixed bag of fixes and updates for the ghosts which are hunting us. The scheduler fixes have been pulled into that branch to avoid conflicts. - A set of fixes to address a khread_parkme() race which caused lost wakeups and loss of state. - A deadlock fix for stop_machine() solved by moving the wakeups outside of the stopper_lock held region. - A set of Spectre V1 array access restrictions. The possible problematic spots were discuvered by Dan Carpenters new checks in smatch. - Removal of an unused file which was forgotten when the rest of that functionality was removed" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/vdso: Remove unused file perf/x86/cstate: Fix possible Spectre-v1 indexing for pkg_msr perf/x86/msr: Fix possible Spectre-v1 indexing in the MSR driver perf/x86: Fix possible Spectre-v1 indexing for x86_pmu::event_map() perf/x86: Fix possible Spectre-v1 indexing for hw_perf_event cache_* perf/core: Fix possible Spectre-v1 indexing for ->aux_pages[] sched/autogroup: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Fix possible Spectre-v1 indexing for sched_prio_to_weight[] sched/core: Introduce set_special_state() kthread, sched/wait: Fix kthread_parkme() completion issue kthread, sched/wait: Fix kthread_parkme() wait-loop sched/fair: Fix the update of blocked load when newly idle stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock
2018-05-13Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds1-56/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Thomas Gleixner: "Revert the new NUMA aware placement approach which turned out to create more problems than it solved" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine()"
2018-05-13Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds7-6/+25
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf tooling fixes from Thomas Gleixner: "Another small set of perf tooling fixes and updates: - Revert "perf pmu: Fix pmu events parsing rule", as it broke Intel PT event description parsing (Arnaldo Carvalho de Melo) - Sync x86's cpufeatures.h and kvm UAPI headers with the kernel sources, suppressing the ABI drift warnings (Arnaldo Carvalho de Melo) - Remove duplicated entry for westmereep-dp in Intel's mapfile.csv (William Cohen) - Fix typo in 'perf bench numa' options description (Yisheng Xie)" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "perf pmu: Fix pmu events parsing rule" tools headers kvm: Sync ARM UAPI headers with the kernel sources tools headers kvm: Sync uapi/linux/kvm.h with the kernel sources tools headers: Sync x86 cpufeatures.h with the kernel sources perf vendor events intel: Remove duplicated entry for westmereep-dp in mapfile.csv perf bench numa: Fix typo in options
2018-05-13Merge tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds1-1/+1
Pull dma-mapping fix from Christoph Hellwig: "Just one little fix from Jean to avoid a harmless but very annoying warning, especially for the drm code" * tag 'dma-mapping-4.17-5' of git://git.infradead.org/users/hch/dma-mapping: swiotlb: silent unwanted warning "buffer is full"
2018-05-12Merge tag '4.17-rc4-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds4-43/+57
Pull cifs fixes from Steve French: "Some small SMB3 fixes for 4.17-rc5, some for stable" * tag '4.17-rc4-SMB3-Fixes' of git://git.samba.org/sfrench/cifs-2.6: smb3: directory sync should not return an error cifs: smb2ops: Fix listxattr() when there are no EAs cifs: smbd: Enable signing with smbdirect cifs: Allocate validate negotiation request through kmalloc
2018-05-12Merge branch 'next' of ↵Linus Torvalds2-5/+12
git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux Pull thermal fixes from Zhang Rui: - fix NULL pointer dereference on module load/probe for int3403_thermal driver - fix an emergency shutdown issue on exynos thermal driver * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux: thermal: exynos: Propagate error value from tmu_read() thermal: exynos: Reading temperature makes sense only when TMU is turned on thermal: int3403_thermal: Fix NULL pointer deref on module load / probe
2018-05-12Merge tag 'for-linus-20180511' of git://git.kernel.dk/linux-blockLinus Torvalds3-2/+23
Pull block fixes from Jens Axboe: "Just a few NVMe fixes this round - one fixing a use-after-free, one fixes the return value after controller reset, and the last one fixes an issue where some drives will spuriously EIO. We should get these into 4.17" * tag 'for-linus-20180511' of git://git.kernel.dk/linux-block: nvme: add quirk to force medium priority for SQ creation nvme: Fix sync controller reset return nvme: fix use-after-free in nvme_free_ns_head
2018-05-12swiotlb: silent unwanted warning "buffer is full"Jean Delvare1-1/+1
If DMA_ATTR_NO_WARN is passed to swiotlb_alloc_buffer(), it should be passed further down to swiotlb_tbl_map_single(). Otherwise we escape half of the warnings but still log the other half. This is one of the multiple causes of spurious warnings reported at: https://bugs.freedesktop.org/show_bug.cgi?id=104082 Signed-off-by: Jean Delvare <[email protected]> Fixes: 0176adb00406 ("swiotlb: refactor coherent buffer allocation") Cc: Christoph Hellwig <[email protected]> Cc: Christian König <[email protected]> Cc: Michel Dänzer <[email protected]> Cc: Takashi Iwai <[email protected]> Cc: [email protected] # v4.16
2018-05-12Revert "sched/numa: Delay retrying placement for automatic NUMA balance ↵Mel Gorman1-56/+1
after wake_affine()" This reverts commit 7347fc87dfe6b7315e74310ee1243dc222c68086. Srikar Dronamra pointed out that while the commit in question did show a performance improvement on ppc64, it did so at the cost of disabling active CPU migration by automatic NUMA balancing which was not the intent. The issue was that a serious flaw in the logic failed to ever active balance if SD_WAKE_AFFINE was disabled on scheduler domains. Even when it's enabled, the logic is still bizarre and against the original intent. Investigation showed that fixing the patch in either the way he suggested, using the correct comparison for jiffies values or introducing a new numa_migrate_deferred variable in task_struct all perform similarly to a revert with a mix of gains and losses depending on the workload, machine and socket count. The original intent of the commit was to handle a problem whereby wake_affine, idle balancing and automatic NUMA balancing disagree on the appropriate placement for a task. This was particularly true for cases where a single task was a massive waker of tasks but where wake_wide logic did not apply. This was particularly noticeable when a futex (a barrier) woke all worker threads and tried pulling the wakees to the waker nodes. In that specific case, it could be handled by tuning MPI or openMP appropriately, but the behavior is not illogical and was worth attempting to fix. However, the approach was wrong. Given that we're at rc4 and a fix is not obvious, it's better to play safe, revert this commit and retry later. Signed-off-by: Mel Gorman <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Srikar Dronamraju <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2018-05-11Merge branch 'akpm' (patches from Andrew)Linus Torvalds17-87/+164
Merge misc fixes from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <[email protected]>: rbtree: include rcu.h scripts/faddr2line: fix error when addr2line output contains discriminator ocfs2: take inode cluster lock before moving reflinked inode from orphan dir mm, oom: fix concurrent munlock and oom reaper unmap, v3 mm: migrate: fix double call of radix_tree_replace_slot() proc/kcore: don't bounds check against address 0 mm: don't show nr_indirectly_reclaimable in /proc/vmstat mm: sections are not offlined during memory hotremove z3fold: fix reclaim lock-ups init: fix false positives in W+X checking lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit() KASAN: prohibit KASAN+STRUCTLEAK combination MAINTAINERS: update Shuah's email address
2018-05-11rbtree: include rcu.hSebastian Andrzej Siewior2-0/+2
Since commit c1adf20052d8 ("Introduce rb_replace_node_rcu()") rbtree_augmented.h uses RCU related data structures but does not include the header file. It works as long as it gets somehow included before that and fails otherwise. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: David Howells <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11scripts/faddr2line: fix error when addr2line output contains discriminatorChangbin Du1-1/+4
When addr2line output contains discriminator, the current awk script cannot parse it. This patch fixes it by extracting key words using regex which is more reliable. $ scripts/faddr2line vmlinux tlb_flush_mmu_free+0x26 tlb_flush_mmu_free+0x26/0x50: tlb_flush_mmu_free at mm/memory.c:258 (discriminator 3) scripts/faddr2line: eval: line 173: unexpected EOF while looking for matching `)' Link: http://lkml.kernel.org/r/[email protected] Fixes: 6870c0165feaa5 ("scripts/faddr2line: show the code context") Signed-off-by: Changbin Du <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Philippe Ombredanne <[email protected]> Cc: NeilBrown <[email protected]> Cc: Richard Weinberger <[email protected]> Cc: Kate Stewart <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11ocfs2: take inode cluster lock before moving reflinked inode from orphan dirAshish Samant1-2/+12
While reflinking an inode, we create a new inode in orphan directory, then take EX lock on it, reflink the original inode to orphan inode and release EX lock. Once the lock is released another node could request it in EX mode from ocfs2_recover_orphans() which causes downconvert of the lock, on this node, to NL mode. Later we attempt to initialize security acl for the orphan inode and move it to the reflink destination. However, while doing this we dont take EX lock on the inode. This could potentially cause problems because we could be starting transaction, accessing journal and modifying metadata of the inode while holding NL lock and with another node holding EX lock on the inode. Fix this by taking orphan inode cluster lock in EX mode before initializing security and moving orphan inode to reflink destination. Use the __tracker variant while taking inode lock to avoid recursive locking in the ocfs2_init_security_and_acl() call chain. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ashish Samant <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Junxiao Bi <[email protected]> Acked-by: Jun Piao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Changwei Ge <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11mm, oom: fix concurrent munlock and oom reaper unmap, v3David Rientjes3-56/+71
Since exit_mmap() is done without the protection of mm->mmap_sem, it is possible for the oom reaper to concurrently operate on an mm until MMF_OOM_SKIP is set. This allows munlock_vma_pages_all() to concurrently run while the oom reaper is operating on a vma. Since munlock_vma_pages_range() depends on clearing VM_LOCKED from vm_flags before actually doing the munlock to determine if any other vmas are locking the same memory, the check for VM_LOCKED in the oom reaper is racy. This is especially noticeable on architectures such as powerpc where clearing a huge pmd requires serialize_against_pte_lookup(). If the pmd is zapped by the oom reaper during follow_page_mask() after the check for pmd_none() is bypassed, this ends up deferencing a NULL ptl or a kernel oops. Fix this by manually freeing all possible memory from the mm before doing the munlock and then setting MMF_OOM_SKIP. The oom reaper can not run on the mm anymore so the munlock is safe to do in exit_mmap(). It also matches the logic that the oom reaper currently uses for determining when to set MMF_OOM_SKIP itself, so there's no new risk of excessive oom killing. This issue fixes CVE-2018-1000200. Link: http://lkml.kernel.org/r/[email protected] Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently") Signed-off-by: David Rientjes <[email protected]> Suggested-by: Tetsuo Handa <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Andrea Arcangeli <[email protected]> Cc: <[email protected]> [4.14+] Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11mm: migrate: fix double call of radix_tree_replace_slot()Naoya Horiguchi1-3/+1
radix_tree_replace_slot() is called twice for head page, it's obviously a bug. Let's fix it. Link: http://lkml.kernel.org/r/[email protected] Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp") Signed-off-by: Naoya Horiguchi <[email protected]> Reported-by: Matthew Wilcox <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Michal Hocko <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: Zi Yan <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11proc/kcore: don't bounds check against address 0Laura Abbott1-7/+16
The existing kcore code checks for bad addresses against __va(0) with the assumption that this is the lowest address on the system. This may not hold true on some systems (e.g. arm64) and produce overflows and crashes. Switch to using other functions to validate the address range. It's currently only seen on arm64 and it's not clear if anyone wants to use that particular combination on a stable release. So this is not urgent for stable. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Laura Abbott <[email protected]> Tested-by: Dave Anderson <[email protected]> Cc: Kees Cook <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Alexey Dobriyan <[email protected]>a Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11mm: don't show nr_indirectly_reclaimable in /proc/vmstatRoman Gushchin1-1/+5
Don't show nr_indirectly_reclaimable in /proc/vmstat, because there is no need to export this vm counter to userspace, and some changes are expected in reclaimable object accounting, which can alter this counter. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Roman Gushchin <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Alexander Viro <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: David Rientjes <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11mm: sections are not offlined during memory hotremovePavel Tatashin1-1/+1
Memory hotplug and hotremove operate with per-block granularity. If the machine has a large amount of memory (more than 64G), the size of a memory block can span multiple sections. By mistake, during hotremove we set only the first section to offline state. The bug was discovered because kernel selftest started to fail: https://lkml.kernel.org/r/20180423011247.GK5563@yexl-desktop After commit, "mm/memory_hotplug: optimize probe routine". But, the bug is older than this commit. In this optimization we also added a check for sections to be in a proper state during hotplug operation. Link: http://lkml.kernel.org/r/[email protected] Fixes: 2d070eab2e82 ("mm: consider zone which is not fully populated to have holes") Signed-off-by: Pavel Tatashin <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Steven Sistare <[email protected]> Cc: Daniel Jordan <[email protected]> Cc: "Kirill A. Shutemov" <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11z3fold: fix reclaim lock-upsVitaly Wool1-12/+30
Do not try to optimize in-page object layout while the page is under reclaim. This fixes lock-ups on reclaim and improves reclaim performance at the same time. [[email protected]: coding-style fixes] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Vitaly Wool <[email protected]> Reported-by: Guenter Roeck <[email protected]> Tested-by: Guenter Roeck <[email protected]> Cc: <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11init: fix false positives in W+X checkingJeffrey Hugo2-0/+12
load_module() creates W+X mappings via __vmalloc_node_range() (from layout_and_allocate()->move_module()->module_alloc()) by using PAGE_KERNEL_EXEC. These mappings are later cleaned up via "call_rcu_sched(&freeinit->rcu, do_free_init)" from do_init_module(). This is a problem because call_rcu_sched() queues work, which can be run after debug_checkwx() is run, resulting in a race condition. If hit, the race results in a nasty splat about insecure W+X mappings, which results in a poor user experience as these are not the mappings that debug_checkwx() is intended to catch. This issue is observed on multiple arm64 platforms, and has been artificially triggered on an x86 platform. Address the race by flushing the queued work before running the arch-defined mark_rodata_ro() which then calls debug_checkwx(). Link: http://lkml.kernel.org/r/[email protected] Fixes: e1a58320a38d ("x86/mm: Warn on W^X mappings") Signed-off-by: Jeffrey Hugo <[email protected]> Reported-by: Timur Tabi <[email protected]> Reported-by: Jan Glauber <[email protected]> Acked-by: Kees Cook <[email protected]> Acked-by: Ingo Molnar <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Laura Abbott <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Stephen Smalley <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Peter Zijlstra <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11lib/find_bit_benchmark.c: avoid soft lockup in test_find_first_bit()Yury Norov1-1/+6
test_find_first_bit() is intentionally sub-optimal, and may cause soft lockup due to long time of run on some systems. So decrease length of bitmap to traverse to avoid lockup. With the change below, time of test execution doesn't exceed 0.2 seconds on my testing system. Link: http://lkml.kernel.org/r/[email protected] Fixes: 4441fca0a27f5 ("lib: test module for find_*_bit() functions") Signed-off-by: Yury Norov <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Reported-by: Fengguang Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11KASAN: prohibit KASAN+STRUCTLEAK combinationDmitry Vyukov1-0/+4
Currently STRUCTLEAK inserts initialization out of live scope of variables from KASAN point of view. This leads to KASAN false positive reports. Prohibit this combination for now. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Dmitry Vyukov <[email protected]> Acked-by: Kees Cook <[email protected]> Cc: Fengguang Wu <[email protected]> Cc: Sergey Senozhatsky <[email protected]> Cc: Andrey Ryabinin <[email protected]> Cc: Dennis Zhou <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11MAINTAINERS: update Shuah's email addressShuah Khan (Samsung OSG)1-3/+0
Update email address in MAINTAINERS file due to IT infrastructure changes at Samsung. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Shuah Khan (Samsung OSG) <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: David S. Miller <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Linus Walleij <[email protected]> Cc: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-05-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds106-291/+714
Pull networking fixes from David Miller: 1) Verify lengths of keys provided by the user is AF_KEY, from Kevin Easton. 2) Add device ID for BCM89610 PHY. Thanks to Bhadram Varka. 3) Add Spectre guards to some ATM code, courtesy of Gustavo A. R. Silva. 4) Fix infinite loop in NSH protocol code. To Eric Dumazet we are most grateful for this fix. 5) Line up /proc/net/netlink headers properly. This fix from YU Bo, we do appreciate. 6) Use after free in TLS code. Once again we are blessed by the honorable Eric Dumazet with this fix. 7) Fix regression in TLS code causing stalls on partial TLS records. This fix is bestowed upon us by Andrew Tomt. 8) Deal with too small MTUs properly in LLC code, another great gift from Eric Dumazet. 9) Handle cached route flushing properly wrt. MTU locking in ipv4, to Hangbin Liu we give thanks for this. 10) Fix regression in SO_BINDTODEVIC handling wrt. UDP socket demux. Paolo Abeni, he gave us this. 11) Range check coalescing parameters in mlx4 driver, thank you Moshe Shemesh. 12) Some ipv6 ICMP error handling fixes in rxrpc, from our good brother David Howells. 13) Fix kexec on mlx5 by freeing IRQs in shutdown path. Daniel Juergens, you're the best! 14) Don't send bonding RLB updates to invalid MAC addresses. Debabrata Benerjee saved us! 15) Uh oh, we were leaking in udp_sendmsg and ping_v4_sendmsg. The ship is now water tight, thanks to Andrey Ignatov. 16) IPSEC memory leak in ixgbe from Colin Ian King, man we've got holes everywhere! 17) Fix error path in tcf_proto_create, Jiri Pirko what would we do without you! * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (92 commits) net sched actions: fix refcnt leak in skbmod net: sched: fix error path in tcf_proto_create() when modules are not configured net sched actions: fix invalid pointer dereferencing if skbedit flags missing ixgbe: fix memory leak on ipsec allocation ixgbevf: fix ixgbevf_xmit_frame()'s return type ixgbe: return error on unsupported SFP module when resetting ice: Set rq_last_status when cleaning rq ipv4: fix memory leaks in udp_sendmsg, ping_v4_sendmsg mlxsw: core: Fix an error handling path in 'mlxsw_core_bus_device_register()' bonding: send learning packets for vlans on slave bonding: do not allow rlb updates to invalid mac net/mlx5e: Err if asked to offload TC match on frag being first net/mlx5: E-Switch, Include VF RDMA stats in vport statistics net/mlx5: Free IRQs in shutdown path rxrpc: Trace UDP transmission failure rxrpc: Add a tracepoint to log ICMP/ICMP6 and error messages rxrpc: Fix the min security level for kernel calls rxrpc: Fix error reception on AF_INET6 sockets rxrpc: Fix missing start of call timeout qed: fix spelling mistake: "taskelt" -> "tasklet" ...
2018-05-11Merge tag 'nfs-for-4.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfsLinus Torvalds6-22/+17
Pull NFS client fixes from Anna Schumaker: "These patches fix both a possible corruption during NFSoRDMA MR recovery, and a sunrpc tracepoint crash. Additionally, Trond has a new email address to put in the MAINTAINERS file" * tag 'nfs-for-4.17-2' of git://git.linux-nfs.org/projects/anna/linux-nfs: Change Trond's email address in MAINTAINERS sunrpc: Fix latency trace point crashes xprtrdma: Fix list corruption / DMAR errors during MR recovery
2018-05-11net sched actions: fix refcnt leak in skbmodRoman Mashak1-1/+4
When application fails to pass flags in netlink TLV when replacing existing skbmod action, the kernel will leak refcnt: $ tc actions get action skbmod index 1 total acts 0 action order 0: skbmod pipe set smac 00:11:22:33:44:55 index 1 ref 1 bind 0 For example, at this point a buggy application replaces the action with index 1 with new smac 00:aa:22:33:44:55, it fails because of zero flags, however refcnt gets bumped: $ tc actions get actions skbmod index 1 total acts 0 action order 0: skbmod pipe set smac 00:11:22:33:44:55 index 1 ref 2 bind 0 $ Tha patch fixes this by calling tcf_idr_release() on existing actions. Fixes: 86da71b57383d ("net_sched: Introduce skbmod action") Signed-off-by: Roman Mashak <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-11Merge tag 'ceph-for-4.17-rc5' of git://github.com/ceph/ceph-clientLinus Torvalds4-90/+158
Pull ceph fixes from Ilya Dryomov: "These patches fix two long-standing bugs in the DIO code path, one of which is a crash trivially triggerable with splice()" * tag 'ceph-for-4.17-rc5' of git://github.com/ceph/ceph-client: ceph: fix iov_iter issues in ceph_direct_read_write() libceph: add osd_req_op_extent_osd_data_bvecs() ceph: fix rsize/wsize capping in ceph_direct_read_write()
2018-05-11net: sched: fix error path in tcf_proto_create() when modules are not configuredJiri Pirko1-1/+1
In case modules are not configured, error out when tp->ops is null and prevent later null pointer dereference. Fixes: 33a48927c193 ("sched: push TC filter protocol creation into a separate function") Signed-off-by: Jiri Pirko <[email protected]> Acked-by: Cong Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2018-05-11Merge tag 'sh-for-4.17-fixes' of git://git.libc.org/linux-shLinus Torvalds6-85/+19
Pull arch/sh fixes from Rich Felker: "Fixes for critical regressions and a build failure. The regressions were introduced in 4.15 and 4.17-rc1 and prevented booting on affected systems" * tag 'sh-for-4.17-fixes' of git://git.libc.org/linux-sh: sh: switch to NO_BOOTMEM sh: mm: Fix unprotected access to struct device sh: fix build failure for J2 cpu with SMP disabled
2018-05-11Merge tag 'arm64-fixes' of ↵Linus Torvalds3-1/+10
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "There's a small memblock accounting problem when freeing the initrd and a Spectre-v2 mitigation for NVIDIA Denver CPUs which just requires a match on the CPU ID register. Summary: - Mitigate Spectre-v2 for NVIDIA Denver CPUs - Free memblocks corresponding to freed initrd area" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: capabilities: Add NVIDIA Denver CPU to bp_harden list arm64: Add MIDR encoding for NVIDIA CPUs arm64: To remove initrd reserved area entry from memblock
2018-05-11Merge tag 'powerpc-4.17-5' of ↵Linus Torvalds3-17/+26
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "One fix for an actual regression, the change to the SYSCALL_DEFINE wrapper broke FTRACE_SYSCALLS for us due to a name mismatch. There's also another commit to the same code to make sure we match all our syscalls with various prefixes. And then just one minor build fix, and the removal of an unused variable that was removed and then snuck back in due to some rebasing. Thanks to: Naveen N. Rao" * tag 'powerpc-4.17-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/pseries: Fix CONFIG_NUMA=n build powerpc/trace/syscalls: Update syscall name matching logic to account for ppc_ prefix powerpc/trace/syscalls: Update syscall name matching logic powerpc/64: Remove unused paca->soft_enabled
2018-05-11Merge tag 'trace-v4.17-rc4' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fix from Steven Rostedt: "Working on some new updates to trace filtering, I noticed that the regex_match_front() test was updated to be limited to the size of the pattern instead of the full test string. But as the test string is not guaranteed to be nul terminated, it still needs to consider the size of the test string" * tag 'trace-v4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Fix regex_match_front() to not over compare the test string
2018-05-11Merge branch '10GbE' of ↵David S. Miller4-3/+6
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2018-05-11 This series contains fixes to the ice, ixgbe and ixgbevf drivers. Jeff Shaw provides a fix to ensure rq_last_status gets set, whether or not the hardware responds with an error in the ice driver. Emil adds a check for unsupported module during the reset routine for ixgbe. Luc Van Oostenryck fixes ixgbevf_xmit_frame() where it was not using the correct return value (int). Colin Ian King fixes a potential resource leak in ixgbe, where we were not freeing ipsec in our cleanup path. ==================== Signed-off-by: David S. Miller <[email protected]>
2018-05-11Merge tag 'rxrpc-fixes-20180510' of ↵David S. Miller11-48/+209
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Fixes Here are three fixes for AF_RXRPC and two tracepoints that were useful for finding them: (1) Fix missing start of expect-Rx-by timeout on initial packet transmission so that calls will time out if the peer doesn't respond. (2) Fix error reception on AF_INET6 sockets by using the correct family of sockopts on the UDP transport socket. (3) Fix setting the minimum security level on kernel calls so that they can be encrypted. (4) Add a tracepoint to log ICMP/ICMP6 and other error reports from the transport socket. (5) Add a tracepoint to log UDP sendmsg failure so that we can find out if transmission failure occurred on the UDP socket. ==================== Signed-off-by: David S. Miller <[email protected]>