aboutsummaryrefslogtreecommitdiff
path: root/lib
AgeCommit message (Collapse)AuthorFilesLines
2017-02-13radix_tree_iter_resume: Fix out of bounds errorMatthew Wilcox1-1/+0
The address sanitizer occasionally finds an out of bounds error while running the test-suite. It turned out to be a read of the pointer immediately next to the tree root, but this out of bounds error could have occurred elsewhere. This happens because radix_tree_iter_resume() dereferences 'slot' before checking whether we've come to the end of the chunk. We can just delete this line; the value was never used. Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13radix-tree: Store a pointer to the root in each nodeMatthew Wilcox1-6/+8
Instead of having this mysterious private_data in each radix_tree_node, store a pointer to the root, which can be useful for debugging. This also relieves the mm code from the duty of updating it. Acked-by: Johannes Weiner <[email protected]> Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13radix-tree: Chain preallocated nodes through ->parentMatthew Wilcox1-5/+4
Chaining through the ->private_data member means we have to zero ->private_data after removing preallocated nodes from the list. We're about to initialise ->parent anyway, so we can avoid zeroing it. Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13ida: Use exceptional entries for small IDAsMatthew Wilcox2-6/+89
We can use the root entry as a bitmap and save allocating a 128 byte bitmap for an IDA that contains only a few entries (30 on a 32-bit machine, 62 on a 64-bit machine). This costs about 300 bytes of kernel text on x86-64, so as long as 3 IDAs fall into this category, this is a net win for memory consumption. Thanks to Rasmus Villemoes for his work documenting the problem and collecting statistics on IDAs. Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13ida: Move ida_bitmap to a percpu variableMatthew Wilcox2-40/+44
When we preload the IDA, we allocate an IDA bitmap. Instead of storing that preallocated bitmap in the IDA, we store it in a percpu variable. Generally there are more IDAs in the system than CPUs, so this cuts down on the number of preallocated bitmaps that are unused, and about half of the IDA users did not call ida_destroy() so they were leaking IDA bitmaps. Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13Reimplement IDR and IDA using the radix treeMatthew Wilcox2-1034/+519
The IDR is very similar to the radix tree. It has some functionality that the radix tree did not have (alloc next free, cyclic allocation, a callback-based for_each, destroy tree), which is readily implementable on top of the radix tree. A few small changes were needed in order to use a tag to represent nodes with free space below them. More extensive changes were needed to support storing NULL as a valid entry in an IDR. Plain radix trees still interpret NULL as a not-present entry. The IDA is reimplemented as a client of the newly enhanced radix tree. As in the current implementation, it uses a bitmap at the last level of the tree. Signed-off-by: Matthew Wilcox <[email protected]> Signed-off-by: Matthew Wilcox <[email protected]> Tested-by: Kirill A. Shutemov <[email protected]> Cc: Konstantin Khlebnikov <[email protected]> Cc: Ross Zwisler <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2017-02-13radix-tree: Add radix_tree_iter_deleteMatthew Wilcox1-31/+60
Factor the deletion code out into __radix_tree_delete() and provide a nice iterator-based wrapper around it. If we free the node, advance the iterator to avoid reading from freed memory. Signed-off-by: Matthew Wilcox <[email protected]>
2017-02-13radix-tree: Add radix_tree_iter_tag_clear()Matthew Wilcox1-28/+40
The counterpart to radix_tree_iter_tag_set(), used by the IDR code Signed-off-by: Matthew Wilcox <[email protected]> Reviewed-by: Rehas Sachdeva <[email protected]>
2017-02-13EXPORT_SYMBOL radix_tree_replace_slotSong Liu1-0/+1
It will be used in drivers/md/raid5-cache.c Signed-off-by: Song Liu <[email protected]> Signed-off-by: Shaohua Li <[email protected]>
2017-02-10time: Remove CONFIG_TIMER_STATSKees Cook1-14/+0
Currently CONFIG_TIMER_STATS exposes process information across namespaces: kernel/time/timer_list.c print_timer(): SEQ_printf(m, ", %s/%d", tmp, timer->start_pid); /proc/timer_list: #11: <0000000000000000>, hrtimer_wakeup, S:01, do_nanosleep, cron/2570 Given that the tracer can give the same information, this patch entirely removes CONFIG_TIMER_STATS. Suggested-by: Thomas Gleixner <[email protected]> Signed-off-by: Kees Cook <[email protected]> Acked-by: John Stultz <[email protected]> Cc: Nicolas Pitre <[email protected]> Cc: [email protected] Cc: Lai Jiangshan <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Xing Gao <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Jessica Frazelle <[email protected]> Cc: [email protected] Cc: Nicolas Iooss <[email protected]> Cc: "Paul E. McKenney" <[email protected]> Cc: Petr Mladek <[email protected]> Cc: Richard Cochran <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Michal Marek <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Dmitry Vyukov <[email protected]> Cc: Oleg Nesterov <[email protected]> Cc: "Eric W. Biederman" <[email protected]> Cc: Olof Johansson <[email protected]> Cc: Andrew Morton <[email protected]> Cc: [email protected] Cc: Arjan van de Ven <[email protected]> Link: http://lkml.kernel.org/r/20170208192659.GA32582@beast Signed-off-by: Thomas Gleixner <[email protected]>
2017-02-10debugobjects: Improve variable namingWaiman Long1-5/+5
As suggested by Ingo, the debug_objects_alloc counter is now renamed to debug_objects_allocated with minor twist in comment and debug output. Signed-off-by: Waiman Long <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-02-10refcount_t: Introduce a special purpose refcount typePeter Zijlstra1-0/+13
Provide refcount_t, an atomic_t like primitive built just for refcounting. It provides saturation semantics such that overflow becomes impossible and thereby 'spurious' use-after-free is avoided. Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Signed-off-by: Ingo Molnar <[email protected]>
2017-02-08printk: rename nmi.c and exported apiSergey Senozhatsky1-1/+1
A preparation patch for printk_safe work. No functional change. - rename nmi.c to print_safe.c - add `printk_safe' prefix to some (which used both by printk-safe and printk-nmi) of the exported functions. Link: http://lkml.kernel.org/r/[email protected] Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Jan Kara <[email protected]> Cc: Tejun Heo <[email protected]> Cc: Calvin Owens <[email protected]> Cc: Steven Rostedt <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Hurley <[email protected]> Cc: [email protected] Signed-off-by: Sergey Senozhatsky <[email protected]> Signed-off-by: Petr Mladek <[email protected]>
2017-02-06Merge 4.10-rc7 into char-misc-nextGreg Kroah-Hartman3-5/+4
We want the hv and other fixes in here as well to handle merge and testing issues. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-02-05debugobjects: Reduce contention on the global pool_lockWaiman Long1-9/+23
On a large SMP system with many CPUs, the global pool_lock may become a performance bottleneck as all the CPUs that need to allocate or free debug objects have to take the lock. That can sometimes cause soft lockups like: NMI watchdog: BUG: soft lockup - CPU#35 stuck for 22s! [rcuos/1:21] ... RIP: 0010:[<ffffffff817c216b>] [<ffffffff817c216b>] _raw_spin_unlock_irqrestore+0x3b/0x60 ... Call Trace: [<ffffffff813f40d1>] free_object+0x81/0xb0 [<ffffffff813f4f33>] debug_check_no_obj_freed+0x193/0x220 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff81284996>] ? file_free_rcu+0x36/0x60 [<ffffffff81251712>] kmem_cache_free+0xd2/0x380 [<ffffffff81284960>] ? fput+0x90/0x90 [<ffffffff81284996>] file_free_rcu+0x36/0x60 [<ffffffff81124c23>] rcu_nocb_kthread+0x1b3/0x550 [<ffffffff81124b71>] ? rcu_nocb_kthread+0x101/0x550 [<ffffffff81124a70>] ? sync_exp_work_done.constprop.63+0x50/0x50 [<ffffffff810c59d1>] kthread+0x101/0x120 [<ffffffff81101a59>] ? trace_hardirqs_on_caller+0xf9/0x1c0 [<ffffffff817c2d32>] ret_from_fork+0x22/0x50 To reduce the amount of contention on the pool_lock, the actual kmem_cache_free() of the debug objects will be delayed if the pool_lock is busy. This will temporarily increase the amount of free objects available at the free pool when the system is busy. As a result, the number of kmem_cache allocation and freeing is reduced. To further reduce the lock operations free debug objects in batches of four. Signed-off-by: Waiman Long <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: "Du Changbin" <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jan Stancek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-02-04debugobjects: Scale thresholds with # of CPUsWaiman Long1-5/+15
On a large SMP systems with hundreds of CPUs, the current thresholds for allocating and freeing debug objects (256 and 1024 respectively) may not work well. This can cause a lot of needless calls to kmem_aloc() and kmem_free() on those systems. To alleviate this thrashing problem, the object freeing threshold is now increased to "1024 + # of CPUs * 32". Whereas the object allocation threshold is increased to "256 + # of CPUs * 4". That should make the debug objects subsystem scale better with the number of CPUs available in the system. Signed-off-by: Waiman Long <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: "Du Changbin" <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jan Stancek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-02-04debugobjects: Track number of kmem_cache_alloc/kmem_cache_free doneWaiman Long1-0/+10
New debugfs stat counters are added to track the numbers of kmem_cache_alloc() and kmem_cache_free() function calls to get a sense of how the internal debug objects cache management is performing. Signed-off-by: Waiman Long <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: "Du Changbin" <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Jan Stancek <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Thomas Gleixner <[email protected]>
2017-02-03lib: Introduce priority array area managerJiri Pirko5-0/+787
This introduces a infrastructure for management of linear priority areas. Priority order in an array matters, however order of items inside a priority group does not matter. As an initial implementation, L-sort algorithm is used. It is quite trivial. More advanced algorithm called P-sort will be introduced as a follow-up. The infrastructure is prepared for other algos. Alongside this, a testing module is introduced as well. Signed-off-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]>
2017-02-02ext4: move halfmd4 into hash.c directlyJason A. Donenfeld2-68/+1
The "half md4" transform should not be used by any new code. And fortunately, it's only used now by ext4. Since ext4 supports several hashing methods, at some point it might be desirable to move to something like SipHash. As an intermediate step, remove half md4 from cryptohash.h and lib, and make it just a local function in ext4's hash.c. There's precedent for doing this; the other function ext can use for its hashes -- TEA -- is also implemented in the same place. Also, by being a local function, this might allow gcc to perform some additional optimizations. Signed-off-by: Jason A. Donenfeld <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Cc: Theodore Ts'o <[email protected]> Signed-off-by: Theodore Ts'o <[email protected]>
2017-02-01Merge tag 'drm-misc-next-2017-01-30' of ↵Dave Airlie1-1/+2
git://anongit.freedesktop.org/git/drm-misc into drm-next Another round of -misc stuff: - Noralf debugfs cleanup cleanup (not yet everything, some more driver patches awaiting acks). - More doc work. - edid/infoframe fixes from Ville. - misc 1-patch fixes all over, as usual Noralf needs this for his tinydrm pull request. * tag 'drm-misc-next-2017-01-30' of git://anongit.freedesktop.org/git/drm-misc: (48 commits) drm/vc4: Remove vc4_debugfs_cleanup() dma/fence: Export enable-signaling tracepoint for emission by drivers drm/tilcdc: Remove tilcdc_debugfs_cleanup() drm/tegra: Remove tegra_debugfs_cleanup() drm/sti: Remove drm_debugfs_remove_files() calls drm/radeon: Remove drm_debugfs_remove_files() call drm/omap: Remove omap_debugfs_cleanup() drm/hdlcd: Remove hdlcd_debugfs_cleanup() drm/etnaviv: Remove etnaviv_debugfs_cleanup() drm/etnaviv: allow build with COMPILE_TEST drm/amd/amdgpu: Remove drm_debugfs_remove_files() call drm/prime: Clarify DMA-BUF/GEM Object lifetime drm/ttm: Make sure BOs being swapped out are cacheable drm/atomic: Remove drm_atomic_debugfs_cleanup() drm: drm_minor_register(): Clean up debugfs on failure drm: debugfs: Remove all files automatically on cleanup drm/fourcc: add vivante tiled layout format modifiers drm/edid: Set YQ bits in the AVI infoframe according to CEA-861-F drm/edid: Set AVI infoframe Q even when QS=0 drm/edid: Introduce drm_hdmi_avi_infoframe_quant_range() ...
2017-01-31Merge branch 'for-mingo' of ↵Ingo Molnar1-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull RCU changes from Paul E. McKenney: - Dynticks updates, consolidating open-coded counter accesses into a well-defined API - SRCU updates: Simplify algorithm, add formal verification - Documentation updates - Miscellaneous fixes - Torture-test updates Signed-off-by: Ingo Molnar <[email protected]>
2017-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-2/+1
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <[email protected]>
2017-01-27radix tree: constify some pointersMatthew Wilcox1-26/+31
If we're just getting the value of a tag, or looking up an entry, we won't modify the radix tree, so we can declare these functions as taking a const pointer. Mostly for documentation purposes, though it might help code generation. Signed-off-by: Matthew Wilcox <[email protected]>
2017-01-27sbitmap: add helpers for dumping to a seq_fileOmar Sandoval1-0/+91
This is useful debugging information that will be used in the blk-mq debugfs directory. Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Changed 'weight' to 'busy'. Signed-off-by: Jens Axboe <[email protected]>
2017-01-27Merge branch 'master' of ↵Dave Airlie5-29/+36
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux into drm-next Backmerge Linus master to get the connector locking revert. * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux: (645 commits) sysctl: fix proc_doulongvec_ms_jiffies_minmax() Revert "drm/probe-helpers: Drop locking from poll_enable" MAINTAINERS: add Dan Streetman to zbud maintainers MAINTAINERS: add Dan Streetman to zswap maintainers mm: do not export ioremap_page_range symbol for external module mn10300: fix build error of missing fpu_save() romfs: use different way to generate fsid for BLOCK or MTD frv: add missing atomic64 operations mm, page_alloc: fix premature OOM when racing with cpuset mems update mm, page_alloc: move cpuset seqcount checking to slowpath mm, page_alloc: fix fast-path race with cpuset update or removal mm, page_alloc: fix check for NULL preferred_zone kernel/panic.c: add missing \n fbdev: color map copying bounds checking frv: add atomic64_add_unless() mm/mempolicy.c: do not put mempolicy before using its nodemask radix-tree: fix private list warnings Documentation/filesystems/proc.txt: add VmPin mm, memcg: do not retry precharge charges proc: add a schedule point in proc_pid_readdir() ...
2017-01-25test_firmware: add test custom fallback triggerLuis R. Rodriguez1-0/+45
We have no custom fallback mechanism test interface. Provide one. This tests both the custom fallback mechanism and cancelling the it. Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-25test_firmware: use device attribute groupsLuis R. Rodriguez1-24/+11
This simplifies init and exit. Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-25test_firmware: move misc_device downLuis R. Rodriguez1-6/+6
This will make further changes easier to review. Signed-off-by: Luis R. Rodriguez <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-24mm: do not export ioremap_page_range symbol for external modulezhong jiang1-1/+0
Recently, I've found cases in which ioremap_page_range was used incorrectly, in external modules, leading to crashes. This can be partly attributed to the fact that ioremap_page_range is lower-level, with fewer protections, as compared to the other functions that an external module would typically call. Those include: ioremap_cache ioremap_nocache ioremap_prot ioremap_uc ioremap_wc ioremap_wt ...each of which wraps __ioremap_caller, which in turn provides a safer way to achieve the mapping. Therefore, stop EXPORT-ing ioremap_page_range. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: zhong jiang <[email protected]> Reviewed-by: John Hubbard <[email protected]> Suggested-by: John Hubbard <[email protected]> Acked-by: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-01-24radix-tree: fix private list warningsMatthew Wilcox1-1/+1
The newly introduced warning in radix_tree_free_nodes() was testing the wrong variable; it should have been 'old' instead of 'node'. Fixes: ea07b862ac8e ("mm: workingset: fix use-after-free in shadow node shrinker") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox <[email protected]> Signed-off-by: Johannes Weiner <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2017-01-24lib/dma-virt: Add dma_virt_opsBart Van Assche3-0/+78
Several RDMA drivers (hfi1, qib and rxe) expect that ib_sge.addr is a virtual address. Provide DMA mapping operations that are suitable for these drivers. Signed-off-by: Bart Van Assche <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-01-24lib/dma-noop: Only build dma_noop_ops for s390 and m32rBart Van Assche2-1/+6
Reduce the kernel size by only building dma_noop_ops for those architectures that actually use it. This was suggested by Christoph Hellwig. Signed-off-by: Bart Van Assche <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Cc: Christoph Hellwig <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-01-24lib/dma-noop: Clarify a commentBart Van Assche1-1/+1
The next patch in this series will introduce another set of DMA operations that map 1:1 with memory. Clarify that dma-noop maps to physical addresses. Signed-off-by: Bart Van Assche <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Joerg Roedel <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Michael S. Tsirkin <[email protected]> Signed-off-by: Doug Ledford <[email protected]>
2017-01-24treewide: Constify most dma_map_ops structuresBart Van Assche1-1/+1
Most dma_map_ops structures are never modified. Constify these structures such that these can be write-protected. This patch has been generated as follows: git grep -l 'struct dma_map_ops' | xargs -d\\n sed -i \ -e 's/struct dma_map_ops/const struct dma_map_ops/g' \ -e 's/const struct dma_map_ops {/struct dma_map_ops {/g' \ -e 's/^const struct dma_map_ops;$/struct dma_map_ops;/' \ -e 's/const const struct dma_map_ops /const struct dma_map_ops /g'; sed -i -e 's/const \(struct dma_map_ops intel_dma_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops intel_dma_ops'); sed -i -e 's/const \(struct dma_map_ops dma_iommu_ops\)/\1/' \ $(git grep -l 'struct dma_map_ops' | grep ^arch/powerpc); sed -i -e '/^struct vmd_dev {$/,/^};$/ s/const \(struct dma_map_ops[[:blank:]]dma_ops;\)/\1/' \ -e '/^static void vmd_setup_dma_ops/,/^}$/ s/const \(struct dma_map_ops \*dest\)/\1/' \ -e 's/const \(struct dma_map_ops \*dest = \&vmd->dma_ops\)/\1/' \ drivers/pci/host/*.c sed -i -e '/^void __init pci_iommu_alloc(void)$/,/^}$/ s/dma_ops->/intel_dma_ops./' arch/ia64/kernel/pci-dma.c sed -i -e 's/static const struct dma_map_ops sn_dma_ops/static struct dma_map_ops sn_dma_ops/' arch/ia64/sn/pci/pci_dma.c sed -i -e 's/(const struct dma_map_ops \*)//' drivers/misc/mic/bus/vop_bus.c Signed-off-by: Bart Van Assche <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: David Woodhouse <[email protected]> Cc: Juergen Gross <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: [email protected] Cc: [email protected] Cc: Russell King <[email protected]> Cc: [email protected] Signed-off-by: Doug Ledford <[email protected]>
2017-01-23rcu: Enable RCU tracepoints by default to aid in debuggingMatt Fleming1-0/+1
While debugging a performance issue I needed to understand why RCU sofitrqs were firing so frequently. Unfortunately, the RCU callback tracepoints are hidden behind CONFIG_RCU_TRACE which defaults to off in the upstream kernel and is likely to also be disabled in enterprise distribution configs. Enable it by default for CONFIG_TREE_RCU. However, we must keep it disabled for tiny RCU, because it would otherwise pull in a large amount of code that would make tiny RCU less than tiny. I ran some file system metadata intensive workloads (git checkout, FS-Mark) on a variety of machines with this patch and saw no detectable change in performance. Cc: Mel Gorman <[email protected]> Signed-off-by: Matt Fleming <[email protected]> Signed-off-by: Paul E. McKenney <[email protected]> Reviewed-by: Josh Triplett <[email protected]>
2017-01-23lib/prime_numbers: Suppress warn on kmalloc failureChris Wilson1-1/+2
The allocation for the bitmap may become very large, larger than MAX_ORDER, for large requests. We fail gracefully by falling back to trail-division, so disable the warning from kmalloc: 521.961092] WARNING: CPU: 0 PID: 30637 at mm/page_alloc.c:3548 __alloc_pages_slowpath+0x237/0x9a0 [ 521.961105] Modules linked in: i915(+) drm_kms_helper intel_gtt prime_numbers [last unloaded: drm_kms_helper] [ 521.961126] CPU: 0 PID: 30637 Comm: drv_selftest Tainted: G U W 4.10.0-rc3+ #321 [ 521.961137] Hardware name: / , BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015 [ 521.961148] Call Trace: [ 521.961161] dump_stack+0x4d/0x6f [ 521.961172] __warn+0xc1/0xe0 [ 521.961181] warn_slowpath_null+0x18/0x20 [ 521.961189] __alloc_pages_slowpath+0x237/0x9a0 [ 521.961200] ? sg_init_table+0x1a/0x40 [ 521.961208] ? get_page_from_freelist+0x3fa/0x910 [ 521.961275] ? i915_gem_object_get_sg+0x272/0x2b0 [i915] [ 521.961285] __alloc_pages_nodemask+0x1ea/0x220 [ 521.961295] kmalloc_order+0x1c/0x50 [ 521.961304] __kmalloc+0x115/0x170 [ 521.961314] expand_to_next_prime+0x43/0x180 [prime_numbers] [ 521.961324] next_prime_number+0x47/0xc0 [prime_numbers] [ 521.961377] igt_vma_rotate+0x386/0x590 [i915] [ 521.961429] i915_subtests+0x37/0xc0 [i915] [ 521.961481] i915_vma_mock_selftests+0x3d/0x70 [i915] [ 521.961532] run_selftests+0x16e/0x1f0 [i915] [ 521.961541] ? 0xffffffffa02a4000 [ 521.961592] i915_mock_selftests+0x29/0x40 [i915] [ 521.961638] i915_init+0xa/0x5e [i915] [ 521.961646] ? 0xffffffffa02a4000 [ 521.961655] do_one_initcall+0x3e/0x160 [ 521.961664] ? __vunmap+0x7c/0xc0 [ 521.961672] ? vfree+0x29/0x70 [ 521.961680] ? kmem_cache_alloc+0xcf/0x120 [ 521.961690] do_init_module+0x55/0x1c4 [ 521.961699] load_module+0x1f3f/0x25b0 [ 521.961707] ? __symbol_put+0x40/0x40 [ 521.961716] ? kernel_read_file+0x100/0x190 [ 521.961725] SYSC_finit_module+0xbc/0xf0 [ 521.961734] SyS_finit_module+0x9/0x10 [ 521.961744] entry_SYSCALL_64_fastpath+0x17/0x98 [ 521.961752] RIP: 0033:0x7f111aca4119 [ 521.961760] RSP: 002b:00007ffd8be6cbe8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 521.961773] RAX: ffffffffffffffda RBX: 0000000000000006 RCX: 00007f111aca4119 [ 521.961781] RDX: 0000000000000000 RSI: 000055dfc18bc8e0 RDI: 0000000000000006 [ 521.961789] RBP: 00007ffd8be6bbe0 R08: 0000000000000000 R09: 0000000000000000 [ 521.961796] R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000005 [ 521.961805] R13: 000055dfc18bd3a0 R14: 00007ffd8be6bbc0 R15: 0000000000000005 Signed-off-by: Chris Wilson <[email protected]> Cc: Joonas Lahtinen <[email protected]> Cc: Daniel Vetter <[email protected]> Reviewed-by: Joonas Lahtinen <[email protected]> Signed-off-by: Daniel Vetter <[email protected]> Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
2017-01-20percpu_counter: percpu_counter_hotcpu_callback() cleanupEric Dumazet1-3/+2
In commit ebd8fef304f9 ("percpu_counter: make percpu_counters_lock irq-safe") we disabled irqs in percpu_counter_hotcpu_callback() We can grab every counter spinlock without having to disable irqs again. Signed-off-by: Eric Dumazet <[email protected]> Cc: Tejun Heo <[email protected]> Signed-off-by: Tejun Heo <[email protected]>
2017-01-20timerqueue: Use rb_entry_safe() instead of open-coding itGeliang Tang1-2/+1
Signed-off-by: Geliang Tang <[email protected]> Cc: Andrew Morton <[email protected]> Cc: John Stultz <[email protected]> Link: http://lkml.kernel.org/r/0d5cf199ac43792df0b6f7e2145545c30fa1dbbe.1482222135.git.geliangtang@gmail.com Signed-off-by: Thomas Gleixner <[email protected]>
2017-01-18sbitmap: fix wakeup hang after sbq resizeOmar Sandoval1-5/+30
When we resize a struct sbitmap_queue, we update the wakeup batch size, but we don't update the wait count in the struct sbq_wait_states. If we resized down from a size which could use a bigger batch size, these counts could be too large and cause us to miss necessary wakeups. To fix this, update the wait counts when we resize (ensuring some careful memory ordering so that it's safe w.r.t. concurrent clears). This also fixes a theoretical issue where two threads could end up bumping the wait count up by the batch size, which could also potentially lead to hangs. Reported-by: Martin Raiber <[email protected]> Fixes: e3a2b3f931f5 ("blk-mq: allow changing of queue depth through sysfs") Fixes: 2971c35f3588 ("blk-mq: bitmap tag: fix race on blk_mq_bitmap_tags::wake_cnt") Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-18sbitmap: use smp_mb__after_atomic() in sbq_wake_up()Omar Sandoval1-3/+10
We always do an atomic clear_bit() right before we call sbq_wake_up(), so we can use smp_mb__after_atomic(). While we're here, comment the memory barriers in here a little more. Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: Jens Axboe <[email protected]>
2017-01-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller2-26/+34
2017-01-17Merge branch 'stable/for-linus-4.10' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb Pull swiotlb fix from Konrad Rzeszutek Wilk: "A tiny fix to make sure that page-sized mappings are page-aligned (and not say straddle two pages). This is important for some drivers (such as NVME)" * 'stable/for-linus-4.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb: swiotlb: ensure that page-sized mappings are page-aligned
2017-01-16Merge 4.10-rc4 into tty-nextGreg Kroah-Hartman2-24/+32
We want the serial/tty fixes in here as well to build on top of. Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-15swiotlb: ensure that page-sized mappings are page-alignedNikita Yushchenko1-3/+3
Some drivers do depend on page mappings to be page aligned. Swiotlb already enforces such alignment for mappings greater than page, extend that to page-sized mappings as well. Without this fix, nvme hits BUG() in nvme_setup_prps(), because that routine assumes page-aligned mappings. Signed-off-by: Nikita Yushchenko <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: Sagi Grimberg <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
2017-01-14Merge branch 'for-linus' of ↵Linus Torvalds1-23/+31
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro. The most notable fix here is probably the fix for a splice regression ("fix a fencepost error in pipe_advance()") noticed by Alan Wylie. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix a fencepost error in pipe_advance() coredump: Ensure proper size of sparse core files aio: fix lock dep warning tmpfs: clear S_ISGID when setting posix ACLs
2017-01-14fix a fencepost error in pipe_advance()Al Viro1-23/+31
The logics in pipe_advance() used to release all buffers past the new position failed in cases when the number of buffers to release was equal to pipe->buffers. If that happened, none of them had been released, leaving pipe full. Worse, it was trivial to trigger and we end up with pipe full of uninitialized pages. IOW, it's an infoleak. Cc: [email protected] # v4.9 Reported-by: "Alan J. Wylie" <[email protected]> Tested-by: "Alan J. Wylie" <[email protected]> Signed-off-by: Al Viro <[email protected]>
2017-01-14locking/ww_mutex: Begin kselftests for ww_mutexChris Wilson1-0/+12
Signed-off-by: Chris Wilson <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Linus Torvalds <[email protected]> Cc: Maarten Lankhorst <[email protected]> Cc: Nicolai Hähnle <[email protected]> Cc: Paul E. McKenney <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>
2017-01-12Merge branch 'aarch64/for-next/debug-virtual' into aarch64/for-next/coreWill Deacon1-1/+4
Merge core DEBUG_VIRTUAL changes from Laura Abbott. Later arm and arm64 support depends on these. * aarch64/for-next/debug-virtual: drivers: firmware: psci: Use __pa_symbol for kernel symbol mm/usercopy: Switch to using lm_alias mm/kasan: Switch to using __pa_symbol and lm_alias kexec: Switch to __pa_symbol mm: Introduce lm_alias mm/cma: Cleanup highmem check lib/Kconfig.debug: Add ARCH_HAS_DEBUG_VIRTUAL
2017-01-12serial: do not accept sysrq characters via serial portFelix Fietkau1-0/+10
many embedded boards have a disconnected TTL level serial which can generate some garbage that can lead to spurious false sysrq detects. Signed-off-by: John Crispin <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
2017-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller1-1/+1
Two AF_* families adding entries to the lockdep tables at the same time. Signed-off-by: David S. Miller <[email protected]>