Various coding style tweaks to various files under mm/
[[email protected]: mm/swapfile: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/sparse: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/vmscan: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/compaction: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/oom_kill: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/shmem: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/page_alloc: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/filemap: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/mlock: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/frontswap: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/vmalloc: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/memory_hotplug: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: mm/mempolicy: minor coding style tweaks]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Zhiyuan Dai <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Instead of keeping the open-coded style, move the code related to preloading
into a separate function. Therefore introduce the preload_this_cpu_lock()
routine that preloads the current CPU with one extra vmap_area object.
There is no functional change as a result of this patch.
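For reference, a minimal sketch of the factored-out helper, assuming the
upstream per-CPU cache ne_fit_preload_node and the vmap_area_cachep slab:
<snip>
static void preload_this_cpu_lock(spinlock_t *lock, gfp_t gfp_mask, int node)
{
        struct vmap_area *va = NULL;

        /* Preload only if this CPU has no cached spare object yet. */
        if (!this_cpu_read(ne_fit_preload_node))
                va = kmem_cache_alloc_node(vmap_area_cachep, gfp_mask, node);

        spin_lock(lock);

        /* Another context preloaded meanwhile: release the spare object. */
        if (va && __this_cpu_cmpxchg(ne_fit_preload_node, NULL, va))
                kmem_cache_free(vmap_area_cachep, va);
}
<snip>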
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Steven Rostedt <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
A potential use after free can occur in _vm_unmap_aliases where an already
freed vmap_area could be accessed. Consider the following scenario:
Process 1                                  Process 2
__vm_unmap_aliases()                       __vm_unmap_aliases()
  purge_fragmented_blocks_allcpus()          rcu_read_lock()
    rcu_read_lock()
    list_del_rcu(&vb->free_list)
                                             list_for_each_entry_rcu(vb .. )
__purge_vmap_area_lazy()
  kmem_cache_free(va)
                                             va_start = vb->va->va_start
Here Process 1 is in the purge path; it does list_del_rcu on the vmap_block
and later frees the vmap_area. Since Process 2 was holding the rcu lock at
this time, the vmap_block is still visible to it, so Process 2 accesses the
vmap_area of that vmap_block, which was already freed by Process 1; this
results in a use after free.
Fix this by adding a check for vb->dirty before accessing the vmap_area
structure: since vb->dirty is set to VMAP_BBMAP_BITS in the purge path,
checking for this prevents the use after free.
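A sketch of the resulting check in _vm_unmap_aliases(), hedged (field names
per mm/vmalloc.c): a fully purged block has vb->dirty == VMAP_BBMAP_BITS and
is now skipped before vb->va is dereferenced.
<snip>
rcu_read_lock();
list_for_each_entry_rcu(vb, &vbq->free, free_list) {
        spin_lock(&vb->lock);
        if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
                unsigned long va_start = vb->va->va_start;
                unsigned long s, e;

                s = va_start + (vb->dirty_min << PAGE_SHIFT);
                e = va_start + (vb->dirty_max << PAGE_SHIFT);

                start = min(s, start);
                end   = max(e, end);
                flush = 1;
        }
        spin_unlock(&vb->lock);
}
rcu_read_unlock();
<snip>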
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vijayanand Jitta <[email protected]>
Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There are several reasons why a vmalloc can fail: virtual space exhausted,
page array allocation failure, page allocation failure, and kernel page
table allocation failure.
Add distinct warning messages for the main causes of failure, with some
added information like page order or allocation size where applicable.
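The warnings end up looking roughly like the below; the exact wording and
variable names here are assumptions based on the description above.
<snip>
warn_alloc(gfp_mask, NULL,
           "vmalloc error: size %lu, vm_struct allocation failed",
           real_size);

warn_alloc(gfp_mask, NULL,
           "vmalloc error: size %lu, page order %u, failed to allocate pages",
           area->nr_pages * PAGE_SIZE, page_order);
<snip>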
[[email protected]: print correct vmalloc allocation size]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Cédric Le Goater <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is a shim around vunmap_range; get rid of it.
Move the main API comment from the _noflush variant to the normal
variant, and make _noflush internal to mm/.
[[email protected]: fix nommu builds and a comment bug per sfr]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: move vunmap_range_noflush() stub inside !CONFIG_MMU, not !CONFIG_NUMA]
[[email protected]: fix nommu builds]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Cédric Le Goater <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm/vmalloc: cleanup after hugepage series", v2.
Christoph pointed out some overdue cleanups required after the huge
vmalloc series, and I had another failure error message improvement as
well.
This patch (of 5):
This is a shim around vmap_pages_range; get rid of it.
Move the main API comment from the _noflush variant to the normal variant,
and make _noflush internal to mm/.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Uladzislau Rezki <[email protected]>
Cc: Cédric Le Goater <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Support huge page vmalloc mappings. The config option HAVE_ARCH_HUGE_VMALLOC
enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
support PMD-sized vmap mappings.
vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or
larger, and fall back to small pages if that was unsuccessful.
Architectures must ensure that any arch-specific vmalloc allocations that
require PAGE_SIZE mappings (e.g., module allocations vs strict module rwx)
use the VM_NOHUGE flag to inhibit larger mappings.
This can result in more internal fragmentation and memory overhead for a
given allocation, so a boot option, nohugevmalloc, is added to disable it,
as sketched below.
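A hedged sketch of the resulting policy; vmap_allow_huge is the internal
switch that the nohugevmalloc boot option clears, the flag name follows the
text above, and the helper itself is hypothetical:
<snip>
static bool want_huge_vmalloc(unsigned long size, unsigned long vm_flags)
{
        /*
         * Attempt PMD-sized mappings only when globally enabled
         * (nohugevmalloc clears vmap_allow_huge), when the caller did
         * not inhibit them via VM_NOHUGE, and when the request is at
         * least PMD-sized.
         */
        return vmap_allow_huge && !(vm_flags & VM_NOHUGE) &&
               size >= PMD_SIZE;
}
<snip>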
[[email protected]: fix read of uninitialized pointer area]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Ding Tianhong <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
As a side-effect, the order of the flush_cache_vmap() and
arch_sync_kernel_mappings() calls is switched, but that now matches the
other callers in this file.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ding Tianhong <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This is a generic kernel virtual memory mapper, not specific to ioremap.
Code is unchanged other than making vmap_range non-static.
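For reference, the signature that becomes visible outside the file
(parameter names assumed):
<snip>
int vmap_range(unsigned long addr, unsigned long end,
               phys_addr_t phys_addr, pgprot_t prot,
               unsigned int max_page_shift);
<snip>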
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ding Tianhong <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The vmalloc mapper operates on a struct page * array rather than a linear
physical address; rename it to make this distinction clear.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ding Tianhong <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
vmalloc_to_page returns NULL for addresses mapped by larger pages[*].
Whether or not a vmap is huge depends on the architecture details,
alignments, boot options, etc., which the caller cannot be expected to
know. Therefore HUGE_VMAP is a regression for vmalloc_to_page.
This change teaches vmalloc_to_page about larger pages, and returns the
struct page that corresponds to the offset within the large page. This
makes the API agnostic to mapping implementation details.
[*] As explained by commit 029c54b095995 ("mm/vmalloc.c: huge-vmap:
fail gracefully on unexpected huge vmap mappings")
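A hedged sketch of the huge-page cases: instead of returning NULL, compute
the subpage from the offset within the large mapping.
<snip>
if (pud_leaf(*pud))
        return pud_page(*pud) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);

if (pmd_leaf(*pmd))
        return pmd_page(*pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
<snip>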
[[email protected]: sparc32: add stub pud_page define for walking huge vmalloc page tables]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Nicholas Piggin <[email protected]>
Reviewed-by: Miaohe Lin <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Ding Tianhong <[email protected]>
Cc: "H. Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Russell King <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
vread() has been linearly searching vmap_area_list to look up the vmalloc
areas to read from. These same areas are also tracked by an rb-tree
(vmap_area_root) which offers logarithmic lookup.
This patch modifies vread() to use the rb-tree structure instead of the
list, and the speedup for heavy /proc/kcore readers can be pretty
significant. Below are wall-clock measurements of a Python
application that leverages the drgn debugging library to read and
interpret data from /proc/kcore; a sketch of the change follows them.
Before the patch:
-----
$ time sudo sdb -e 'dbuf | head 3000 | wc'
(unsigned long)3000
real 0m22.446s
user 0m2.321s
sys 0m20.690s
-----
With the patch:
-----
$ time sudo sdb -e 'dbuf | head 3000 | wc'
(unsigned long)3000
real 0m2.104s
user 0m2.043s
sys 0m0.921s
-----
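A hedged sketch of the lookup change: the first area is found via the
rb-tree, and iteration continues linearly from there.
<snip>
spin_lock(&vmap_area_lock);
va = __find_vmap_area((unsigned long)addr);     /* O(log n) rb-tree lookup */
if (!va)
        goto finished;
list_for_each_entry_from(va, &vmap_area_list, list) {
        /* copy out readable ranges, as before */
}
<snip>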
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
remap_vmalloc_range_partial is only used to implement remap_vmalloc_range
and by procfs. Unexport it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Cc: Kirti Wankhede <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The mem_dump_obj() functionality adds a few hundred bytes, which is a
small price to pay, except on kernels built with CONFIG_PRINTK=n, on
which mem_dump_obj() messages will be suppressed anyway. This commit
therefore makes mem_dump_obj() a static inline empty function on kernels
built with CONFIG_PRINTK=n and excludes all of its support functions as
well.
This avoids kernel bloat on systems that cannot use mem_dump_obj().
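The header side of this amounts to the following shape, consistent with the
description:
<snip>
#ifdef CONFIG_PRINTK
void mem_dump_obj(void *object);
#else
static inline void mem_dump_obj(void *object) { }
#endif
<snip>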
Cc: Christoph Lameter <[email protected]>
Cc: Pekka Enberg <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu
Pull RCU updates from Paul E. McKenney:
- Documentation updates.
- Miscellaneous fixes.
- kfree_rcu() updates: Addition of mem_dump_obj() to provide allocator return
addresses to more easily locate bugs. This has a couple of RCU-related commits,
but is mostly MM. Was pulled in with akpm's agreement.
- Per-callback-batch tracking of numbers of callbacks,
which enables better debugging information and smarter
reactions to large numbers of callbacks.
- The first round of changes to allow CPUs to be runtime switched from and to
callback-offloaded state.
- CONFIG_PREEMPT_RT-related changes.
- RCU CPU stall warning updates.
- Addition of polling grace-period APIs for SRCU.
- Torture-test and torture-test scripting updates, including a "torture everything"
script that runs rcutorture, locktorture, scftorture, rcuscale, and refscale,
plus an allmodconfig build.
Signed-off-by: Ingo Molnar <[email protected]>
|
|
This commit adds the starting address and number of pages to the vmalloc()
information dumped by way of vmalloc_dump_obj().
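The dumped line becomes something like the below; the exact wording is an
assumption, field names per struct vm_struct.
<snip>
pr_cont(" %u-page vmalloc region starting at %#lx allocated at %pS\n",
        vm->nr_pages, (unsigned long)vm->addr, vm->caller);
<snip>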
Cc: Andrew Morton <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: <[email protected]>
Reported-by: Andrii Nakryiko <[email protected]>
Suggested-by: Vlastimil Babka <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Tested-by: Naresh Kamboju <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
This commit adds vmalloc() support to mem_dump_obj(). Note that the
vmalloc_dump_obj() function combines the checking and dumping, in
contrast with the split between kmem_valid_obj() and kmem_dump_obj().
The reason for the difference is that the checking in the vmalloc()
case involves acquiring a global lock, and redundant acquisitions of
global locks should be avoided, even on not-so-fast paths.
Note that this change causes on-stack variables to be reported as
vmalloc() storage from kernel_clone() or similar, depending on the degree
of inlining that your compiler does. This is likely more helpful than
the earlier "non-paged (local) memory".
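A hedged sketch of the combined check-and-dump; find_vm_area() takes the
global lock internally, so it is called only once.
<snip>
bool vmalloc_dump_obj(void *object)
{
        struct vm_struct *vm = find_vm_area(object);

        if (!vm)
                return false;
        pr_cont(" vmalloc allocated at %pS\n", vm->caller);
        return true;
}
<snip>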
Cc: Andrew Morton <[email protected]>
Cc: Joonsoo Kim <[email protected]>
Cc: <[email protected]>
Reported-by: Andrii Nakryiko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
Tested-by: Naresh Kamboju <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
|
|
In the VM_MAP_PUT_PAGES case, we should put pages and free the array in
vfree. But we failed to set area->nr_pages in vmap(), so we would fail to
put the pages in __vunmap() because area->nr_pages = 0.
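The fix amounts to recording the count when vmap() takes ownership of the
array:
<snip>
if (flags & VM_MAP_PUT_PAGES) {
        area->pages = pages;
        area->nr_pages = count; /* previously left at 0 */
}
<snip>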
Link: https://lkml.kernel.org/r/[email protected]
Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
Signed-off-by: Shijie Luo <[email protected]>
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The size of a vm area can be affected by the presence or absence of the
guard page. In particular, unless VM_NO_GUARD is set, the actual accessible
size is the real size minus the guard page.
Currently kasan does not take this information into account during the
poison operation and in particular tries to poison the guard page as well.
This approach, even if incorrect, does not cause an issue because the tags
for the guard page are written in the shadow memory. With the future
introduction of the Tag-Based KASAN, the guard page being inaccessible by
nature, the write-tag operation on this page triggers a fault.
Fix the kasan shadow poisoning size by invoking get_vm_area_size() instead
of accessing the field in the data structure directly, to obtain the
correct value.
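For reference, the helper in question, as defined in include/linux/vmalloc.h:
the accessible size excludes the trailing guard page unless VM_NO_GUARD is
set.
<snip>
static inline size_t get_vm_area_size(const struct vm_struct *area)
{
        if (!(area->flags & VM_NO_GUARD))
                /* return actual size without guard page */
                return area->size - PAGE_SIZE;
        return area->size;
}
<snip>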
Link: https://lkml.kernel.org/r/[email protected]
Fixes: d98c9e83b5e7c ("kasan: fix crashes on access to memory mapped by vm_map_ram()")
Signed-off-by: Vincenzo Frascino <[email protected]>
Cc: Andrey Konovalov <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Alexander Potapenko <[email protected]>
Cc: Marco Elver <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When multiple locks are acquired, they should be released in reverse
order. For s_start() and s_stop() in mm/vmalloc.c, that is not the
case.
s_start: mutex_lock(&vmap_purge_lock); spin_lock(&vmap_area_lock);
s_stop : mutex_unlock(&vmap_purge_lock); spin_unlock(&vmap_area_lock);
This unlock sequence, though allowed, is not optimal. If a waiter is
present, mutex_unlock() will need to go through the slowpath of waking
up the waiter with preemption disabled. Fix that by releasing the
spinlock first before the mutex.
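The fixed unlock path, sketched:
<snip>
static void s_stop(struct seq_file *m, void *p)
{
        spin_unlock(&vmap_area_lock);
        mutex_unlock(&vmap_purge_lock);
}
<snip>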
Link: https://lkml.kernel.org/r/[email protected]
Fixes: e36176be1c39 ("mm/vmalloc: rework vmap_area_lock")
Signed-off-by: Waiman Long <[email protected]>
Reviewed-by: Uladzislau Rezki (Sony) <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove unnecessary return statement for void function.
Link: https://lkml.kernel.org/r/ca23f89259c80c3562700ae6e227b2815a195853.1606891153.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Kernel-doc markup has an issue in pvm_determine_end_from_reverse:
mm/vmalloc.c:3145: warning: Function parameter or member 'align' not described in 'pvm_determine_end_from_reverse'
Add an explanation for it to remove the warning.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Alex Shi <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Randy Dunlap <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
A current "lazy drain" model suffers from at least two issues.
First one is related to the unsorted list of vmap areas, thus in order to
identify the [min:max] range of areas to be drained, it requires a full
list scan. What is a time consuming if the list is too long.
Second one and as a next step is about merging all fragments with a free
space. What is also a time consuming because it has to iterate over
entire list which holds outstanding lazy areas.
See below the "preemptirqsoff" tracer that illustrates a high latency. It
is ~24676us. Our workloads like audio and video are effected by such long
latency:
<snip>
tracer: preemptirqsoff
preemptirqsoff latency trace v1.1.5 on 4.9.186-perf+
--------------------------------------------------------------------
latency: 24676 us, #4/4, CPU#1 | (M:preempt VP:0, KP:0, SP:0 HP:0 P:8)
-----------------
| task: crtc_commit:112-261 (uid:0 nice:0 policy:1 rt_prio:16)
-----------------
=> started at: __purge_vmap_area_lazy
=> ended at: __purge_vmap_area_lazy
_------=> CPU#
/ _-----=> irqs-off
| / _----=> need-resched
|| / _---=> hardirq/softirq
||| / _--=> preempt-depth
|||| / delay
cmd pid ||||| time | caller
\ / ||||| \ | /
crtc_com-261 1...1 1us*: _raw_spin_lock <-__purge_vmap_area_lazy
[...]
crtc_com-261 1...1 24675us : _raw_spin_unlock <-__purge_vmap_area_lazy
crtc_com-261 1...1 24677us : trace_preempt_on <-__purge_vmap_area_lazy
crtc_com-261 1...1 24683us : <stack trace>
=> free_vmap_area_noflush
=> remove_vm_area
=> __vunmap
=> vfree
=> drm_property_free_blob
=> drm_mode_object_unreference
=> drm_property_unreference_blob
=> __drm_atomic_helper_crtc_destroy_state
=> sde_crtc_destroy_state
=> drm_atomic_state_default_clear
=> drm_atomic_state_clear
=> drm_atomic_state_free
=> complete_commit
=> _msm_drm_commit_work_cb
=> kthread_worker_fn
=> kthread
=> ret_from_fork
<snip>
To address those two issues we can redesign the purging of the outstanding
lazy areas. Instead of queuing vmap areas to the list, we replace it with
a separate rb-tree (see the sketch after the list below). In that case an
area is located in the tree/list in ascending order. That gives us the
following advantages:
a) Outstanding vmap areas are merged creating bigger coalesced blocks,
thus it becomes less fragmented.
b) It is possible to calculate a flush range [min:max] without scanning
all elements. It is O(1) access time or complexity;
c) The final merge of areas with the rb-tree that represents the free
space is faster because of (a). As a result the lock contention is
also reduced.
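A hedged sketch of the new queueing step, with the tree/list names assumed
from the description: a lazily freed VA is merged into a dedicated sorted
structure instead of being appended to an unsorted list.
<snip>
spin_lock(&purge_vmap_area_lock);
merge_or_add_vmap_area(va, &purge_vmap_area_root, &purge_vmap_area_list);
spin_unlock(&purge_vmap_area_lock);
<snip>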
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: huang ying <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a dedicated and separate function that finds and removes a
continuous kernel virtual area. As a final step it also releases the
"area", a descriptor of the corresponding vm_struct.
Use free_vmap_area() in __vmalloc_node_range() instead of the open-coded
steps, which are exactly the same, to perform a cleanup.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
On a machine with 3 TB (more than 2 TB of memory), if you use vmalloc to
allocate more than 2 TB of memory, the array_size below will overflow.
The array_size is an unsigned int and can only describe allocations of
less than 2 TB. If you pass 2*1024*1024*1024*1024 = 2 * 2^40 as the
argument of vmalloc, the array_size will become 2*2^31 = 2^32 (2^29 page
pointers of 8 bytes each), and 2^32 cannot be stored in a 32-bit integer.
The fix is to change the type of array_size to unsigned long.
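Sketched, the change in __vmalloc_area_node() is simply:
<snip>
unsigned long array_size;

array_size = (unsigned long)nr_pages * sizeof(struct page *);
<snip>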
[[email protected]: rework for current mainline]
Link: https://bugzilla.kernel.org/show_bug.cgi?id=210023
Reported-by: <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
No point in having the filename inside the file.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "two small vmalloc cleanups".
This patch (of 2):
__vmalloc_area_node currently has four different gfp_t variables to
just express this simple logic:
- use the passed in mask, plus __GFP_NOWARN and __GFP_HIGHMEM (if
suitable) for the underlying page allocation
- use just the reclaim flags from the passed in mask plus __GFP_ZERO
for allocating the page array
Simplify this down to just use the pre-existing nested_gfp as-is for
the page array allocation, and just the passed in gfp_mask for the
page allocation, after conditionally ORing __GFP_HIGHMEM into it. This
also makes the allocation warning a little more correct.
Also initialize two variables at the time of declaration while touching
this area.
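A hedged sketch of the resulting mask handling:
<snip>
/* Page array: reclaim flags from the caller's mask, zeroed. */
const gfp_t nested_gfp = (gfp_mask & GFP_RECLAIM_MASK) | __GFP_ZERO;

/* Pages: the caller's mask, plus highmem where suitable. */
if (!(gfp_mask & (GFP_DMA | GFP_DMA32)))
        gfp_mask |= __GFP_HIGHMEM;
<snip>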
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
All users are gone now.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add a proper helper to remap PFNs into kernel virtual space so that
drivers don't have to abuse alloc_vm_area and open coded PTE manipulation
for it.
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Add a flag so that vmap takes ownership of the passed in page array. When
vfree is called on such an allocation it will put one reference on each
page, and free the page array itself.
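A hypothetical usage sketch; the array should come from kvmalloc_array(),
since vfree() will free it together with the mapping.
<snip>
static void *map_and_own_pages(struct page **pages, unsigned int count)
{
        /*
         * On success, ownership of @pages (a kvmalloc'ed array of page
         * references) moves to the mapping: a later vfree() puts one
         * reference on each page and frees the array itself.
         */
        return vmap(pages, count, VM_MAP | VM_MAP_PUT_PAGES, PAGE_KERNEL);
}
<snip>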
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "remove alloc_vm_area", v4.
This series removes alloc_vm_area, which was left over from the big
vmalloc interface rework. It is a rather arkane interface, basicaly the
equivalent of get_vm_area + actually faulting in all PTEs in the allocated
area. It was originally addeds for Xen (which isn't modular to start
with), and then grew users in zsmalloc and i915 which seems to mostly
qualify as abuses of the interface, especially for i915 as a random driver
should not set up PTE bits directly.
This patch (of 11):
* Document that you can call vfree() on an address returned from vmap()
* Remove the note about the minimum size -- the minimum size of a vmalloc
allocation is one page
* Add a Context: section
* Fix capitalisation
* Reword the prohibition on calling from NMI context to avoid a double
negative
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Jani Nikula <[email protected]>
Cc: Joonas Lahtinen <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Rodrigo Vivi <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Uladzislau Rezki (Sony) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fix the comments of find_vm_area() and get_vm_area().
Signed-off-by: Hui Su <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/20200927153034.GA199877@rlk
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Since commit c67dc624757 ("mm/vmalloc: do not call kmemleak_free() on not
yet accounted memory"), __vunmap() has been changed to __vfree(), so
update the confusing comment.
Signed-off-by: Hui Su <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: Roman Penyaev <[email protected]>
Link: https://lkml.kernel.org/r/20200927155409.GA3315@rlk
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Like zap_pte_range, add cond_resched() so that we can avoid softlockups as
reported below. On a non-preemptible kernel with a large I/O map region
(like the one we get when using persistent memory in sector mode), an
unmap of the namespace can report the softlockup below (the change is
sketched after the trace).
[22724.027334] watchdog: BUG: soft lockup - CPU#49 stuck for 23s! [ndctl:50777]
NIP [c0000000000dc224] plpar_hcall+0x38/0x58
LR [c0000000000d8898] pSeries_lpar_hpte_invalidate+0x68/0xb0
Call Trace:
flush_hash_page+0x114/0x200
hpte_need_flush+0x2dc/0x540
vunmap_page_range+0x538/0x6f0
free_unmap_vmap_area+0x30/0x70
remove_vm_area+0xfc/0x140
__vunmap+0x68/0x270
__iounmap.part.0+0x34/0x60
memunmap+0x54/0x70
release_nodes+0x28c/0x300
device_release_driver_internal+0x16c/0x280
unbind_store+0x124/0x170
drv_attr_store+0x44/0x60
sysfs_kf_write+0x64/0x90
kernfs_fop_write+0x1b0/0x290
__vfs_write+0x3c/0x70
vfs_write+0xd8/0x260
ksys_write+0xdc/0x130
system_call+0x5c/0x70
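The change boils down to a periodic yield in the teardown loop, sketched
here at the PMD level:
<snip>
do {
        next = pmd_addr_end(addr, end);
        if (pmd_clear_huge(pmd))
                continue;
        if (pmd_none_or_clear_bad(pmd))
                continue;
        vunmap_pte_range(pmd, addr, next);

        cond_resched();         /* avoid softlockups on huge I/O mappings */
} while (pmd++, addr = next, addr != end);
<snip>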
Reported-by: Harish Sriram <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Cc: <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Get rid of the BUG() macro, which should be used only when a critical
situation happens and the system is not able to function anymore.
Replace it with the WARN() macro instead and dump some extra information
about the start/end addresses of both VAs which overlap. Such overlap data
can help to figure out what happened, making further analysis easier. For
example, if both areas are identical it could mean a double free.
The recovery process consists of declining all further steps regarding
insertion of the conflicting overlap range. In that sense find_va_links()
can now return NULL, so its return value has to be checked by callers.
A side effect of this process is that it can leak memory, but that is
better than just killing a machine for no good reason. Apart from that,
debugging can be done on a live system.
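A sketch of the resulting overlap handling in find_va_links(), hedged:
<snip>
if (va->va_end <= tmp_va->va_start)
        link = &(*link)->rb_left;
else if (va->va_start >= tmp_va->va_end)
        link = &(*link)->rb_right;
else {
        WARN(1, "vmalloc bug: 0x%lx-0x%lx overlaps with 0x%lx-0x%lx\n",
             va->va_start, va->va_end, tmp_va->va_start, tmp_va->va_end);
        return NULL;
}
<snip>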
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Oleksiy Avramchenko <[email protected]>
Cc: Steven Rostedt <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
'addr' is set to 'start' and then a few lines afterwards 'start' is set to
'addr'. Remove the second assignment.
Fixes: 2ba3e6947aed ("mm/vmalloc: track which page-table levels were modified")
Signed-off-by: Mike Rapoport <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Joerg Roedel <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Reflect information about the author, date and year when the KVA rework
was done.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The augment_tree_propagate_from() function uses its own implementation
that propagates a change from the specified node toward the root node.
On the other hand, the RB_DECLARE_CALLBACKS_MAX macro provides the
"propagate()" callback that does exactly the same. Having two similar
functions does not make sense and is redundant.
Reuse the "built in" functionality of the macro, so the code size gets
reduced.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This function is for debug purposes only. Currently it uses recursion for
tree traversal, checking an augmented value of each node to find out if it
is valid or not.
The recursion can corrupt the stack because the tree can be huge when
synthetic tests are applied. To prevent that, navigate the tree from
bottom to upper levels using the regular list instead, because the nodes
are also linked among each other. It is faster and avoids recursion.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Currently when a VA is deallocated and is about to be placed back into the
tree, it can either be merged with its next/prev neighbors or inserted if
not coalesced.
During those steps the tree can be populated several times, for example
when both neighbors are merged. That can be avoided and in fact simplified.
Therefore do it only once, when the VA points to the final merged area,
after all manipulations: merging/removing/inserting.
Signed-off-by: Uladzislau Rezki (Sony) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The radix tree of vmap blocks is simpler to express as an XArray. Reduces
both the text and data sizes of the object file and eliminates a user of
the radix tree preload API.
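A hedged sketch of the conversion, assuming the array is named vmap_blocks:
<snip>
static DEFINE_XARRAY(vmap_blocks);

/* Insertion no longer needs the radix tree preload dance. */
err = xa_insert(&vmap_blocks, vb_idx, vb, gfp_mask);

/* Lookup and removal. */
vb = xa_load(&vmap_blocks, addr_to_vb_idx(addr));
vb = xa_erase(&vmap_blocks, addr_to_vb_idx(addr));
<snip>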
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: William Kucharski <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The functions are only used in two source files, so there is no need for
them to be in the global <linux/mm.h> header. Move them to the new
<linux/pgalloc-track.h> header and include it only where needed.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Pekka Enberg <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Cc: Andy Lutomirski <[email protected]>
Cc: Abdul Haleem <[email protected]>
Cc: Satheesh Rajendran <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Mike Rapoport <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Max Filippov <[email protected]>
Cc: Stafford Horne <[email protected]>
Cc: Geert Uytterhoeven <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Merge vmalloc_exec into its only caller. Note that for !CONFIG_MMU
__vmalloc_node_range maps to __vmalloc, which directly clears the
__GFP_HIGHMEM added by the vmalloc_exec stub anyway.
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Christoph Hellwig <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Dexuan Cui <[email protected]>
Cc: Jessica Yu <[email protected]>
Cc: Vitaly Kuznetsov <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This patch fixes the following warning during "make xmldocs":
mm/vmalloc.c:1877: warning: Excess function parameter 'prot' description in 'vm_map_ram'
This warning has been present since commit d4efd79a81ab ("mm: remove the
prot argument from vm_map_ram").
Link: http://lkml.kernel.org/r/[email protected]
Fixes: d4efd79a81ab ("mm: remove the prot argument from vm_map_ram")
Signed-off-by: Masanari Iida <[email protected]>
Reviewed-by: Andrew Morton <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
There is a typo in a comment; fix it.
"nother" -> "another"
Signed-off-by: Jeongtae Park <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Andrey Ryabinin <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
These functions are not needed anymore because the vmalloc and ioremap
mappings are now synchronized when they are created or torn down.
Remove all callers and function definitions.
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Steven Rostedt (VMware) <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Track at which page-table levels entries were modified by vmap/vunmap.
After the page table has been modified, use that information to decide
whether the new arch_sync_kernel_mappings() needs to be called.
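A hedged sketch of how the collected mask is consumed; the walker
signature here is an assumption:
<snip>
pgtbl_mod_mask mask = 0;

/* Page-table walkers OR in PGTBL_{PTE,PMD,PUD,P4D}_MODIFIED. */
vunmap_page_range(start, end, &mask);

if (mask & ARCH_PAGE_TABLE_SYNC_MASK)
        arch_sync_kernel_mappings(start, end);
<snip>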
[[email protected]: map_kernel_range_noflush() needs the arch_sync_kernel_mappings() call]
Signed-off-by: Joerg Roedel <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Andy Lutomirski <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "H . Peter Anvin" <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Steven Rostedt (VMware) <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Open code it in __bpf_map_area_alloc, which is the only caller. Also clean
up __bpf_map_area_alloc to have a single vmalloc call with slightly
different flags instead of the current two different calls.
For this to compile for the nommu case, add a __vmalloc_node_range stub to
nommu.c.
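A heavily hedged sketch of the resulting open-coded call; the exact flag
composition is an assumption:
<snip>
return __vmalloc_node_range(size, align, VMALLOC_START, VMALLOC_END,
                            gfp_mask, PAGE_KERNEL, flags, numa_node,
                            __builtin_return_address(0));
<snip>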
[[email protected]: fix nommu.c build]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Sakari Ailus <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Stephen Rothwell <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
No need to export the very low-level __vmalloc_node_range when the test
module can use a slightly higher level variant.
[[email protected]: add missing `node' arg]
[[email protected]: fix riscv nommu build]
Signed-off-by: Christoph Hellwig <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Daniel Vetter <[email protected]>
Cc: David Airlie <[email protected]>
Cc: Gao Xiang <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Laura Abbott <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Michael Kelley <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Nitin Gupta <[email protected]>
Cc: Robin Murphy <[email protected]>
Cc: Sakari Ailus <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Heiko Carstens <[email protected]>
Cc: Paul Mackerras <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Will Deacon <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|