| Age | Commit message (Collapse) | Author | Files | Lines |
|
Avoid pointer type value compared with 0 to make code clear.
./tools/testing/radix-tree/maple.c:34142:15-16: WARNING comparing pointer to 0.
Link: https://lkml.kernel.org/r/[email protected]
Reported-by: Abaci Robot <[email protected]>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=7696
Signed-off-by: Jiapeng Chong <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
|
|
mas_preallocate() defaults to requesting 1 node for preallocation and then
,depending on the type of store, will update the request variable. There
isn't a check for a slot store type, so slot stores are preallocating the
default 1 node. Slot stores do not require any additional nodes, so add a
check for the slot store case that will bypass node_count_gfp(). Update
the tests to reflect that slot stores do not require allocations.
User visible effects of this bug include increased memory usage from the
unneeded node that was allocated.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 0b8bb544b1a7 ("maple_tree: update mas_preallocate() testing")
Signed-off-by: Sidhartha Kumar <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Cc: Matthew Wilcox (Oracle) <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: <[email protected]> [6.6+]
Signed-off-by: Andrew Morton <[email protected]>
|
|
Now that the status of the maple state is outside of the node, the
mas_searchable() function can be dropped for easier open-coding of what is
going on.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Peng Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The maple tree node is overloaded to keep status as well as the active
node. This, unfortunately, results in a re-walk on underflow or overflow.
Since the maple state has room, the status can be placed in its own enum
in the structure. Once an underflow/overflow is detected, certain modes
can restore the status to active and others may need to re-walk just that
one node to see the entry.
The status being an enum has the benefit of detecting unhandled status in
switch statements.
[[email protected]: fix comments about MAS_*]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: update forking to separate maple state and node]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix mas_prev() state separation code]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Peng Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Analysis of the mas_for_each() iteration showed that there is a
significant time spent finding the end of a node. This time can be
greatly reduced if the end of the node is cached in the maple state. Care
must be taken to update & invalidate as necessary.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Peng Zhang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Skip other tests when BENCH is enabled so that performance can be measured
in user space.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mateusz Guzik <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Christie <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add test for mtree_dup().
Test by duplicating different maple trees and then comparing the two
trees. Includes tests for duplicating full trees and memory allocation
failures on different nodes.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Cc: Christian Brauner <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Mateusz Guzik <[email protected]>
Cc: Mathieu Desnoyers <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Michael S. Tsirkin <[email protected]>
Cc: Mike Christie <[email protected]>
Cc: Nicholas Piggin <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Since the mas_preallocate() calculation has been updated to be more
precise, the testing must also be updated to check for what is expected.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The current preallocation strategy is to preallocate the absolute
worst-case allocation for a tree modification. The entry (or NULL) is
needed to know how many nodes are needed to write to the tree. Start by
adding the argument to the mas_preallocate() definition.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add test for expanding range in RCU mode. If we use the fast path of the
slot store to expand range in RCU mode, this test will fail.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peng Zhang <[email protected]>
Reviewed-by: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Internal node counting was altered and the 64 bit test was updated,
however the 32bit test was missed.
Restore the 32bit test to a functional state.
Link: https://lore.kernel.org/linux-mm/CAMuHMdV4T53fOw7VPoBgPR7fP6RYqf=CBhD_y_vOg53zZX_DnA@mail.gmail.com/
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 541e06b772c1 ("maple_tree: remove GFP_ZERO from kmem_cache_alloc() and kmem_cache_alloc_bulk()")
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The test functions are not needed after the module is removed, so mark
them as such. Add __exit to the module removal function. Some other
variables have been marked as const static as well.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Suggested-by: Andrew Morton <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Vernon Yang <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The test code is less useful without debug, but can still do general
validations. Define mt_dump(), mas_dump() and mas_wr_dump() as a noop if
debug is not enabled and document it in the test module information that
more information can be obtained with another kernel config option.
MT_BUG_ON() will report a failures without tree dumps, and the output will
be less useful.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Vernon Yang <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Allow different formatting strings to be used when dumping the tree.
Currently supports hex and decimal.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Peng Zhang <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Vernon Yang <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The maple tree node limits are implied by the parent. When walking up the
tree, the limit may not be known until a slot that does not have implied
limits are encountered. However, if the node is the left-most or
right-most node, the walking up to find that limit can be skipped.
This commit also fixes the debug/testing code that was not setting the
limit on walking down the tree as that optimization is not compatible with
this change.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Reviewed-by: Peng Zhang <[email protected]>
Cc: David Binderman <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Vernon Yang <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add a test case to check whether the number of maple_alloc structures is
actually equal to mas->alloc->total.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Peng Zhang <[email protected]>
Cc: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
During the development of the maple tree, the strategy of freeing multiple
nodes changed and, in the process, the pivots were reused to store
pointers to dead nodes. To ensure the readers see accurate pivots, the
writers need to mark the nodes as dead and call smp_wmb() to ensure any
readers can identify the node as dead before using the pivot values.
There were two places where the old method of marking the node as dead
without smp_wmb() were being used, which resulted in RCU readers seeing
the wrong pivot value before seeing the node was dead. Fix this race
condition by using mte_set_node_dead() which has the smp_wmb() call to
ensure the race is closed.
Add a WARN_ON() to the ma_free_rcu() call to ensure all nodes being freed
are marked as dead to ensure there are no other call paths besides the two
updated paths.
This is necessary for the RCU mode of the maple tree.
Link: https://lkml.kernel.org/r/[email protected]
Fixes: 54a611b60590 ("Maple Tree: add new data structure")
Signed-off-by: Liam R. Howlett <[email protected]>
Signed-off-by: Suren Baghdasaryan <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
The parameter entry of mas_preallocate is not used, so drop it.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Vernon Yang <[email protected]>
Cc: Liam Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Preallocations are common in the VMA code to avoid allocating under
certain locking conditions. The preallocations must also cover the
worst-case scenario. Removing the GFP_ZERO flag from the
kmem_cache_alloc() (and bulk variant) calls will reduce the amount of time
spent zeroing memory that may not be used. Only zero out the necessary
area to keep track of the allocations in the maple state. Zero the entire
node prior to using it in the tree.
This required internal changes to node counting on allocation, so the test
code is also updated.
This restores some micro-benchmark performance: up to +9% in mmtests mmap1
by my testing +10% to +20% in mmap, mmapaddr, mmapmany tests reported by
Red Hat
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2149636
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam Howlett <[email protected]>
Reported-by: Jirka Hladky <[email protected]>
Suggested-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Add the span to the year of the development.
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Joe Perches <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Along the development cycle, the testing code support for module/in-kernel
compiles was removed. Restore this functionality by moving any internal
API tests to the userspace side, as well as threading tests. Fix the
lockdep issues and add a way to reduce memory usage so the tests can
complete with KASAN + memleak detection. Make the tests work on 32 bit
hosts where possible and detect 32 bit hosts in the radix test suite.
[[email protected]: fix module export]
[[email protected]: fix it some more]
[[email protected]: fix compile warnings on 32bit build in check_find()]
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|
|
Patch series "Introducing the Maple Tree"
The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently. There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface. If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.
The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes. With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses. The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.
The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The
long term goal is to reduce or remove the mmap_lock contention.
The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers. A single write operation will be
allowed at a time. A reader re-walks if stale data is encountered. VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.
Davidlor said
: Yes I like the maple tree, and at this stage I don't think we can ask for
: more from this series wrt the MM - albeit there seems to still be some
: folks reporting breakage. Fundamentally I see Liam's work to (re)move
: complexity out of the MM (not to say that the actual maple tree is not
: complex) by consolidating the three complimentary data structures very
: much worth it considering performance does not take a hit. This was very
: much a turn off with the range locking approach, which worst case scenario
: incurred in prohibitive overhead. Also as Liam and Matthew have
: mentioned, RCU opens up a lot of nice performance opportunities, and in
: addition academia[1] has shown outstanding scalability of address spaces
: with the foundation of replacing the locked rbtree with RCU aware trees.
A similar work has been discovered in the academic press
https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf
Sheer coincidence. We designed our tree with the intention of solving the
hardest problem first. Upon settling on a b-tree variant and a rough
outline, we researched ranged based b-trees and RCU b-trees and did find
that article. So it was nice to find reassurances that we were on the
right path, but our design choice of using ranges made that paper unusable
for us.
This patch (of 70):
The maple tree is an RCU-safe range based B-tree designed to use modern
processor cache efficiently. There are a number of places in the kernel
that a non-overlapping range-based tree would be beneficial, especially
one with a simple interface. If you use an rbtree with other data
structures to improve performance or an interval tree to track
non-overlapping ranges, then this is for you.
The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
nodes. With the increased branching factor, it is significantly shorter
than the rbtree so it has fewer cache misses. The removal of the linked
list between subsequent entries also reduces the cache misses and the need
to pull in the previous and next VMA during many tree alterations.
The first user that is covered in this patch set is the vm_area_struct,
where three data structures are replaced by the maple tree: the augmented
rbtree, the vma cache, and the linked list of VMAs in the mm_struct. The
long term goal is to reduce or remove the mmap_lock contention.
The plan is to get to the point where we use the maple tree in RCU mode.
Readers will not block for writers. A single write operation will be
allowed at a time. A reader re-walks if stale data is encountered. VMAs
would be RCU enabled and this mode would be entered once multiple tasks
are using the mm_struct.
There is additional BUG_ON() calls added within the tree, most of which
are in debug code. These will be replaced with a WARN_ON() call in the
future. There is also additional BUG_ON() calls within the code which
will also be reduced in number at a later date. These exist to catch
things such as out-of-range accesses which would crash anyways.
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Liam R. Howlett <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Tested-by: David Howells <[email protected]>
Tested-by: Sven Schnelle <[email protected]>
Tested-by: Yu Zhao <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Davidlohr Bueso <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: SeongJae Park <[email protected]>
Cc: Will Deacon <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
|