diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2024-09-18 08:53:53 +0200 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2024-09-18 08:53:53 +0200 |
commit | bdf56c7580d267a123cc71ca0f2459c797b76fde (patch) | |
tree | 193461045ebd97cb4827d244bb1ec703ebffe53a /include/linux | |
parent | efdfcd40ad5e14d66f2e66e1fbbfb375c26bd3d7 (diff) | |
parent | ecc4d6af979b3bd4d239ff80bbba455c90d3f4f3 (diff) |
Merge tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab updates from Vlastimil Babka:
"This time it's mostly refactoring and improving APIs for slab users in
the kernel, along with some debugging improvements.
- kmem_cache_create() refactoring (Christian Brauner)
Over the years have been growing new parameters to
kmem_cache_create() where most of them are needed only for a small
number of caches - most recently the rcu_freeptr_offset parameter.
To avoid adding new parameters to kmem_cache_create() and adjusting
all its callers, or creating new wrappers such as
kmem_cache_create_rcu(), we can now pass extra parameters using the
new struct kmem_cache_args. Not explicitly initialized fields
default to values interpreted as unused.
kmem_cache_create() is for now a wrapper that works both with the
new form: kmem_cache_create(name, object_size, args, flags) and the
legacy form: kmem_cache_create(name, object_size, align, flags,
ctor)
- kmem_cache_destroy() waits for kfree_rcu()'s in flight (Vlastimil
Babka, Uladislau Rezki)
Since SLOB removal, kfree() is allowed for freeing objects
allocated by kmem_cache_create(). By extension kfree_rcu() as
allowed as well, which can allow converting simple call_rcu()
callbacks that only do kmem_cache_free(), as there was never a
kmem_cache_free_rcu() variant. However, for caches that can be
destroyed e.g. on module removal, the cache owners knew to issue
rcu_barrier() first to wait for the pending call_rcu()'s, and this
is not sufficient for pending kfree_rcu()'s due to its internal
batching optimizations. Ulad has provided a new
kvfree_rcu_barrier() and to make the usage less error-prone,
kmem_cache_destroy() calls it. Additionally, destroying
SLAB_TYPESAFE_BY_RCU caches now again issues rcu_barrier()
synchronously instead of using an async work, because the past
motivation for async work no longer applies. Users of custom
call_rcu() callbacks should however keep calling rcu_barrier()
before cache destruction.
- Debugging use-after-free in SLAB_TYPESAFE_BY_RCU caches (Jann Horn)
Currently, KASAN cannot catch UAFs in such caches as it is legal to
access them within a grace period, and we only track the grace
period when trying to free the underlying slab page. The new
CONFIG_SLUB_RCU_DEBUG option changes the freeing of individual
object to be RCU-delayed, after which KASAN can poison them.
- Delayed memcg charging (Shakeel Butt)
In some cases, the memcg is uknown at allocation time, such as
receiving network packets in softirq context. With
kmem_cache_charge() these may be now charged later when the user
and its memcg is known.
- Misc fixes and improvements (Pedro Falcato, Axel Rasmussen,
Christoph Lameter, Yan Zhen, Peng Fan, Xavier)"
* tag 'slab-for-6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (34 commits)
mm, slab: restore kerneldoc for kmem_cache_create()
io_uring: port to struct kmem_cache_args
slab: make __kmem_cache_create() static inline
slab: make kmem_cache_create_usercopy() static inline
slab: remove kmem_cache_create_rcu()
file: port to struct kmem_cache_args
slab: create kmem_cache_create() compatibility layer
slab: port KMEM_CACHE_USERCOPY() to struct kmem_cache_args
slab: port KMEM_CACHE() to struct kmem_cache_args
slab: remove rcu_freeptr_offset from struct kmem_cache
slab: pass struct kmem_cache_args to do_kmem_cache_create()
slab: pull kmem_cache_open() into do_kmem_cache_create()
slab: pass struct kmem_cache_args to create_cache()
slab: port kmem_cache_create_usercopy() to struct kmem_cache_args
slab: port kmem_cache_create_rcu() to struct kmem_cache_args
slab: port kmem_cache_create() to struct kmem_cache_args
slab: add struct kmem_cache_args
slab: s/__kmem_cache_create/do_kmem_cache_create/g
memcg: add charging of already allocated slab objects
mm/slab: Optimize the code logic in find_mergeable()
...
Diffstat (limited to 'include/linux')
-rw-r--r-- | include/linux/kasan.h | 63 | ||||
-rw-r--r-- | include/linux/rcutiny.h | 5 | ||||
-rw-r--r-- | include/linux/rcutree.h | 1 | ||||
-rw-r--r-- | include/linux/slab.h | 228 |
4 files changed, 272 insertions, 25 deletions
diff --git a/include/linux/kasan.h b/include/linux/kasan.h index 70d6a8f6e25d..00a3bf7c0d8f 100644 --- a/include/linux/kasan.h +++ b/include/linux/kasan.h @@ -175,13 +175,59 @@ static __always_inline void * __must_check kasan_init_slab_obj( return (void *)object; } -bool __kasan_slab_free(struct kmem_cache *s, void *object, - unsigned long ip, bool init); +bool __kasan_slab_pre_free(struct kmem_cache *s, void *object, + unsigned long ip); +/** + * kasan_slab_pre_free - Check whether freeing a slab object is safe. + * @object: Object to be freed. + * + * This function checks whether freeing the given object is safe. It may + * check for double-free and invalid-free bugs and report them. + * + * This function is intended only for use by the slab allocator. + * + * @Return true if freeing the object is unsafe; false otherwise. + */ +static __always_inline bool kasan_slab_pre_free(struct kmem_cache *s, + void *object) +{ + if (kasan_enabled()) + return __kasan_slab_pre_free(s, object, _RET_IP_); + return false; +} + +bool __kasan_slab_free(struct kmem_cache *s, void *object, bool init, + bool still_accessible); +/** + * kasan_slab_free - Poison, initialize, and quarantine a slab object. + * @object: Object to be freed. + * @init: Whether to initialize the object. + * @still_accessible: Whether the object contents are still accessible. + * + * This function informs that a slab object has been freed and is not + * supposed to be accessed anymore, except when @still_accessible is set + * (indicating that the object is in a SLAB_TYPESAFE_BY_RCU cache and an RCU + * grace period might not have passed yet). + * + * For KASAN modes that have integrated memory initialization + * (kasan_has_integrated_init() == true), this function also initializes + * the object's memory. For other modes, the @init argument is ignored. + * + * This function might also take ownership of the object to quarantine it. + * When this happens, KASAN will defer freeing the object to a later + * stage and handle it internally until then. The return value indicates + * whether KASAN took ownership of the object. + * + * This function is intended only for use by the slab allocator. + * + * @Return true if KASAN took ownership of the object; false otherwise. + */ static __always_inline bool kasan_slab_free(struct kmem_cache *s, - void *object, bool init) + void *object, bool init, + bool still_accessible) { if (kasan_enabled()) - return __kasan_slab_free(s, object, _RET_IP_, init); + return __kasan_slab_free(s, object, init, still_accessible); return false; } @@ -371,7 +417,14 @@ static inline void *kasan_init_slab_obj(struct kmem_cache *cache, { return (void *)object; } -static inline bool kasan_slab_free(struct kmem_cache *s, void *object, bool init) + +static inline bool kasan_slab_pre_free(struct kmem_cache *s, void *object) +{ + return false; +} + +static inline bool kasan_slab_free(struct kmem_cache *s, void *object, + bool init, bool still_accessible) { return false; } diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h index cf2b5a188f78..0ee270b3f5ed 100644 --- a/include/linux/rcutiny.h +++ b/include/linux/rcutiny.h @@ -111,6 +111,11 @@ static inline void __kvfree_call_rcu(struct rcu_head *head, void *ptr) kvfree(ptr); } +static inline void kvfree_rcu_barrier(void) +{ + rcu_barrier(); +} + #ifdef CONFIG_KASAN_GENERIC void kvfree_call_rcu(struct rcu_head *head, void *ptr); #else diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h index 7dbde2b6f714..90a684f94776 100644 --- a/include/linux/rcutree.h +++ b/include/linux/rcutree.h @@ -35,6 +35,7 @@ static inline void rcu_virt_note_context_switch(void) void synchronize_rcu_expedited(void); void kvfree_call_rcu(struct rcu_head *head, void *ptr); +void kvfree_rcu_barrier(void); void rcu_barrier(void); void rcu_momentary_eqs(void); diff --git a/include/linux/slab.h b/include/linux/slab.h index 5b2da2cf31a8..da3a546571e7 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -240,17 +240,173 @@ struct mem_cgroup; */ bool slab_is_available(void); -struct kmem_cache *kmem_cache_create(const char *name, unsigned int size, - unsigned int align, slab_flags_t flags, - void (*ctor)(void *)); -struct kmem_cache *kmem_cache_create_usercopy(const char *name, - unsigned int size, unsigned int align, - slab_flags_t flags, - unsigned int useroffset, unsigned int usersize, - void (*ctor)(void *)); -struct kmem_cache *kmem_cache_create_rcu(const char *name, unsigned int size, - unsigned int freeptr_offset, - slab_flags_t flags); +/** + * struct kmem_cache_args - Less common arguments for kmem_cache_create() + * + * Any uninitialized fields of the structure are interpreted as unused. The + * exception is @freeptr_offset where %0 is a valid value, so + * @use_freeptr_offset must be also set to %true in order to interpret the field + * as used. For @useroffset %0 is also valid, but only with non-%0 + * @usersize. + * + * When %NULL args is passed to kmem_cache_create(), it is equivalent to all + * fields unused. + */ +struct kmem_cache_args { + /** + * @align: The required alignment for the objects. + * + * %0 means no specific alignment is requested. + */ + unsigned int align; + /** + * @useroffset: Usercopy region offset. + * + * %0 is a valid offset, when @usersize is non-%0 + */ + unsigned int useroffset; + /** + * @usersize: Usercopy region size. + * + * %0 means no usercopy region is specified. + */ + unsigned int usersize; + /** + * @freeptr_offset: Custom offset for the free pointer + * in &SLAB_TYPESAFE_BY_RCU caches + * + * By default &SLAB_TYPESAFE_BY_RCU caches place the free pointer + * outside of the object. This might cause the object to grow in size. + * Cache creators that have a reason to avoid this can specify a custom + * free pointer offset in their struct where the free pointer will be + * placed. + * + * Note that placing the free pointer inside the object requires the + * caller to ensure that no fields are invalidated that are required to + * guard against object recycling (See &SLAB_TYPESAFE_BY_RCU for + * details). + * + * Using %0 as a value for @freeptr_offset is valid. If @freeptr_offset + * is specified, %use_freeptr_offset must be set %true. + * + * Note that @ctor currently isn't supported with custom free pointers + * as a @ctor requires an external free pointer. + */ + unsigned int freeptr_offset; + /** + * @use_freeptr_offset: Whether a @freeptr_offset is used. + */ + bool use_freeptr_offset; + /** + * @ctor: A constructor for the objects. + * + * The constructor is invoked for each object in a newly allocated slab + * page. It is the cache user's responsibility to free object in the + * same state as after calling the constructor, or deal appropriately + * with any differences between a freshly constructed and a reallocated + * object. + * + * %NULL means no constructor. + */ + void (*ctor)(void *); +}; + +struct kmem_cache *__kmem_cache_create_args(const char *name, + unsigned int object_size, + struct kmem_cache_args *args, + slab_flags_t flags); +static inline struct kmem_cache * +__kmem_cache_create(const char *name, unsigned int size, unsigned int align, + slab_flags_t flags, void (*ctor)(void *)) +{ + struct kmem_cache_args kmem_args = { + .align = align, + .ctor = ctor, + }; + + return __kmem_cache_create_args(name, size, &kmem_args, flags); +} + +/** + * kmem_cache_create_usercopy - Create a kmem cache with a region suitable + * for copying to userspace. + * @name: A string which is used in /proc/slabinfo to identify this cache. + * @size: The size of objects to be created in this cache. + * @align: The required alignment for the objects. + * @flags: SLAB flags + * @useroffset: Usercopy region offset + * @usersize: Usercopy region size + * @ctor: A constructor for the objects, or %NULL. + * + * This is a legacy wrapper, new code should use either KMEM_CACHE_USERCOPY() + * if whitelisting a single field is sufficient, or kmem_cache_create() with + * the necessary parameters passed via the args parameter (see + * &struct kmem_cache_args) + * + * Return: a pointer to the cache on success, NULL on failure. + */ +static inline struct kmem_cache * +kmem_cache_create_usercopy(const char *name, unsigned int size, + unsigned int align, slab_flags_t flags, + unsigned int useroffset, unsigned int usersize, + void (*ctor)(void *)) +{ + struct kmem_cache_args kmem_args = { + .align = align, + .ctor = ctor, + .useroffset = useroffset, + .usersize = usersize, + }; + + return __kmem_cache_create_args(name, size, &kmem_args, flags); +} + +/* If NULL is passed for @args, use this variant with default arguments. */ +static inline struct kmem_cache * +__kmem_cache_default_args(const char *name, unsigned int size, + struct kmem_cache_args *args, + slab_flags_t flags) +{ + struct kmem_cache_args kmem_default_args = {}; + + /* Make sure we don't get passed garbage. */ + if (WARN_ON_ONCE(args)) + return ERR_PTR(-EINVAL); + + return __kmem_cache_create_args(name, size, &kmem_default_args, flags); +} + +/** + * kmem_cache_create - Create a kmem cache. + * @__name: A string which is used in /proc/slabinfo to identify this cache. + * @__object_size: The size of objects to be created in this cache. + * @__args: Optional arguments, see &struct kmem_cache_args. Passing %NULL + * means defaults will be used for all the arguments. + * + * This is currently implemented as a macro using ``_Generic()`` to call + * either the new variant of the function, or a legacy one. + * + * The new variant has 4 parameters: + * ``kmem_cache_create(name, object_size, args, flags)`` + * + * See __kmem_cache_create_args() which implements this. + * + * The legacy variant has 5 parameters: + * ``kmem_cache_create(name, object_size, align, flags, ctor)`` + * + * The align and ctor parameters map to the respective fields of + * &struct kmem_cache_args + * + * Context: Cannot be called within a interrupt, but can be interrupted. + * + * Return: a pointer to the cache on success, NULL on failure. + */ +#define kmem_cache_create(__name, __object_size, __args, ...) \ + _Generic((__args), \ + struct kmem_cache_args *: __kmem_cache_create_args, \ + void *: __kmem_cache_default_args, \ + default: __kmem_cache_create)(__name, __object_size, __args, __VA_ARGS__) + void kmem_cache_destroy(struct kmem_cache *s); int kmem_cache_shrink(struct kmem_cache *s); @@ -262,20 +418,23 @@ int kmem_cache_shrink(struct kmem_cache *s); * f.e. add ____cacheline_aligned_in_smp to the struct declaration * then the objects will be properly aligned in SMP configurations. */ -#define KMEM_CACHE(__struct, __flags) \ - kmem_cache_create(#__struct, sizeof(struct __struct), \ - __alignof__(struct __struct), (__flags), NULL) +#define KMEM_CACHE(__struct, __flags) \ + __kmem_cache_create_args(#__struct, sizeof(struct __struct), \ + &(struct kmem_cache_args) { \ + .align = __alignof__(struct __struct), \ + }, (__flags)) /* * To whitelist a single field for copying to/from usercopy, use this * macro instead for KMEM_CACHE() above. */ -#define KMEM_CACHE_USERCOPY(__struct, __flags, __field) \ - kmem_cache_create_usercopy(#__struct, \ - sizeof(struct __struct), \ - __alignof__(struct __struct), (__flags), \ - offsetof(struct __struct, __field), \ - sizeof_field(struct __struct, __field), NULL) +#define KMEM_CACHE_USERCOPY(__struct, __flags, __field) \ + __kmem_cache_create_args(#__struct, sizeof(struct __struct), \ + &(struct kmem_cache_args) { \ + .align = __alignof__(struct __struct), \ + .useroffset = offsetof(struct __struct, __field), \ + .usersize = sizeof_field(struct __struct, __field), \ + }, (__flags)) /* * Common kmalloc functions provided by all allocators @@ -556,6 +715,35 @@ void *kmem_cache_alloc_lru_noprof(struct kmem_cache *s, struct list_lru *lru, gfp_t gfpflags) __assume_slab_alignment __malloc; #define kmem_cache_alloc_lru(...) alloc_hooks(kmem_cache_alloc_lru_noprof(__VA_ARGS__)) +/** + * kmem_cache_charge - memcg charge an already allocated slab memory + * @objp: address of the slab object to memcg charge + * @gfpflags: describe the allocation context + * + * kmem_cache_charge allows charging a slab object to the current memcg, + * primarily in cases where charging at allocation time might not be possible + * because the target memcg is not known (i.e. softirq context) + * + * The objp should be pointer returned by the slab allocator functions like + * kmalloc (with __GFP_ACCOUNT in flags) or kmem_cache_alloc. The memcg charge + * behavior can be controlled through gfpflags parameter, which affects how the + * necessary internal metadata can be allocated. Including __GFP_NOFAIL denotes + * that overcharging is requested instead of failure, but is not applied for the + * internal metadata allocation. + * + * There are several cases where it will return true even if the charging was + * not done: + * More specifically: + * + * 1. For !CONFIG_MEMCG or cgroup_disable=memory systems. + * 2. Already charged slab objects. + * 3. For slab objects from KMALLOC_NORMAL caches - allocated by kmalloc() + * without __GFP_ACCOUNT + * 4. Allocating internal metadata has failed + * + * Return: true if charge was successful otherwise false. + */ +bool kmem_cache_charge(void *objp, gfp_t gfpflags); void kmem_cache_free(struct kmem_cache *s, void *objp); kmem_buckets *kmem_buckets_create(const char *name, slab_flags_t flags, |