diff options
author | Alexei Starovoitov <[email protected]> | 2023-03-07 09:33:43 -0800 |
---|---|---|
committer | Alexei Starovoitov <[email protected]> | 2023-03-07 09:33:56 -0800 |
commit | a73dc912aa7e43f4f12003a26aeab839b500b86d (patch) | |
tree | 33e31263163cc0d003c2942de91b147606a60e38 | |
parent | 2d5bcdcda8799cf21f9ab84598c946dd320207a2 (diff) | |
parent | 6b4a6ea2c62d34272d64161d43a19c02355576e2 (diff) |
Merge branch 'bpf: bpf memory usage'
Yafang Shao says:
====================
Currently we can't get bpf memory usage reliably either from memcg or
from bpftool.
In memcg, there's not a 'bpf' item in memory.stat, but only 'kernel',
'sock', 'vmalloc' and 'percpu' which may related to bpf memory. With
these items we still can't get the bpf memory usage, because bpf memory
usage may far less than the kmem in a memcg, for example, the dentry may
consume lots of kmem.
bpftool now shows the bpf memory footprint, which is difference with bpf
memory usage. The difference can be quite great in some cases, for example,
- non-preallocated bpf map
The non-preallocated bpf map memory usage is dynamically changed. The
allocated elements count can be from 0 to the max entries. But the
memory footprint in bpftool only shows a fixed number.
- bpf metadata consumes more memory than bpf element
In some corner cases, the bpf metadata can consumes a lot more memory
than bpf element consumes. For example, it can happen when the element
size is quite small.
- some maps don't have key, value or max_entries
For example the key_size and value_size of ringbuf is 0, so its
memlock is always 0.
We need a way to show the bpf memory usage especially there will be more
and more bpf programs running on the production environment and thus the
bpf memory usage is not trivial.
This patchset introduces a new map ops ->map_mem_usage to calculate the
memory usage. Note that we don't intend to make the memory usage 100%
accurate, while our goal is to make sure there is only a small difference
between what bpftool reports and the real memory. That small difference
can be ignored compared to the total usage. That is enough to monitor
the bpf memory usage. For example, the user can rely on this value to
monitor the trend of bpf memory usage, compare the difference in bpf
memory usage between different bpf program versions, figure out which
maps consume large memory, and etc.
This patchset implements the bpf memory usage for all maps, and yet there's
still work to do. We don't want to introduce runtime overhead in the
element update and delete path, but we have to do it for some
non-preallocated maps,
- devmap, xskmap
When we update or delete an element, it will allocate or free memory.
In order to track this dynamic memory, we have to track the count in
element update and delete path.
- cpumap
The element size of each cpumap element is not determinated. If we
want to track the usage, we have to count the size of all elements in
the element update and delete path. So I just put it aside currently.
- local_storage, bpf_local_storage
When we attach or detach a cgroup, it will allocate or free memory. If
we want to track the dynamic memory, we also need to do something in
the update and delete path. So I just put it aside currently.
- offload map
The element update and delete of offload map is via the netdev dev_ops,
in which it may dynamically allocate or free memory, but this dynamic
memory isn't counted in offload map memory usage currently.
The result of each map can be found in the individual patch.
We may also need to track per-container bpf memory usage, that will be
addressed by a different patchset.
Changes:
v3->v4: code improvement on ringbuf (Andrii)
use READ_ONCE() to read lpm_trie (Tao)
explain why we can't get bpf memory usage from memcg.
v2->v3: check callback at map creation time and avoid warning (Alexei)
fix build error under CONFIG_BPF=n ([email protected])
v1->v2: calculate the memory usage within bpf (Alexei)
- [v1] bpf, mm: bpf memory usage
https://lwn.net/Articles/921991/
- [RFC PATCH v2] mm, bpf: Add BPF into /proc/meminfo
https://lwn.net/Articles/919848/
- [RFC PATCH v1] mm, bpf: Add BPF into /proc/meminfo
https://lwn.net/Articles/917647/
- [RFC PATCH] bpf, mm: Add a new item bpf into memory.stat
https://lore.kernel.org/bpf/20220921170002.29557-1-laoar.shao@gmail].com/
====================
Signed-off-by: Alexei Starovoitov <[email protected]>
-rw-r--r-- | include/linux/bpf.h | 8 | ||||
-rw-r--r-- | include/linux/bpf_local_storage.h | 1 | ||||
-rw-r--r-- | include/net/xdp_sock.h | 1 | ||||
-rw-r--r-- | kernel/bpf/arraymap.c | 28 | ||||
-rw-r--r-- | kernel/bpf/bloom_filter.c | 12 | ||||
-rw-r--r-- | kernel/bpf/bpf_cgrp_storage.c | 1 | ||||
-rw-r--r-- | kernel/bpf/bpf_inode_storage.c | 1 | ||||
-rw-r--r-- | kernel/bpf/bpf_local_storage.c | 10 | ||||
-rw-r--r-- | kernel/bpf/bpf_struct_ops.c | 16 | ||||
-rw-r--r-- | kernel/bpf/bpf_task_storage.c | 1 | ||||
-rw-r--r-- | kernel/bpf/cpumap.c | 10 | ||||
-rw-r--r-- | kernel/bpf/devmap.c | 26 | ||||
-rw-r--r-- | kernel/bpf/hashtab.c | 43 | ||||
-rw-r--r-- | kernel/bpf/local_storage.c | 7 | ||||
-rw-r--r-- | kernel/bpf/lpm_trie.c | 11 | ||||
-rw-r--r-- | kernel/bpf/offload.c | 6 | ||||
-rw-r--r-- | kernel/bpf/queue_stack_maps.c | 10 | ||||
-rw-r--r-- | kernel/bpf/reuseport_array.c | 8 | ||||
-rw-r--r-- | kernel/bpf/ringbuf.c | 20 | ||||
-rw-r--r-- | kernel/bpf/stackmap.c | 14 | ||||
-rw-r--r-- | kernel/bpf/syscall.c | 20 | ||||
-rw-r--r-- | net/core/bpf_sk_storage.c | 1 | ||||
-rw-r--r-- | net/core/sock_map.c | 20 | ||||
-rw-r--r-- | net/xdp/xskmap.c | 13 |
24 files changed, 273 insertions, 15 deletions
diff --git a/include/linux/bpf.h b/include/linux/bpf.h index d3456804f7aa..6792a7940e1e 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -161,6 +161,8 @@ struct bpf_map_ops { bpf_callback_t callback_fn, void *callback_ctx, u64 flags); + u64 (*map_mem_usage)(const struct bpf_map *map); + /* BTF id of struct allocated by map_alloc */ int *map_btf_id; @@ -2622,6 +2624,7 @@ static inline bool bpf_map_is_offloaded(struct bpf_map *map) struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr); void bpf_map_offload_map_free(struct bpf_map *map); +u64 bpf_map_offload_map_mem_usage(const struct bpf_map *map); int bpf_prog_test_run_syscall(struct bpf_prog *prog, const union bpf_attr *kattr, union bpf_attr __user *uattr); @@ -2693,6 +2696,11 @@ static inline void bpf_map_offload_map_free(struct bpf_map *map) { } +static inline u64 bpf_map_offload_map_mem_usage(const struct bpf_map *map) +{ + return 0; +} + static inline int bpf_prog_test_run_syscall(struct bpf_prog *prog, const union bpf_attr *kattr, union bpf_attr __user *uattr) diff --git a/include/linux/bpf_local_storage.h b/include/linux/bpf_local_storage.h index 6d37a40cd90e..d934248b8e81 100644 --- a/include/linux/bpf_local_storage.h +++ b/include/linux/bpf_local_storage.h @@ -164,5 +164,6 @@ bpf_local_storage_update(void *owner, struct bpf_local_storage_map *smap, void *value, u64 map_flags, gfp_t gfp_flags); void bpf_local_storage_free_rcu(struct rcu_head *rcu); +u64 bpf_local_storage_map_mem_usage(const struct bpf_map *map); #endif /* _BPF_LOCAL_STORAGE_H */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 3057e1a4a11c..e96a1151ec75 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -38,6 +38,7 @@ struct xdp_umem { struct xsk_map { struct bpf_map map; spinlock_t lock; /* Synchronize map updates */ + atomic_t count; struct xdp_sock __rcu *xsk_map[]; }; diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index 484706959556..1588c793a715 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -721,6 +721,28 @@ static int bpf_for_each_array_elem(struct bpf_map *map, bpf_callback_t callback_ return num_elems; } +static u64 array_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_array *array = container_of(map, struct bpf_array, map); + bool percpu = map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY; + u32 elem_size = array->elem_size; + u64 entries = map->max_entries; + u64 usage = sizeof(*array); + + if (percpu) { + usage += entries * sizeof(void *); + usage += entries * elem_size * num_possible_cpus(); + } else { + if (map->map_flags & BPF_F_MMAPABLE) { + usage = PAGE_ALIGN(usage); + usage += PAGE_ALIGN(entries * elem_size); + } else { + usage += entries * elem_size; + } + } + return usage; +} + BTF_ID_LIST_SINGLE(array_map_btf_ids, struct, bpf_array) const struct bpf_map_ops array_map_ops = { .map_meta_equal = array_map_meta_equal, @@ -742,6 +764,7 @@ const struct bpf_map_ops array_map_ops = { .map_update_batch = generic_map_update_batch, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_array_elem, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], .iter_seq_info = &iter_seq_info, }; @@ -762,6 +785,7 @@ const struct bpf_map_ops percpu_array_map_ops = { .map_update_batch = generic_map_update_batch, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_array_elem, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], .iter_seq_info = &iter_seq_info, }; @@ -1156,6 +1180,7 @@ const struct bpf_map_ops prog_array_map_ops = { .map_fd_sys_lookup_elem = prog_fd_array_sys_lookup_elem, .map_release_uref = prog_array_map_clear, .map_seq_show_elem = prog_array_map_seq_show_elem, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], }; @@ -1257,6 +1282,7 @@ const struct bpf_map_ops perf_event_array_map_ops = { .map_fd_put_ptr = perf_event_fd_array_put_ptr, .map_release = perf_event_fd_array_release, .map_check_btf = map_check_no_btf, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], }; @@ -1291,6 +1317,7 @@ const struct bpf_map_ops cgroup_array_map_ops = { .map_fd_get_ptr = cgroup_fd_array_get_ptr, .map_fd_put_ptr = cgroup_fd_array_put_ptr, .map_check_btf = map_check_no_btf, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], }; #endif @@ -1379,5 +1406,6 @@ const struct bpf_map_ops array_of_maps_map_ops = { .map_lookup_batch = generic_map_lookup_batch, .map_update_batch = generic_map_update_batch, .map_check_btf = map_check_no_btf, + .map_mem_usage = array_map_mem_usage, .map_btf_id = &array_map_btf_ids[0], }; diff --git a/kernel/bpf/bloom_filter.c b/kernel/bpf/bloom_filter.c index 48ee750849f2..6350c5d35a9b 100644 --- a/kernel/bpf/bloom_filter.c +++ b/kernel/bpf/bloom_filter.c @@ -193,6 +193,17 @@ static int bloom_map_check_btf(const struct bpf_map *map, return btf_type_is_void(key_type) ? 0 : -EINVAL; } +static u64 bloom_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_bloom_filter *bloom; + u64 bitset_bytes; + + bloom = container_of(map, struct bpf_bloom_filter, map); + bitset_bytes = BITS_TO_BYTES((u64)bloom->bitset_mask + 1); + bitset_bytes = roundup(bitset_bytes, sizeof(unsigned long)); + return sizeof(*bloom) + bitset_bytes; +} + BTF_ID_LIST_SINGLE(bpf_bloom_map_btf_ids, struct, bpf_bloom_filter) const struct bpf_map_ops bloom_filter_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -206,5 +217,6 @@ const struct bpf_map_ops bloom_filter_map_ops = { .map_update_elem = bloom_map_update_elem, .map_delete_elem = bloom_map_delete_elem, .map_check_btf = bloom_map_check_btf, + .map_mem_usage = bloom_map_mem_usage, .map_btf_id = &bpf_bloom_map_btf_ids[0], }; diff --git a/kernel/bpf/bpf_cgrp_storage.c b/kernel/bpf/bpf_cgrp_storage.c index 6cdf6d9ed91d..9ae07aedaf23 100644 --- a/kernel/bpf/bpf_cgrp_storage.c +++ b/kernel/bpf/bpf_cgrp_storage.c @@ -221,6 +221,7 @@ const struct bpf_map_ops cgrp_storage_map_ops = { .map_update_elem = bpf_cgrp_storage_update_elem, .map_delete_elem = bpf_cgrp_storage_delete_elem, .map_check_btf = bpf_local_storage_map_check_btf, + .map_mem_usage = bpf_local_storage_map_mem_usage, .map_btf_id = &bpf_local_storage_map_btf_id[0], .map_owner_storage_ptr = cgroup_storage_ptr, }; diff --git a/kernel/bpf/bpf_inode_storage.c b/kernel/bpf/bpf_inode_storage.c index 05f4c66c9089..43e2619c8167 100644 --- a/kernel/bpf/bpf_inode_storage.c +++ b/kernel/bpf/bpf_inode_storage.c @@ -223,6 +223,7 @@ const struct bpf_map_ops inode_storage_map_ops = { .map_update_elem = bpf_fd_inode_storage_update_elem, .map_delete_elem = bpf_fd_inode_storage_delete_elem, .map_check_btf = bpf_local_storage_map_check_btf, + .map_mem_usage = bpf_local_storage_map_mem_usage, .map_btf_id = &bpf_local_storage_map_btf_id[0], .map_owner_storage_ptr = inode_storage_ptr, }; diff --git a/kernel/bpf/bpf_local_storage.c b/kernel/bpf/bpf_local_storage.c index 3d320393a12c..d3ba3f2db640 100644 --- a/kernel/bpf/bpf_local_storage.c +++ b/kernel/bpf/bpf_local_storage.c @@ -685,6 +685,16 @@ bool bpf_local_storage_unlink_nolock(struct bpf_local_storage *local_storage) return free_storage; } +u64 bpf_local_storage_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_local_storage_map *smap = (struct bpf_local_storage_map *)map; + u64 usage = sizeof(*smap); + + /* The dynamically callocated selems are not counted currently. */ + usage += sizeof(*smap->buckets) * (1ULL << smap->bucket_log); + return usage; +} + struct bpf_map * bpf_local_storage_map_alloc(union bpf_attr *attr, struct bpf_local_storage_cache *cache) diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c index ece9870cab68..38903fb52f98 100644 --- a/kernel/bpf/bpf_struct_ops.c +++ b/kernel/bpf/bpf_struct_ops.c @@ -641,6 +641,21 @@ static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr) return map; } +static u64 bpf_struct_ops_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map; + const struct bpf_struct_ops *st_ops = st_map->st_ops; + const struct btf_type *vt = st_ops->value_type; + u64 usage; + + usage = sizeof(*st_map) + + vt->size - sizeof(struct bpf_struct_ops_value); + usage += vt->size; + usage += btf_type_vlen(vt) * sizeof(struct bpf_links *); + usage += PAGE_SIZE; + return usage; +} + BTF_ID_LIST_SINGLE(bpf_struct_ops_map_btf_ids, struct, bpf_struct_ops_map) const struct bpf_map_ops bpf_struct_ops_map_ops = { .map_alloc_check = bpf_struct_ops_map_alloc_check, @@ -651,6 +666,7 @@ const struct bpf_map_ops bpf_struct_ops_map_ops = { .map_delete_elem = bpf_struct_ops_map_delete_elem, .map_update_elem = bpf_struct_ops_map_update_elem, .map_seq_show_elem = bpf_struct_ops_map_seq_show_elem, + .map_mem_usage = bpf_struct_ops_map_mem_usage, .map_btf_id = &bpf_struct_ops_map_btf_ids[0], }; diff --git a/kernel/bpf/bpf_task_storage.c b/kernel/bpf/bpf_task_storage.c index 1e486055a523..20f942229f3c 100644 --- a/kernel/bpf/bpf_task_storage.c +++ b/kernel/bpf/bpf_task_storage.c @@ -335,6 +335,7 @@ const struct bpf_map_ops task_storage_map_ops = { .map_update_elem = bpf_pid_task_storage_update_elem, .map_delete_elem = bpf_pid_task_storage_delete_elem, .map_check_btf = bpf_local_storage_map_check_btf, + .map_mem_usage = bpf_local_storage_map_mem_usage, .map_btf_id = &bpf_local_storage_map_btf_id[0], .map_owner_storage_ptr = task_storage_ptr, }; diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index d2110c1f6fa6..871809e71b4e 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -673,6 +673,15 @@ static int cpu_map_redirect(struct bpf_map *map, u64 index, u64 flags) __cpu_map_lookup_elem); } +static u64 cpu_map_mem_usage(const struct bpf_map *map) +{ + u64 usage = sizeof(struct bpf_cpu_map); + + /* Currently the dynamically allocated elements are not counted */ + usage += (u64)map->max_entries * sizeof(struct bpf_cpu_map_entry *); + return usage; +} + BTF_ID_LIST_SINGLE(cpu_map_btf_ids, struct, bpf_cpu_map) const struct bpf_map_ops cpu_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -683,6 +692,7 @@ const struct bpf_map_ops cpu_map_ops = { .map_lookup_elem = cpu_map_lookup_elem, .map_get_next_key = cpu_map_get_next_key, .map_check_btf = map_check_no_btf, + .map_mem_usage = cpu_map_mem_usage, .map_btf_id = &cpu_map_btf_ids[0], .map_redirect = cpu_map_redirect, }; diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 2675fefc6cb6..19b036a228f7 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -819,8 +819,10 @@ static int dev_map_delete_elem(struct bpf_map *map, void *key) return -EINVAL; old_dev = unrcu_pointer(xchg(&dtab->netdev_map[k], NULL)); - if (old_dev) + if (old_dev) { call_rcu(&old_dev->rcu, __dev_map_entry_free); + atomic_dec((atomic_t *)&dtab->items); + } return 0; } @@ -931,6 +933,8 @@ static int __dev_map_update_elem(struct net *net, struct bpf_map *map, old_dev = unrcu_pointer(xchg(&dtab->netdev_map[i], RCU_INITIALIZER(dev))); if (old_dev) call_rcu(&old_dev->rcu, __dev_map_entry_free); + else + atomic_inc((atomic_t *)&dtab->items); return 0; } @@ -1016,6 +1020,20 @@ static int dev_hash_map_redirect(struct bpf_map *map, u64 ifindex, u64 flags) __dev_map_hash_lookup_elem); } +static u64 dev_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + u64 usage = sizeof(struct bpf_dtab); + + if (map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) + usage += (u64)dtab->n_buckets * sizeof(struct hlist_head); + else + usage += (u64)map->max_entries * sizeof(struct bpf_dtab_netdev *); + usage += atomic_read((atomic_t *)&dtab->items) * + (u64)sizeof(struct bpf_dtab_netdev); + return usage; +} + BTF_ID_LIST_SINGLE(dev_map_btf_ids, struct, bpf_dtab) const struct bpf_map_ops dev_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -1026,6 +1044,7 @@ const struct bpf_map_ops dev_map_ops = { .map_update_elem = dev_map_update_elem, .map_delete_elem = dev_map_delete_elem, .map_check_btf = map_check_no_btf, + .map_mem_usage = dev_map_mem_usage, .map_btf_id = &dev_map_btf_ids[0], .map_redirect = dev_map_redirect, }; @@ -1039,6 +1058,7 @@ const struct bpf_map_ops dev_map_hash_ops = { .map_update_elem = dev_map_hash_update_elem, .map_delete_elem = dev_map_hash_delete_elem, .map_check_btf = map_check_no_btf, + .map_mem_usage = dev_map_mem_usage, .map_btf_id = &dev_map_btf_ids[0], .map_redirect = dev_hash_map_redirect, }; @@ -1109,9 +1129,11 @@ static int dev_map_notification(struct notifier_block *notifier, if (!dev || netdev != dev->dev) continue; odev = unrcu_pointer(cmpxchg(&dtab->netdev_map[i], RCU_INITIALIZER(dev), NULL)); - if (dev == odev) + if (dev == odev) { call_rcu(&dev->rcu, __dev_map_entry_free); + atomic_dec((atomic_t *)&dtab->items); + } } } rcu_read_unlock(); diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c index 653aeb481c79..0df4b0c10f59 100644 --- a/kernel/bpf/hashtab.c +++ b/kernel/bpf/hashtab.c @@ -2190,6 +2190,44 @@ out: return num_elems; } +static u64 htab_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_htab *htab = container_of(map, struct bpf_htab, map); + u32 value_size = round_up(htab->map.value_size, 8); + bool prealloc = htab_is_prealloc(htab); + bool percpu = htab_is_percpu(htab); + bool lru = htab_is_lru(htab); + u64 num_entries; + u64 usage = sizeof(struct bpf_htab); + + usage += sizeof(struct bucket) * htab->n_buckets; + usage += sizeof(int) * num_possible_cpus() * HASHTAB_MAP_LOCK_COUNT; + if (prealloc) { + num_entries = map->max_entries; + if (htab_has_extra_elems(htab)) + num_entries += num_possible_cpus(); + + usage += htab->elem_size * num_entries; + + if (percpu) + usage += value_size * num_possible_cpus() * num_entries; + else if (!lru) + usage += sizeof(struct htab_elem *) * num_possible_cpus(); + } else { +#define LLIST_NODE_SZ sizeof(struct llist_node) + + num_entries = htab->use_percpu_counter ? + percpu_counter_sum(&htab->pcount) : + atomic_read(&htab->count); + usage += (htab->elem_size + LLIST_NODE_SZ) * num_entries; + if (percpu) { + usage += (LLIST_NODE_SZ + sizeof(void *)) * num_entries; + usage += value_size * num_possible_cpus() * num_entries; + } + } + return usage; +} + BTF_ID_LIST_SINGLE(htab_map_btf_ids, struct, bpf_htab) const struct bpf_map_ops htab_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -2206,6 +2244,7 @@ const struct bpf_map_ops htab_map_ops = { .map_seq_show_elem = htab_map_seq_show_elem, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_hash_elem, + .map_mem_usage = htab_map_mem_usage, BATCH_OPS(htab), .map_btf_id = &htab_map_btf_ids[0], .iter_seq_info = &iter_seq_info, @@ -2227,6 +2266,7 @@ const struct bpf_map_ops htab_lru_map_ops = { .map_seq_show_elem = htab_map_seq_show_elem, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_hash_elem, + .map_mem_usage = htab_map_mem_usage, BATCH_OPS(htab_lru), .map_btf_id = &htab_map_btf_ids[0], .iter_seq_info = &iter_seq_info, @@ -2378,6 +2418,7 @@ const struct bpf_map_ops htab_percpu_map_ops = { .map_seq_show_elem = htab_percpu_map_seq_show_elem, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_hash_elem, + .map_mem_usage = htab_map_mem_usage, BATCH_OPS(htab_percpu), .map_btf_id = &htab_map_btf_ids[0], .iter_seq_info = &iter_seq_info, @@ -2397,6 +2438,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = { .map_seq_show_elem = htab_percpu_map_seq_show_elem, .map_set_for_each_callback_args = map_set_for_each_callback_args, .map_for_each_callback = bpf_for_each_hash_elem, + .map_mem_usage = htab_map_mem_usage, BATCH_OPS(htab_lru_percpu), .map_btf_id = &htab_map_btf_ids[0], .iter_seq_info = &iter_seq_info, @@ -2534,6 +2576,7 @@ const struct bpf_map_ops htab_of_maps_map_ops = { .map_fd_sys_lookup_elem = bpf_map_fd_sys_lookup_elem, .map_gen_lookup = htab_of_map_gen_lookup, .map_check_btf = map_check_no_btf, + .map_mem_usage = htab_map_mem_usage, BATCH_OPS(htab), .map_btf_id = &htab_map_btf_ids[0], }; diff --git a/kernel/bpf/local_storage.c b/kernel/bpf/local_storage.c index e90d9f63edc5..a993560f200a 100644 --- a/kernel/bpf/local_storage.c +++ b/kernel/bpf/local_storage.c @@ -446,6 +446,12 @@ static void cgroup_storage_seq_show_elem(struct bpf_map *map, void *key, rcu_read_unlock(); } +static u64 cgroup_storage_map_usage(const struct bpf_map *map) +{ + /* Currently the dynamically allocated elements are not counted. */ + return sizeof(struct bpf_cgroup_storage_map); +} + BTF_ID_LIST_SINGLE(cgroup_storage_map_btf_ids, struct, bpf_cgroup_storage_map) const struct bpf_map_ops cgroup_storage_map_ops = { @@ -457,6 +463,7 @@ const struct bpf_map_ops cgroup_storage_map_ops = { .map_delete_elem = cgroup_storage_delete_elem, .map_check_btf = cgroup_storage_check_btf, .map_seq_show_elem = cgroup_storage_seq_show_elem, + .map_mem_usage = cgroup_storage_map_usage, .map_btf_id = &cgroup_storage_map_btf_ids[0], }; diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c index d833496e9e42..dc23f2ac9cde 100644 --- a/kernel/bpf/lpm_trie.c +++ b/kernel/bpf/lpm_trie.c @@ -720,6 +720,16 @@ static int trie_check_btf(const struct bpf_map *map, -EINVAL : 0; } +static u64 trie_mem_usage(const struct bpf_map *map) +{ + struct lpm_trie *trie = container_of(map, struct lpm_trie, map); + u64 elem_size; + + elem_size = sizeof(struct lpm_trie_node) + trie->data_size + + trie->map.value_size; + return elem_size * READ_ONCE(trie->n_entries); +} + BTF_ID_LIST_SINGLE(trie_map_btf_ids, struct, lpm_trie) const struct bpf_map_ops trie_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -733,5 +743,6 @@ const struct bpf_map_ops trie_map_ops = { .map_update_batch = generic_map_update_batch, .map_delete_batch = generic_map_delete_batch, .map_check_btf = trie_check_btf, + .map_mem_usage = trie_mem_usage, .map_btf_id = &trie_map_btf_ids[0], }; diff --git a/kernel/bpf/offload.c b/kernel/bpf/offload.c index 0c85e06f7ea7..d9c9f45e3529 100644 --- a/kernel/bpf/offload.c +++ b/kernel/bpf/offload.c @@ -563,6 +563,12 @@ void bpf_map_offload_map_free(struct bpf_map *map) bpf_map_area_free(offmap); } +u64 bpf_map_offload_map_mem_usage(const struct bpf_map *map) +{ + /* The memory dynamically allocated in netdev dev_ops is not counted */ + return sizeof(struct bpf_offloaded_map); +} + int bpf_map_offload_lookup_elem(struct bpf_map *map, void *key, void *value) { struct bpf_offloaded_map *offmap = map_to_offmap(map); diff --git a/kernel/bpf/queue_stack_maps.c b/kernel/bpf/queue_stack_maps.c index 8a5e060de63b..63ecbbcb349d 100644 --- a/kernel/bpf/queue_stack_maps.c +++ b/kernel/bpf/queue_stack_maps.c @@ -246,6 +246,14 @@ static int queue_stack_map_get_next_key(struct bpf_map *map, void *key, return -EINVAL; } +static u64 queue_stack_map_mem_usage(const struct bpf_map *map) +{ + u64 usage = sizeof(struct bpf_queue_stack); + + usage += ((u64)map->max_entries + 1) * map->value_size; + return usage; +} + BTF_ID_LIST_SINGLE(queue_map_btf_ids, struct, bpf_queue_stack) const struct bpf_map_ops queue_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -259,6 +267,7 @@ const struct bpf_map_ops queue_map_ops = { .map_pop_elem = queue_map_pop_elem, .map_peek_elem = queue_map_peek_elem, .map_get_next_key = queue_stack_map_get_next_key, + .map_mem_usage = queue_stack_map_mem_usage, .map_btf_id = &queue_map_btf_ids[0], }; @@ -274,5 +283,6 @@ const struct bpf_map_ops stack_map_ops = { .map_pop_elem = stack_map_pop_elem, .map_peek_elem = stack_map_peek_elem, .map_get_next_key = queue_stack_map_get_next_key, + .map_mem_usage = queue_stack_map_mem_usage, .map_btf_id = &queue_map_btf_ids[0], }; diff --git a/kernel/bpf/reuseport_array.c b/kernel/bpf/reuseport_array.c index 82c61612f382..71cb72f5b733 100644 --- a/kernel/bpf/reuseport_array.c +++ b/kernel/bpf/reuseport_array.c @@ -335,6 +335,13 @@ static int reuseport_array_get_next_key(struct bpf_map *map, void *key, return 0; } +static u64 reuseport_array_mem_usage(const struct bpf_map *map) +{ + struct reuseport_array *array; + + return struct_size(array, ptrs, map->max_entries); +} + BTF_ID_LIST_SINGLE(reuseport_array_map_btf_ids, struct, reuseport_array) const struct bpf_map_ops reuseport_array_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -344,5 +351,6 @@ const struct bpf_map_ops reuseport_array_ops = { .map_lookup_elem = reuseport_array_lookup_elem, .map_get_next_key = reuseport_array_get_next_key, .map_delete_elem = reuseport_array_delete_elem, + .map_mem_usage = reuseport_array_mem_usage, .map_btf_id = &reuseport_array_map_btf_ids[0], }; diff --git a/kernel/bpf/ringbuf.c b/kernel/bpf/ringbuf.c index 8732e0aadf36..0d2a45ff83f1 100644 --- a/kernel/bpf/ringbuf.c +++ b/kernel/bpf/ringbuf.c @@ -19,6 +19,7 @@ (offsetof(struct bpf_ringbuf, consumer_pos) >> PAGE_SHIFT) /* consumer page and producer page */ #define RINGBUF_POS_PAGES 2 +#define RINGBUF_NR_META_PAGES (RINGBUF_PGOFF + RINGBUF_POS_PAGES) #define RINGBUF_MAX_RECORD_SZ (UINT_MAX/4) @@ -96,7 +97,7 @@ static struct bpf_ringbuf *bpf_ringbuf_area_alloc(size_t data_sz, int numa_node) { const gfp_t flags = GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL | __GFP_NOWARN | __GFP_ZERO; - int nr_meta_pages = RINGBUF_PGOFF + RINGBUF_POS_PAGES; + int nr_meta_pages = RINGBUF_NR_META_PAGES; int nr_data_pages = data_sz >> PAGE_SHIFT; int nr_pages = nr_meta_pages + nr_data_pages; struct page **pages, *page; @@ -336,6 +337,21 @@ static __poll_t ringbuf_map_poll_user(struct bpf_map *map, struct file *filp, return 0; } +static u64 ringbuf_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_ringbuf *rb; + int nr_data_pages; + int nr_meta_pages; + u64 usage = sizeof(struct bpf_ringbuf_map); + + rb = container_of(map, struct bpf_ringbuf_map, map)->rb; + usage += (u64)rb->nr_pages << PAGE_SHIFT; + nr_meta_pages = RINGBUF_NR_META_PAGES; + nr_data_pages = map->max_entries >> PAGE_SHIFT; + usage += (nr_meta_pages + 2 * nr_data_pages) * sizeof(struct page *); + return usage; +} + BTF_ID_LIST_SINGLE(ringbuf_map_btf_ids, struct, bpf_ringbuf_map) const struct bpf_map_ops ringbuf_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -347,6 +363,7 @@ const struct bpf_map_ops ringbuf_map_ops = { .map_update_elem = ringbuf_map_update_elem, .map_delete_elem = ringbuf_map_delete_elem, .map_get_next_key = ringbuf_map_get_next_key, + .map_mem_usage = ringbuf_map_mem_usage, .map_btf_id = &ringbuf_map_btf_ids[0], }; @@ -361,6 +378,7 @@ const struct bpf_map_ops user_ringbuf_map_ops = { .map_update_elem = ringbuf_map_update_elem, .map_delete_elem = ringbuf_map_delete_elem, .map_get_next_key = ringbuf_map_get_next_key, + .map_mem_usage = ringbuf_map_mem_usage, .map_btf_id = &user_ringbuf_map_btf_ids[0], }; diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c index aecea7451b61..0f1d8dced933 100644 --- a/kernel/bpf/stackmap.c +++ b/kernel/bpf/stackmap.c @@ -654,6 +654,19 @@ static void stack_map_free(struct bpf_map *map) put_callchain_buffers(); } +static u64 stack_map_mem_usage(const struct bpf_map *map) +{ + struct bpf_stack_map *smap = container_of(map, struct bpf_stack_map, map); + u64 value_size = map->value_size; + u64 n_buckets = smap->n_buckets; + u64 enties = map->max_entries; + u64 usage = sizeof(*smap); + + usage += n_buckets * sizeof(struct stack_map_bucket *); + usage += enties * (sizeof(struct stack_map_bucket) + value_size); + return usage; +} + BTF_ID_LIST_SINGLE(stack_trace_map_btf_ids, struct, bpf_stack_map) const struct bpf_map_ops stack_trace_map_ops = { .map_meta_equal = bpf_map_meta_equal, @@ -664,5 +677,6 @@ const struct bpf_map_ops stack_trace_map_ops = { .map_update_elem = stack_map_update_elem, .map_delete_elem = stack_map_delete_elem, .map_check_btf = map_check_no_btf, + .map_mem_usage = stack_map_mem_usage, .map_btf_id = &stack_trace_map_btf_ids[0], }; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index ede5f987484f..f406dfa13792 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -105,6 +105,7 @@ const struct bpf_map_ops bpf_map_offload_ops = { .map_alloc = bpf_map_offload_map_alloc, .map_free = bpf_map_offload_map_free, .map_check_btf = map_check_no_btf, + .map_mem_usage = bpf_map_offload_map_mem_usage, }; static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) @@ -128,6 +129,8 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) } if (attr->map_ifindex) ops = &bpf_map_offload_ops; + if (!ops->map_mem_usage) + return ERR_PTR(-EINVAL); map = ops->map_alloc(attr); if (IS_ERR(map)) return map; @@ -771,17 +774,10 @@ static fmode_t map_get_sys_perms(struct bpf_map *map, struct fd f) } #ifdef CONFIG_PROC_FS -/* Provides an approximation of the map's memory footprint. - * Used only to provide a backward compatibility and display - * a reasonable "memlock" info. - */ -static unsigned long bpf_map_memory_footprint(const struct bpf_map *map) +/* Show the memory usage of a bpf map */ +static u64 bpf_map_memory_usage(const struct bpf_map *map) { - unsigned long size; - - size = round_up(map->key_size + bpf_map_value_size(map), 8); - - return round_up(map->max_entries * size, PAGE_SIZE); + return map->ops->map_mem_usage(map); } static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) @@ -803,7 +799,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) "max_entries:\t%u\n" "map_flags:\t%#x\n" "map_extra:\t%#llx\n" - "memlock:\t%lu\n" + "memlock:\t%llu\n" "map_id:\t%u\n" "frozen:\t%u\n", map->map_type, @@ -812,7 +808,7 @@ static void bpf_map_show_fdinfo(struct seq_file *m, struct file *filp) map->max_entries, map->map_flags, (unsigned long long)map->map_extra, - bpf_map_memory_footprint(map), + bpf_map_memory_usage(map), map->id, READ_ONCE(map->frozen)); if (type) { diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c index bb378c33f542..7a36353dbc22 100644 --- a/net/core/bpf_sk_storage.c +++ b/net/core/bpf_sk_storage.c @@ -324,6 +324,7 @@ const struct bpf_map_ops sk_storage_map_ops = { .map_local_storage_charge = bpf_sk_storage_charge, .map_local_storage_uncharge = bpf_sk_storage_uncharge, .map_owner_storage_ptr = bpf_sk_storage_ptr, + .map_mem_usage = bpf_local_storage_map_mem_usage, }; const struct bpf_func_proto bpf_sk_storage_get_proto = { diff --git a/net/core/sock_map.c b/net/core/sock_map.c index a68a7290a3b2..9b854e236d23 100644 --- a/net/core/sock_map.c +++ b/net/core/sock_map.c @@ -797,6 +797,14 @@ static void sock_map_fini_seq_private(void *priv_data) bpf_map_put_with_uref(info->map); } +static u64 sock_map_mem_usage(const struct bpf_map *map) +{ + u64 usage = sizeof(struct bpf_stab); + + usage += (u64)map->max_entries * sizeof(struct sock *); + return usage; +} + static const struct bpf_iter_seq_info sock_map_iter_seq_info = { .seq_ops = &sock_map_seq_ops, .init_seq_private = sock_map_init_seq_private, @@ -816,6 +824,7 @@ const struct bpf_map_ops sock_map_ops = { .map_lookup_elem = sock_map_lookup, .map_release_uref = sock_map_release_progs, .map_check_btf = map_check_no_btf, + .map_mem_usage = sock_map_mem_usage, .map_btf_id = &sock_map_btf_ids[0], .iter_seq_info = &sock_map_iter_seq_info, }; @@ -1397,6 +1406,16 @@ static void sock_hash_fini_seq_private(void *priv_data) bpf_map_put_with_uref(info->map); } +static u64 sock_hash_mem_usage(const struct bpf_map *map) +{ + struct bpf_shtab *htab = container_of(map, struct bpf_shtab, map); + u64 usage = sizeof(*htab); + + usage += htab->buckets_num * sizeof(struct bpf_shtab_bucket); + usage += atomic_read(&htab->count) * (u64)htab->elem_size; + return usage; +} + static const struct bpf_iter_seq_info sock_hash_iter_seq_info = { .seq_ops = &sock_hash_seq_ops, .init_seq_private = sock_hash_init_seq_private, @@ -1416,6 +1435,7 @@ const struct bpf_map_ops sock_hash_ops = { .map_lookup_elem_sys_only = sock_hash_lookup_sys, .map_release_uref = sock_hash_release_progs, .map_check_btf = map_check_no_btf, + .map_mem_usage = sock_hash_mem_usage, .map_btf_id = &sock_hash_map_btf_ids[0], .iter_seq_info = &sock_hash_iter_seq_info, }; diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c index 771d0fa90ef5..0c38d7175922 100644 --- a/net/xdp/xskmap.c +++ b/net/xdp/xskmap.c @@ -24,6 +24,7 @@ static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map, return ERR_PTR(-ENOMEM); bpf_map_inc(&map->map); + atomic_inc(&map->count); node->map = map; node->map_entry = map_entry; @@ -32,8 +33,11 @@ static struct xsk_map_node *xsk_map_node_alloc(struct xsk_map *map, static void xsk_map_node_free(struct xsk_map_node *node) { + struct xsk_map *map = node->map; + bpf_map_put(&node->map->map); kfree(node); + atomic_dec(&map->count); } static void xsk_map_sock_add(struct xdp_sock *xs, struct xsk_map_node *node) @@ -85,6 +89,14 @@ static struct bpf_map *xsk_map_alloc(union bpf_attr *attr) return &m->map; } +static u64 xsk_map_mem_usage(const struct bpf_map *map) +{ + struct xsk_map *m = container_of(map, struct xsk_map, map); + + return struct_size(m, xsk_map, map->max_entries) + + (u64)atomic_read(&m->count) * sizeof(struct xsk_map_node); +} + static void xsk_map_free(struct bpf_map *map) { struct xsk_map *m = container_of(map, struct xsk_map, map); @@ -267,6 +279,7 @@ const struct bpf_map_ops xsk_map_ops = { .map_update_elem = xsk_map_update_elem, .map_delete_elem = xsk_map_delete_elem, .map_check_btf = map_check_no_btf, + .map_mem_usage = xsk_map_mem_usage, .map_btf_id = &xsk_map_btf_ids[0], .map_redirect = xsk_map_redirect, }; |