| author | Paolo Abeni <[email protected]> | 2023-11-28 15:48:42 +0100 |
|---|---|---|
| committer | Paolo Abeni <[email protected]> | 2023-11-28 15:48:43 +0100 |
| commit | a379972973a80924b1d03443e20f113ff76a94c7 (patch) | |
| tree | 03d795a2591a1f3ce239993abf09790f3bc8a799 /include/linux | |
| parent | a214724554aee8f6a5953dccab51ceff448c08cd (diff) | |
| parent | 637567e4a3ef6f6a5ffa48781207d270265f7e68 (diff) | |
Merge branch 'net-page_pool-add-netlink-based-introspection'
Jakub Kicinski says:
====================
net: page_pool: add netlink-based introspection
We recently started to deploy newer kernels / drivers at Meta,
making significant use of page pools for the first time.
We immediately ran into page pool leak warnings, both real and false
positive. As Eric pointed out/predicted, there's no guarantee that
applications will read / close their sockets so a page pool page
may be stuck in a socket (but not leaked) forever. This happens
a lot in our fleet. Most of these are obviously due to application
bugs but we should not be printing kernel warnings due to minor
application resource leaks.
Conversely the page pool memory may get leaked at runtime, and
we have no way to detect / track that, unless someone reconfigures
the NIC and destroys the page pools which leaked the pages.
The solution presented here is to expose the memory use of page
pools via netlink. This allows for continuous monitoring of memory
used by page pools, regardless of whether they were destroyed or not.
Sample in patch 15 can print the memory use and recycling
efficiency:
$ ./page-pool
eth0[2] page pools: 10 (zombies: 0)
refs: 41984 bytes: 171966464 (refs: 0 bytes: 0)
recycling: 90.3% (alloc: 656:397681 recycle: 89652:270201)
v4:
- use dev_net(netdev)->loopback_dev
- extend inflight doc
v3: https://lore.kernel.org/all/[email protected]/
- ID is still here, can't decide if it matters
- rename destroyed -> detach-time, good enough?
- fix build for netsec
v2: https://lore.kernel.org/r/[email protected]
- hopefully fix build with PAGE_POOL=n
v1: https://lore.kernel.org/all/[email protected]/
- The main change compared to the RFC is that the API now exposes
outstanding references and byte counts even for "live" page pools.
The warning is no longer printed if page pool is accessible via netlink.
RFC: https://lore.kernel.org/all/[email protected]/
====================
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Paolo Abeni <[email protected]>
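The numbers in the sample output quoted in the cover letter can be cross-checked with a little arithmetic. The sketch below is not taken from the patch series: it assumes that the recycling percentage is the sum of the two `recycle` counters divided by the sum of the two `alloc` counters, and that `bytes` is `refs` multiplied by a fixed page size. The cover letter does not state either formula, and the helper names `recycling_pct()` and `bytes_per_ref()` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: recycled pages as a percentage of allocated pages.
 * Assumption: the two values in each "alloc: x:y" / "recycle: x:y" pair
 * are simply summed.
 */
double recycling_pct(uint64_t alloc_a, uint64_t alloc_b,
		     uint64_t recycle_a, uint64_t recycle_b)
{
	uint64_t allocs = alloc_a + alloc_b;

	if (!allocs)
		return 0.0;
	return 100.0 * (double)(recycle_a + recycle_b) / (double)allocs;
}

/* Hypothetical helper: bytes accounted per outstanding reference. */
uint64_t bytes_per_ref(uint64_t bytes, uint64_t refs)
{
	return refs ? bytes / refs : 0;
}
```

Under these assumptions, `recycling_pct(656, 397681, 89652, 270201)` comes out between 90.3 and 90.4, in line with the `90.3%` shown, and `bytes_per_ref(171966464, 41984)` is 4096, which matches one 4 KiB page per reference.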
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/list.h | 20 |
| -rw-r--r-- | include/linux/netdevice.h | 4 |
| -rw-r--r-- | include/linux/poison.h | 2 |
3 files changed, 26 insertions, 0 deletions
diff --git a/include/linux/list.h b/include/linux/list.h
index 1837caedf723..059aa1fff41e 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -1119,6 +1119,26 @@ static inline void hlist_move_list(struct hlist_head *old,
 	old->first = NULL;
 }
 
+/**
+ * hlist_splice_init() - move all entries from one list to another
+ * @from: hlist_head from which entries will be moved
+ * @last: last entry on the @from list
+ * @to:   hlist_head to which entries will be moved
+ *
+ * @to can be empty, @from must contain at least @last.
+ */
+static inline void hlist_splice_init(struct hlist_head *from,
+				     struct hlist_node *last,
+				     struct hlist_head *to)
+{
+	if (to->first)
+		to->first->pprev = &last->next;
+	last->next = to->first;
+	to->first = from->first;
+	from->first->pprev = &to->first;
+	from->first = NULL;
+}
+
 #define hlist_entry(ptr, type, member) container_of(ptr,type,member)
 
 #define hlist_for_each(pos, head) \
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e87caa81f70c..998c7aaa98b8 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2447,6 +2447,10 @@ struct net_device {
 #if IS_ENABLED(CONFIG_DPLL)
 	struct dpll_pin		*dpll_pin;
 #endif
+#if IS_ENABLED(CONFIG_PAGE_POOL)
+	/** @page_pools: page pools created for this netdevice */
+	struct hlist_head	page_pools;
+#endif
 };
 
 #define to_net_dev(d) container_of(d, struct net_device, dev)
diff --git a/include/linux/poison.h b/include/linux/poison.h
index 851a855d3868..27a7dad17eef 100644
--- a/include/linux/poison.h
+++ b/include/linux/poison.h
@@ -83,6 +83,8 @@
 
 /********** net/core/skbuff.c **********/
 #define SKB_LIST_POISON_NEXT	((void *)(0x800 + POISON_POINTER_DELTA))
+/********** net/ **********/
+#define NET_PTR_POISON		((void *)(0x801 + POISON_POINTER_DELTA))
 
 /********** kernel/bpf/ **********/
 #define BPF_PTR_POISON ((void *)(0xeB9FUL + POISON_POINTER_DELTA))
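To make the pointer updates in `hlist_splice_init()` easier to follow, here is a minimal userspace sketch. The struct definitions and `hlist_add_head()` are simplified stand-ins for the kernel's versions, the body of `hlist_splice_init()` is copied from the hunk above, and `hlist_count()` and `splice_demo()` are hypothetical helpers written only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel's hlist types. */
struct hlist_node {
	struct hlist_node *next, **pprev;
};

struct hlist_head {
	struct hlist_node *first;
};

/* Push @n at the front of @h, mirroring the kernel's hlist_add_head(). */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	struct hlist_node *first = h->first;

	n->next = first;
	if (first)
		first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Body copied from the patch: prepend all of @from to @to, empty @from. */
static void hlist_splice_init(struct hlist_head *from,
			      struct hlist_node *last,
			      struct hlist_head *to)
{
	if (to->first)
		to->first->pprev = &last->next;
	last->next = to->first;
	to->first = from->first;
	from->first->pprev = &to->first;
	from->first = NULL;
}

/* Hypothetical helper: count the nodes on a list. */
static int hlist_count(const struct hlist_head *h)
{
	int n = 0;

	for (const struct hlist_node *p = h->first; p; p = p->next)
		n++;
	return n;
}

/* Hypothetical demo: splice {a, b} in front of {c}; returns @to's length. */
int splice_demo(void)
{
	struct hlist_head from = { NULL }, to = { NULL };
	struct hlist_node a, b, c;

	hlist_add_head(&b, &from);	/* from: b */
	hlist_add_head(&a, &from);	/* from: a -> b */
	hlist_add_head(&c, &to);	/* to:   c */

	/* The caller must pass the last node of @from, here &b. */
	hlist_splice_init(&from, &b, &to);

	/* to: a -> b -> c, from: empty; back-pointers are rewired too. */
	assert(from.first == NULL);
	assert(to.first == &a && a.next == &b && b.next == &c && !c.next);
	assert(a.pprev == &to.first && c.pprev == &b.next);
	return hlist_count(&to);
}
```

Because the caller supplies `@last`, the splice needs no list walk and runs in constant time. The cost is that the caller must already track the tail of `@from`, and `@from` must be non-empty, as the kernel-doc comment in the patch requires.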