author    Huang Ying <[email protected]>  2023-10-16 13:29:56 +0800
committer Andrew Morton <[email protected]>  2023-10-25 16:47:10 -0700
commit    362d37a106dd3f6431b2fdd91d9208b0d023b50d (patch)
tree      f51d1db4a765d1594fd34b62a96ca872d8e639da /include/linux
parent    94a3bfe4073cd88b05f7fb201ea7bf9dfa2cf5d5 (diff)
mm, pcp: reduce lock contention for draining high-order pages
In commit f26b3fa04611 ("mm/page_alloc: limit number of high-order pages on PCP during bulk free"), the PCP (Per-CPU Pageset) is drained when it is mostly being used to free high-order pages, to improve the reuse of cache-hot pages between the page-allocating and page-freeing CPUs.

On a system with a small per-CPU data cache slice, pages should not be cached before draining, to guarantee that the reused pages are still cache-hot.  But on a system with a large per-CPU data cache slice, some pages can be cached before draining to reduce zone lock contention.

So, in this patch, instead of draining without any caching, "pcp->batch" pages will be cached in the PCP before draining if the size of the per-CPU data cache slice is more than "3 * batch".  In theory, a per-CPU data cache slice larger than "2 * batch" is enough to reuse cache-hot pages between CPUs; "3 * batch" is used to leave headroom for the other users of the cache (code, other data accesses, etc.).

Note: "3 * batch" is chosen to make sure the optimization works on recent x86_64 server CPUs.  If you want to increase it, please check whether it breaks the optimization.

On a 2-socket Intel server with 128 logical CPUs, with this patch, the network bandwidth of the UNIX (AF_UNIX) test case of the lmbench test suite with 16 pairs of processes increases by 70.5%.  The cycles% of spinlock contention (mostly for the zone lock) decreases from 46.1% to 21.3%.  The number of PCP drainings for high-order page freeing (free_high) decreases by 89.9%.  The cache miss rate stays at 0.2%.

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: "Huang, Ying" <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Sudeep Holla <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Arjan van de Ven <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
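To make the batching rule concrete, the kernel-style C sketch below illustrates how the decision could look.  It is only an illustration, not the code added by this patch: the real logic lives in mm/page_alloc.c, which is outside this include/linux diff, and every identifier other than PCPF_FREE_HIGH_BATCH, struct per_cpu_pages, pcp->batch, pcp->count, pcp->flags and PAGE_SIZE is hypothetical.

/*
 * Illustrative sketch only.  Helper names are hypothetical; only
 * PCPF_FREE_HIGH_BATCH and the existing per_cpu_pages fields come
 * from the patch and the current kernel.
 */

/* Roughly what setup_pcp_cacheinfo() needs to decide per CPU: only keep
 * pages cached before a free_high drain if the per-CPU data cache slice
 * can hold about 3 * batch pages (2 * batch for allocator/freer reuse,
 * plus headroom for code and other data).
 */
static void pcp_decide_free_high_batch(struct per_cpu_pages *pcp,
				       unsigned long cache_slice_bytes)
{
	if (cache_slice_bytes >= 3UL * pcp->batch * PAGE_SIZE)
		pcp->flags |= PCPF_FREE_HIGH_BATCH;
	else
		pcp->flags &= ~PCPF_FREE_HIGH_BATCH;
}

/* On the freeing side: without the flag, a free_high drain empties the
 * PCP; with the flag, up to pcp->batch pages stay cached so that the
 * next allocation can still find cache-hot pages.
 */
static bool pcp_should_drain_free_high(struct per_cpu_pages *pcp)
{
	if (!(pcp->flags & PCPF_FREE_HIGH_BATCH))
		return true;
	return pcp->count > pcp->batch;
}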
Diffstat (limited to 'include/linux')
-rw-r--r--  include/linux/gfp.h    | 1 +
-rw-r--r--  include/linux/mmzone.h | 6 ++++++
2 files changed, 7 insertions(+), 0 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..665edc11fb9f 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -325,6 +325,7 @@ void drain_all_pages(struct zone *zone);
void drain_local_pages(struct zone *zone);
void page_alloc_init_late(void);
+void setup_pcp_cacheinfo(void);
/*
* gfp_allowed_mask is set to GFP_BOOT_MASK during early boot to restrict what
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index de313f1c15f9..efe72b3f7872 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -680,8 +680,14 @@ enum zone_watermarks {
* PCPF_PREV_FREE_HIGH_ORDER: a high-order page is freed in the
* previous page freeing. To avoid to drain PCP for an accident
* high-order page freeing.
+ *
+ * PCPF_FREE_HIGH_BATCH: preserve "pcp->batch" pages in PCP before
+ * draining PCP for consecutive high-order page freeing without
+ * allocation, if the CPU's data cache slice is large enough.  This
+ * reduces zone lock contention and keeps freed pages cache-hot for reuse.
*/
#define PCPF_PREV_FREE_HIGH_ORDER BIT(0)
+#define PCPF_FREE_HIGH_BATCH BIT(1)
struct per_cpu_pages {
spinlock_t lock; /* Protects lists field */