diff options
author | Matt Fleming <[email protected]> | 2019-08-08 20:53:01 +0100 |
---|---|---|
committer | Ingo Molnar <[email protected]> | 2019-09-03 09:17:37 +0200 |
commit | a55c7454a8c887b226a01d7eed088ccb5374d81e (patch) | |
tree | 43123b848008c49e81f6511613a1599c92b48fe5 /lib/test_overflow.c | |
parent | a2cbfd46559e809c8165773b7fe8afa058b35414 (diff) |
sched/topology: Improve load balancing on AMD EPYC systems
SD_BALANCE_{FORK,EXEC} and SD_WAKE_AFFINE are stripped in sd_init()
for any sched domains with a NUMA distance greater than 2 hops
(RECLAIM_DISTANCE). The idea being that it's expensive to balance
across domains that far apart.
However, as is rather unfortunately explained in:
commit 32e45ff43eaf ("mm: increase RECLAIM_DISTANCE to 30")
the value for RECLAIM_DISTANCE is based on node distance tables from
2011-era hardware.
Current AMD EPYC machines have the following NUMA node distances:
node distances:
node 0 1 2 3 4 5 6 7
0: 10 16 16 16 32 32 32 32
1: 16 10 16 16 32 32 32 32
2: 16 16 10 16 32 32 32 32
3: 16 16 16 10 32 32 32 32
4: 32 32 32 32 10 16 16 16
5: 32 32 32 32 16 10 16 16
6: 32 32 32 32 16 16 10 16
7: 32 32 32 32 16 16 16 10
where 2 hops is 32.
The result is that the scheduler fails to load balance properly across
NUMA nodes on different sockets -- 2 hops apart.
For example, pinning 16 busy threads to NUMA nodes 0 (CPUs 0-7) and 4
(CPUs 32-39) like so,
$ numactl -C 0-7,32-39 ./spinner 16
causes all threads to fork and remain on node 0 until the active
balancer kicks in after a few seconds and forcibly moves some threads
to node 4.
Override node_reclaim_distance for AMD Zen.
Signed-off-by: Matt Fleming <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Mel Gorman <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: [email protected]
Cc: Thomas Gleixner <[email protected]>
Cc: [email protected]
Cc: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
Diffstat (limited to 'lib/test_overflow.c')
0 files changed, 0 insertions, 0 deletions