author | Huang, Ying <[email protected]> | 2017-02-22 15:45:22 -0800 |
---|---|---|
committer | Linus Torvalds <[email protected]> | 2017-02-22 16:41:30 -0800 |
commit | 235b62176712b970c815923e36b9a9cc05d4d901 (patch) | |
tree | 5e64033c7a4f2e47e8d66a16a993f6aa37b6e63b /lib/mpi/mpi-cmp.c | |
parent | 6a991fc72d1243b8da0c644d3147d3ec41a0b281 (diff) |
mm/swap: add cluster lock
This patch reduces lock contention on swap_info_struct->lock by using a
finer-grained lock in swap_cluster_info for some swap operations.
swap_info_struct->lock is heavily contended when multiple processes
reclaim pages simultaneously, because there is only one lock per swap
device, a typical configuration has only one or a few swap devices, and
the lock protects almost all swap-related operations.
In fact, many swap operations access only one element of the
swap_info_struct->swap_map array, and there is no dependency between
different elements of swap_info_struct->swap_map. So a fine-grained
lock can be used to allow parallel access to different elements of
swap_info_struct->swap_map.
In this patch, a spinlock is added to swap_cluster_info to protect the
elements of swap_info_struct->swap_map in the swap cluster and the
fields of swap_cluster_info. This greatly reduces lock contention for
swap_info_struct->swap_map access.
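The idea can be illustrated with a minimal user-space sketch. This is not the kernel code: `struct cluster_info`, `struct swap_dev`, `cluster_init`, `lock_cluster`, `unlock_cluster`, `slot_get` and `SLOTS_PER_CLUSTER` are hypothetical names, a C11 `atomic_flag` stands in for the kernel spinlock, and the cluster size of 256 slots is an assumption.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: 256 slots per cluster; the commit message does not give this. */
#define SLOTS_PER_CLUSTER 256

/* Hypothetical user-space stand-in for swap_cluster_info. */
struct cluster_info {
    atomic_flag lock;    /* stands in for the spinlock the patch adds */
    unsigned int count;  /* slots currently in use in this cluster */
};

/* Hypothetical, heavily reduced stand-in for swap_info_struct. */
struct swap_dev {
    unsigned char *swap_map;        /* one usage count per swap slot */
    struct cluster_info *clusters;  /* one entry per SLOTS_PER_CLUSTER slots */
};

/* Put a cluster into a known unlocked, empty state before first use. */
void cluster_init(struct cluster_info *ci)
{
    atomic_flag_clear(&ci->lock);
    ci->count = 0;
}

/* Lock only the cluster that contains the given slot offset. */
struct cluster_info *lock_cluster(struct swap_dev *dev, size_t offset)
{
    struct cluster_info *ci = &dev->clusters[offset / SLOTS_PER_CLUSTER];

    while (atomic_flag_test_and_set_explicit(&ci->lock, memory_order_acquire))
        ;  /* spin until the current holder releases the lock */
    return ci;
}

void unlock_cluster(struct cluster_info *ci)
{
    atomic_flag_clear_explicit(&ci->lock, memory_order_release);
}

/* Take one reference on a slot while holding only its cluster's lock. */
unsigned char slot_get(struct swap_dev *dev, size_t offset)
{
    struct cluster_info *ci = lock_cluster(dev, offset);
    unsigned char refs;

    if (dev->swap_map[offset] == 0)
        ci->count++;  /* first user of this slot */
    refs = ++dev->swap_map[offset];
    unlock_cluster(ci);
    return refs;
}
```

In this sketch, two threads updating slots in different clusters never touch the same lock, which is the property the patch relies on to reduce contention on the device-wide lock.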
Because of the added spinlock, the size of swap_cluster_info increases
from 4 bytes to 8 bytes on both 64-bit and 32-bit systems. This uses an
additional 4 KB of RAM for every 1 GB of swap space.
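The 4 KB figure can be checked with a short calculation. The sketch below assumes 4 KiB pages and 256 pages per swap cluster (1 MiB of swap per cluster); neither value is stated in the commit message, and `cluster_lock_overhead` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Assumptions (not stated in the commit message): 4 KiB pages and
 * 256 pages per swap cluster. */
#define PAGE_SIZE_BYTES    4096UL
#define PAGES_PER_CLUSTER  256UL

/* Extra bytes of swap_cluster_info needed for swap_bytes of swap space
 * when each entry grows from old_size to new_size bytes. */
unsigned long cluster_lock_overhead(unsigned long swap_bytes,
                                    unsigned long old_size,
                                    unsigned long new_size)
{
    unsigned long cluster_bytes = PAGE_SIZE_BYTES * PAGES_PER_CLUSTER; /* 1 MiB */
    unsigned long nr_clusters = swap_bytes / cluster_bytes;

    return nr_clusters * (new_size - old_size);
}
```

Under these assumptions, 1 GiB of swap is 1024 clusters, and 1024 entries growing by 4 bytes each gives 4096 bytes.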
Because the size of swap_cluster_info is much smaller than the size of
a cache line (8 bytes vs. 64 bytes on x86_64), there may be false cache
line sharing between the spinlocks in swap_cluster_info. To avoid false
sharing in the first round of swap cluster allocation, the order of the
swap clusters in the free clusters list is changed so that
swap_cluster_info entries sharing the same cache line are placed as far
apart as possible in the list. After the first round of allocation, the
order of the clusters in the free clusters list is expected to be
random, so false sharing should not be serious.
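The commit message does not spell out the new order. One plausible scheme, sketched below as an assumption rather than as the patch's actual code, is a strided walk: with 8-byte entries and 64-byte cache lines, 8 entries share a line, so stepping through the array 8 entries at a time puts each allocation in a pass on a different cache line. `build_free_order` and the constants are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES   64  /* x86_64 value quoted in the commit message */
#define CLUSTER_INFO_BYTES 8   /* swap_cluster_info size after the patch */
#define INFOS_PER_LINE     (CACHE_LINE_BYTES / CLUSTER_INFO_BYTES)

/* Fill order[] with all cluster indices in strided order: first every
 * index that is 0 modulo INFOS_PER_LINE, then every index that is 1
 * modulo INFOS_PER_LINE, and so on. order[] must have room for
 * nr_clusters entries. Returns the number of entries written. */
size_t build_free_order(size_t nr_clusters, size_t *order)
{
    size_t n = 0;

    for (size_t col = 0; col < INFOS_PER_LINE; col++)
        for (size_t idx = col; idx < nr_clusters; idx += INFOS_PER_LINE)
            order[n++] = idx;
    return n;
}
```

For 20 clusters this yields 0, 8, 16, 1, 9, 17, and so on, so successive allocations within a pass are a full cache line apart in the swap_cluster_info array.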
Compared with a previous implementation using bit_spin_lock, the
sequential swap-out throughput improved by about 3.2%. The test was done
on a Xeon E5 v3 system. The swap device used is a RAM-simulated PMEM
(persistent memory) device. To test sequential swap-out, the test case
creates 32 processes that sequentially allocate and write to anonymous
pages until RAM and part of the swap device are used.
[[email protected]: v5]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: initialize spinlock for swap_cluster_info]
Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: annotate nested locking for cluster lock]
Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/dbb860bbd825b1aaba18988015e8963f263c3f0d.1484082593.git.tim.c.chen@linux.intel.com
Signed-off-by: "Huang, Ying" <[email protected]>
Signed-off-by: Tim Chen <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Cc: Aaron Lu <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Shaohua Li <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>