mm/swap: add cluster lock - blaster4385/linux-IllusionX - Linux kernel with personal config changes for arch linux


author	Huang, Ying <[email protected]>	2017-02-22 15:45:22 -0800
committer	Linus Torvalds <[email protected]>	2017-02-22 16:41:30 -0800
commit	235b62176712b970c815923e36b9a9cc05d4d901 (patch)
tree	5e64033c7a4f2e47e8d66a16a993f6aa37b6e63b /lib/mpi/mpi-cmp.c
parent	6a991fc72d1243b8da0c644d3147d3ec41a0b281 (diff)

mm/swap: add cluster lock

This patch reduces contention on swap_info_struct->lock by using a finer-grained lock in swap_cluster_info for some swap operations. swap_info_struct->lock is heavily contended when multiple processes reclaim pages simultaneously, because there is only one lock per swap device and a common configuration has only one or a few swap devices in the system. The lock protects almost all swap-related operations.

In fact, many swap operations access only one element of the swap_info_struct->swap_map array, and there is no dependency between different elements of swap_info_struct->swap_map. So a fine-grained lock can be used to allow parallel access to different elements of swap_info_struct->swap_map.

In this patch, a spinlock is added to swap_cluster_info to protect the elements of swap_info_struct->swap_map in the swap cluster and the fields of swap_cluster_info. This greatly reduces lock contention for swap_info_struct->swap_map access.

Because of the added spinlock, the size of swap_cluster_info increases from 4 bytes to 8 bytes on both 64-bit and 32-bit systems. This uses an additional 4k of RAM for every 1G of swap space.

Because swap_cluster_info is much smaller than a cache line (8 vs 64 bytes on the x86_64 architecture), there may be false cache line sharing between the spinlocks in swap_cluster_info. To avoid false sharing in the first round of swap cluster allocation, the order of the swap clusters in the free clusters list is changed so that swap_cluster_info structures sharing the same cache line are placed as far apart as possible. After the first round of allocation, the order of the clusters in the free clusters list is expected to be random, so false sharing should not be serious.

Compared with a previous implementation using bit_spin_lock, the sequential swap-out throughput improved by about 3.2%. The test was done on a Xeon E5 v3 system. The swap device used is a RAM-simulated PMEM (persistent memory) device.
To test sequential swap-out, the test case created 32 processes, which sequentially allocate and write to anonymous pages until the RAM and part of the swap device are used.

[[email protected]: v5]
  Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: initialize spinlock for swap_cluster_info]
  Link: http://lkml.kernel.org/r/[email protected]
[[email protected]: annotate nested locking for cluster lock]
  Link: http://lkml.kernel.org/r/[email protected]
Link: http://lkml.kernel.org/r/dbb860bbd825b1aaba18988015e8963f263c3f0d.1484082593.git.tim.c.chen@linux.intel.com
Signed-off-by: "Huang, Ying" <[email protected]>
Signed-off-by: Tim Chen <[email protected]>
Signed-off-by: Minchan Kim <[email protected]>
Signed-off-by: Hugh Dickins <[email protected]>
Cc: Aaron Lu <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Christian Borntraeger <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Hillf Danton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Jonathan Corbet <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Shaohua Li <[email protected]>
Cc: Vladimir Davydov <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
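
The memory overhead and the free-list reordering described in the commit message can be pictured with a small user-space sketch. This is not the kernel's code: the function names lock_overhead_bytes() and interleaved_free_order() are hypothetical, and the 4 KiB page size and 256-page cluster size are assumptions chosen because they are consistent with the "4k per 1G" figure above. The 64-byte cache line and 8-byte record size come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants. The first two are assumptions, not from the patch text. */
enum {
    PAGE_BYTES         = 4096, /* assumed page size */
    PAGES_PER_CLUSTER  = 256,  /* assumed number of pages per swap cluster */
    CACHE_LINE_BYTES   = 64,   /* x86_64 cache line size, per the commit message */
    CLUSTER_INFO_BYTES = 8     /* size of swap_cluster_info after the patch */
};

/*
 * Extra RAM needed when every per-cluster record grows by
 * extra_per_cluster bytes, for a swap area of swap_bytes bytes.
 */
unsigned long long lock_overhead_bytes(unsigned long long swap_bytes,
                                       unsigned int extra_per_cluster)
{
    unsigned long long cluster_bytes =
        (unsigned long long)PAGE_BYTES * PAGES_PER_CLUSTER;
    /* Round up so a partial trailing cluster still gets a record. */
    unsigned long long nr_clusters =
        (swap_bytes + cluster_bytes - 1) / cluster_bytes;

    return nr_clusters * extra_per_cluster;
}

/*
 * Fill order[] with cluster indices so that neighbouring entries in the
 * initial free list refer to records in different cache lines.
 *
 * The record array is viewed as rows of `cols` records, where one row
 * fills one cache line. Walking column by column means consecutive list
 * entries are `cols` records (one full cache line) apart.
 *
 * order[] must have room for nr_clusters entries.
 * Returns the number of entries written.
 */
size_t interleaved_free_order(size_t nr_clusters, size_t *order)
{
    const size_t cols = CACHE_LINE_BYTES / CLUSTER_INFO_BYTES;
    const size_t rows = (nr_clusters + cols - 1) / cols;
    size_t n = 0;

    assert(order != NULL || nr_clusters == 0);

    for (size_t col = 0; col < cols; col++) {
        for (size_t row = 0; row < rows; row++) {
            size_t idx = row * cols + col;

            /* The last row may be only partially populated. */
            if (idx < nr_clusters)
                order[n++] = idx;
        }
    }
    return n;
}
```

With these assumptions, a 1 GiB swap area has 1024 clusters, so growing each record by 4 bytes costs 4096 bytes. For 20 clusters the sketch yields the order 0, 8, 16, 1, 9, 17, and so on, which keeps clusters allocated back to back in the first round from sharing a cache line.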

Diffstat (limited to 'lib/mpi/mpi-cmp.c')

0 files changed, 0 insertions, 0 deletions

