mm, swap: Fix a race in free_swap_and_cache()

Before using cluster lock in free_swap_and_cache(), the swap_info_struct->lock will be held during freeing the swap entry and acquiring page lock, so the page swap count will not change when testing page information later. But after using cluster lock, the cluster lock (or swap_info_struct->lock) will be held only during freeing the swap entry. So before acquiring the page lock, the page swap count may be changed in another thread. If the page swap count is not 0, we should not delete the page from the swap cache. This is fixed via checking page swap count again after acquiring the page lock. I found the race when I review the code, so I didn't trigger the race via a test program. If the race occurs for an anonymous page shared by multiple processes via fork, multiple pages will be allocated and swapped in from the swap device for the previously shared one page. That is, the user-visible runtime effect is more memory will be used and the access latency for the page will be higher, that is, the performance regression. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: "Huang, Ying" <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Shaohua Li <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Tim Chen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
author: Huang Ying <[email protected]> 2017-05-03 14:52:49 -0700
committer: Linus Torvalds <[email protected]> 2017-05-03 15:52:08 -0700
commit: 322b8afe4a65906c133102532e63a278775cc5f0 (patch)
tree: 479940969c31f1bd766de491f62fbdceec70d066
parent: 9a4caf1e9fa4864ce21ba9584a2c336bfbc72740 (diff)
1 files changed, 16 insertions, 9 deletions
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 178130880b90..6b6bb1bb6209 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1111,6 +1111,18 @@ int page_swapcount(struct page *page)
 	return count;
 }
 
+static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
+{
+	int count = 0;
+	pgoff_t offset = swp_offset(entry);
+	struct swap_cluster_info *ci;
+
+	ci = lock_cluster_or_swap_info(si, offset);
+	count = swap_count(si->swap_map[offset]);
+	unlock_cluster_or_swap_info(si, ci);
+	return count;
+}
+
 /*
  * How many references to @entry are currently swapped out?
  * This does not give an exact answer when swap count is continued,
@@ -1119,17 +1131,11 @@ int page_swapcount(struct page *page)
 int __swp_swapcount(swp_entry_t entry)
 {
 	int count = 0;
-	pgoff_t offset;
 	struct swap_info_struct *si;
-	struct swap_cluster_info *ci;
 
 	si = __swap_info_get(entry);
-	if (si) {
-		offset = swp_offset(entry);
-		ci = lock_cluster_or_swap_info(si, offset);
-		count = swap_count(si->swap_map[offset]);
-		unlock_cluster_or_swap_info(si, ci);
-	}
+	if (si)
+		count = swap_swapcount(si, entry);
 	return count;
 }
 
@@ -1291,7 +1297,8 @@ int free_swap_and_cache(swp_entry_t entry)
 		 * Also recheck PageSwapCache now page is locked (above).
 		 */
 		if (PageSwapCache(page) && !PageWriteback(page) &&
-		    (!page_mapped(page) || mem_cgroup_swap_full(page))) {
+		    (!page_mapped(page) || mem_cgroup_swap_full(page)) &&
+		    !swap_swapcount(p, entry)) {
 			delete_from_swap_cache(page);
 			SetPageDirty(page);
 		}
author	Huang Ying <[email protected]>	2017-05-03 14:52:49 -0700
committer	Linus Torvalds <[email protected]>	2017-05-03 15:52:08 -0700
commit	322b8afe4a65906c133102532e63a278775cc5f0 (patch)
tree	479940969c31f1bd766de491f62fbdceec70d066
parent	9a4caf1e9fa4864ce21ba9584a2c336bfbc72740 (diff)