author | Dev Jain <[email protected]> | 2024-08-30 10:46:09 +0530 |
---|---|---|
committer | Andrew Morton <[email protected]> | 2024-09-09 16:39:05 -0700 |
commit | 536ab838a5b37b6ae3f8d53552560b7c51daeb41 (patch) | |
tree | 574a06e93a2dc68e9edceff762900fd900c2de66 | |
parent | 7ae12a57c56e0b87e6e698479c1f8b65434a608f (diff) |
selftests/mm: relax test to fail after 100 migration failures
It was recently observed at [1] that during the folio unmapping stage of
migration, when the PTEs are cleared, a racing thread faulting on that
folio may increase the refcount of the folio and then sleep on the folio
lock (which the migration path holds). Migration then fails when it checks
the actual refcount against the expected one. As a result, the migration
selftest fails on shared-anon mappings. This reinforces the fact that
migration is a best-effort service, so it is wrong to fail the test on a
single failure. Instead, fail the test only after 100 consecutive failures
(100 is still a subjective choice). Note that this has no effect on the
execution time of the test, since that is controlled by a timeout.
[1] https://lore.kernel.org/all/[email protected]/
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Dev Jain <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Reviewed-by: Ryan Roberts <[email protected]>
Tested-by: Ryan Roberts <[email protected]>
Cc: Alistair Popple <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Baolin Wang <[email protected]>
Cc: Barry Song <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Gavin Shan <[email protected]>
Cc: "Huang, Ying" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Cc: Lance Yang <[email protected]>
Cc: Mark Brown <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Shuah Khan <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Yang Shi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
-rw-r--r-- | tools/testing/selftests/mm/migration.c | 17 |
1 files changed, 11 insertions, 6 deletions
diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c
index 6908569ef406..64bcbb7151cf 100644
--- a/tools/testing/selftests/mm/migration.c
+++ b/tools/testing/selftests/mm/migration.c
@@ -15,10 +15,10 @@
 #include <signal.h>
 #include <time.h>
 
-#define TWOMEG (2<<20)
-#define RUNTIME (20)
-
-#define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
+#define TWOMEG		(2<<20)
+#define RUNTIME		(20)
+#define MAX_RETRIES	100
+#define ALIGN(x, a)	(((x) + (a - 1)) & (~((a) - 1)))
 
 FIXTURE(migration)
 {
@@ -65,6 +65,7 @@ int migrate(uint64_t *ptr, int n1, int n2)
 	int ret, tmp;
 	int status = 0;
 	struct timespec ts1, ts2;
+	int failures = 0;
 
 	if (clock_gettime(CLOCK_MONOTONIC, &ts1))
 		return -1;
@@ -79,13 +80,17 @@ int migrate(uint64_t *ptr, int n1, int n2)
 		ret = move_pages(0, 1, (void **) &ptr, &n2, &status,
 				MPOL_MF_MOVE_ALL);
 		if (ret) {
-			if (ret > 0)
+			if (ret > 0) {
+				/* Migration is best effort; try again */
+				if (++failures < MAX_RETRIES)
+					continue;
 				printf("Didn't migrate %d pages\n", ret);
+			}
 			else
 				perror("Couldn't migrate pages");
 			return -2;
 		}
-
+		failures = 0;
 		tmp = n2;
 		n2 = n1;
 		n1 = tmp;
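
The consecutive-failure logic described in the commit message can be tried in isolation, without NUMA hardware. The following is a minimal sketch, not code from the patch: `run_attempts()` and its scripted `results` array are hypothetical stand-ins for the `move_pages()` loop, with a nonzero entry playing the role of a transient "pages not migrated" return. Only the counter handling (increment on failure, bail out at the limit, reset on success) mirrors the change.

```c
#include <stddef.h>

#define MAX_RETRIES 100	/* same threshold the patch chooses */

/*
 * Hypothetical helper: walk a scripted list of attempt results
 * (nonzero = transient failure) and give up only after max_retries
 * failures in a row.
 * Returns 0 if the script runs out first, -2 once the limit is hit.
 */
static int run_attempts(const int *results, size_t n, int max_retries)
{
	int failures = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (results[i]) {
			/* Best effort: tolerate the failure and retry */
			if (++failures < max_retries)
				continue;
			return -2;
		}
		/* Any success ends the current streak */
		failures = 0;
	}
	return 0;
}
```

With a limit of 3, a script such as `{1, 1, 0, 1, 1, 0}` never reaches three failures in a row and returns 0, while `{1, 0, 1, 1, 1, 0}` returns -2 on its fifth entry. Resetting the counter on success is what makes the limit apply to consecutive failures rather than to the total over the test run.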