path: root/include/linux/migrate.h
Each entry: date  commit subject  (author; files changed, -deleted/+added lines)
2023-02-20  include/linux/migrate.h: remove unneeded externs  (Andrew Morton; 1 file, -8/+8)
As suggested by Matthew. Suggested-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2023-02-20  mm: change to return bool for isolate_movable_page()  (Baolin Wang; 1 file, -3/+3)
isolate_movable_page() can now only return 0 or -EBUSY, and no caller cares about the specific negative value, so convert isolate_movable_page() to return a boolean to make the movable-page isolation checks clearer. No functional changes intended. [[email protected]: remove unneeded comment, per Matthew] Link: https://lkml.kernel.org/r/cb877f73f4fff8d309611082ec740a7065b1ade0.1676424378.git.baolin.wang@linux.alibaba.com Signed-off-by: Baolin Wang <[email protected]> Acked-by: David Hildenbrand <[email protected]> Reviewed-by: Matthew Wilcox (Oracle) <[email protected]> Acked-by: Linus Torvalds <[email protected]> Reviewed-by: SeongJae Park <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
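An illustrative sketch of the interface change (the caller shown is not lifted from the patch):

	/* Before: int isolate_movable_page(struct page *, isolate_mode_t);
	 * after the conversion the declaration becomes: */
	bool isolate_movable_page(struct page *page, isolate_mode_t mode);

	/* Callers test success directly instead of comparing with -EBUSY: */
	if (isolate_movable_page(page, ISOLATE_UNEVICTABLE))
		list_add(&page->lru, &movable_page_list);	/* isolated */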
2023-02-16  migrate_pages: split unmap_and_move() to _unmap() and _move()  (Huang Ying; 1 file, -0/+1)
This is a preparation patch to batch the folio unmapping and moving. In this patch, unmap_and_move() is split to migrate_folio_unmap() and migrate_folio_move(). So, we can batch _unmap() and _move() in different loops later. To pass some information between unmap and move, the original unused dst->mapping and dst->private are used. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: "Huang, Ying" <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Reviewed-by: Xin Hao <[email protected]> Cc: Zi Yan <[email protected]> Cc: Yang Shi <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Hyeonggon Yoo <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
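The batching shape this enables looks roughly like the sketch below; the two function names come from the commit, but the simplified signatures and the dst_of() helper are hypothetical:

	/* Pass 1: unmap every source folio in the batch. */
	list_for_each_entry(src, &folios, lru)
		migrate_folio_unmap(src, dst_of(src), mode);

	/* Pass 2: copy and remap only the folios that unmapped cleanly. */
	list_for_each_entry(src, &unmapped, lru)
		migrate_folio_move(src, dst_of(src), mode);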
2023-02-13  mm/migrate: add folio_movable_ops()  (Vishal Moola (Oracle); 1 file, -0/+9)
folio_movable_ops() does the same as page_movable_ops() except uses folios instead of pages. This function will help make folio conversions in migrate.c more readable. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Vishal Moola (Oracle) <[email protected]> Cc: Matthew Wilcox <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
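A sketch of the helper, assuming it mirrors page_movable_ops(), which keeps the ops pointer in the mapping field with PAGE_MAPPING_MOVABLE set:

	static inline
	const struct movable_operations *folio_movable_ops(struct folio *folio)
	{
		VM_BUG_ON(!__folio_test_movable(folio));

		return (const struct movable_operations *)
			((unsigned long)folio->mapping - PAGE_MAPPING_MOVABLE);
	}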
2022-10-12  mm/migrate_device.c: add migrate_device_range()  (Alistair Popple; 1 file, -0/+7)
Device drivers can use the migrate_vma family of functions to migrate existing private anonymous mappings to device private pages. These pages are backed by memory on the device, with drivers being responsible for copying data to and from device memory. Device private pages are freed via the pgmap->page_free() callback when they are unmapped and their refcount drops to zero. Alternatively they may be freed indirectly via migration back to CPU memory in response to a pgmap->migrate_to_ram() callback, called whenever the CPU accesses an address mapped to a device private page. In other words, drivers cannot control the lifetime of data allocated on the devices and must wait until these pages are freed from userspace. This causes issues when memory needs to be reclaimed on the device, either because the device is going away due to a ->release() callback or because another user needs the memory. Drivers could use the existing migrate_vma functions to migrate data off the device. However, this would require them to track the mappings of each page, which is both complicated and not always possible. Instead, drivers need to be able to migrate device pages directly so they can free up device memory. To allow that, this patch introduces the migrate_device family of functions, which are functionally similar to migrate_vma but skip the initial lookup based on mapping. Link: https://lkml.kernel.org/r/868116aab70b0c8ee467d62498bb2cf0ef907295.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Zi Yan <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Yang Shi <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: John Hubbard <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Alex Sierra <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Christian König <[email protected]> Cc: Dan Williams <[email protected]> Cc: Felix Kuehling <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Lyude Paul <[email protected]> Cc: Michael Ellerman <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
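A hedged sketch of how a driver might evacuate a range of its device-private PFNs with the new entry point; the allocate/copy step and error handling are illustrative:

	/* Collect and unmap our pages in [dpfn, dpfn + npages). */
	ret = migrate_device_range(src_pfns, dpfn, npages);
	if (ret)
		return ret;

	/* The driver allocates system pages into dst_pfns and copies the
	 * data for every entry with MIGRATE_PFN_MIGRATE set, then: */
	migrate_device_pages(src_pfns, dst_pfns, npages);
	migrate_device_finalize(src_pfns, dst_pfns, npages);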
2022-10-12  mm/memory.c: fix race when faulting a device private page  (Alistair Popple; 1 file, -0/+8)
Patch series "Fix several device private page reference counting issues", v2 This series aims to fix a number of page reference counting issues in drivers dealing with device private ZONE_DEVICE pages. These result in use-after-free type bugs, either from accessing a struct page which no longer exists because it has been removed or accessing fields within the struct page which are no longer valid because the page has been freed. During normal usage it is unlikely these will cause any problems. However without these fixes it is possible to crash the kernel from userspace. These crashes can be triggered either by unloading the kernel module or unbinding the device from the driver prior to a userspace task exiting. In modules such as Nouveau it is also possible to trigger some of these issues by explicitly closing the device file-descriptor prior to the task exiting and then accessing device private memory. This involves some minor changes to both PowerPC and AMD GPU code. Unfortunately I lack hardware to test either of those so any help there would be appreciated. The changes mimic what is done in for both Nouveau and hmm-tests though so I doubt they will cause problems. This patch (of 8): When the CPU tries to access a device private page the migrate_to_ram() callback associated with the pgmap for the page is called. However no reference is taken on the faulting page. Therefore a concurrent migration of the device private page can free the page and possibly the underlying pgmap. This results in a race which can crash the kernel due to the migrate_to_ram() function pointer becoming invalid. It also means drivers can't reliably read the zone_device_data field because the page may have been freed with memunmap_pages(). Close the race by getting a reference on the page while holding the ptl to ensure it has not been freed. Unfortunately the elevated reference count will cause the migration required to handle the fault to fail. To avoid this failure pass the faulting page into the migrate_vma functions so that if an elevated reference count is found it can be checked to see if it's expected or not. [[email protected]: fix build] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/cover.60659b549d8509ddecafad4f498ee7f03bb23c69.1664366292.git-series.apopple@nvidia.com Link: https://lkml.kernel.org/r/d3e813178a59e565e8d78d9b9a4e2562f6494f90.1664366292.git-series.apopple@nvidia.com Signed-off-by: Alistair Popple <[email protected]> Acked-by: Felix Kuehling <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: John Hubbard <[email protected]> Cc: Ralph Campbell <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Lyude Paul <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Alex Sierra <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Christian König <[email protected]> Cc: Dan Williams <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-26  mm/demotion: build demotion targets based on explicit memory tiers  (Aneesh Kumar K.V; 1 file, -13/+0)
This patch switches the demotion target building logic to use memory tiers instead of NUMA distance. All N_MEMORY NUMA nodes will be placed in the default memory tier, and additional memory tiers will be added by drivers like dax kmem. This patch builds the demotion target for a NUMA node by looking at all memory tiers below the tier to which the NUMA node belongs. The closest node in the immediately following memory tier is used as a demotion target. Since we are now only building demotion targets for N_MEMORY NUMA nodes, the CPU hotplug calls are removed in this patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Acked-by: Wei Xu <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Hesham Almatary <[email protected]> Cc: Jagdish Gediya <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yang Shi <[email protected]> Cc: SeongJae Park <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-09-26  mm/demotion: move memory demotion related code  (Aneesh Kumar K.V; 1 file, -2/+0)
This moves memory demotion related code to mm/memory-tiers.c. No functional change in this patch. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Aneesh Kumar K.V <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Acked-by: Wei Xu <[email protected]> Cc: Alistair Popple <[email protected]> Cc: Bharata B Rao <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Davidlohr Bueso <[email protected]> Cc: Hesham Almatary <[email protected]> Cc: Jagdish Gediya <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yang Shi <[email protected]> Cc: SeongJae Park <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-08-05  Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm  (Linus Torvalds; 1 file, -0/+1)
Pull MM updates from Andrew Morton: "Most of the MM queue. A few things are still pending. Liam's maple tree rework didn't make it. This has resulted in a few other minor patch series being held over for next time. Multi-gen LRU still isn't merged as we were waiting for mapletree to stabilize. The current plan is to merge MGLRU into -mm soon and to later reintroduce mapletree, with a view to hopefully getting both into 6.1-rc1. Summary: - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe Lin, Yang Shi, Anshuman Khandual and Mike Rapoport - Some kmemleak fixes from Patrick Wang and Waiman Long - DAMON updates from SeongJae Park - memcg debug/visibility work from Roman Gushchin - vmalloc speedup from Uladzislau Rezki - more folio conversion work from Matthew Wilcox - enhancements for coherent device memory mapping from Alex Sierra - addition of shared pages tracking and CoW support for fsdax, from Shiyang Ruan - hugetlb optimizations from Mike Kravetz - Mel Gorman has contributed some pagealloc changes to improve latency and realtime behaviour. - mprotect soft-dirty checking has been improved by Peter Xu - Many other singleton patches all over the place" [ XFS merge from hell as per Darrick Wong in https://lore.kernel.org/all/YshKnxb4VwXycPO8@magnolia/ ] * tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (282 commits) tools/testing/selftests/vm/hmm-tests.c: fix build mm: Kconfig: fix typo mm: memory-failure: convert to pr_fmt() mm: use is_zone_movable_page() helper hugetlbfs: fix inaccurate comment in hugetlbfs_statfs() hugetlbfs: cleanup some comments in inode.c hugetlbfs: remove unneeded header file hugetlbfs: remove unneeded hugetlbfs_ops forward declaration hugetlbfs: use helper macro SZ_1{K,M} mm: cleanup is_highmem() mm/hmm: add a test for cross device private faults selftests: add soft-dirty into run_vmtests.sh selftests: soft-dirty: add test for mprotect mm/mprotect: fix soft-dirty check in can_change_pte_writable() mm: memcontrol: fix potential oom_lock recursion deadlock mm/gup.c: fix formatting in check_and_migrate_movable_page() xfs: fail dax mount if reflink is enabled on a partition mm/memcontrol.c: remove the redundant updating of stats_flush_threshold userfaultfd: don't fail on unrecognized features hugetlb_cgroup: fix wrong hugetlb cgroup numa stat ...
2022-08-02  mm/folio-compat: Remove migration compatibility functions  (Matthew Wilcox (Oracle); 1 file, -11/+0)
migrate_page_move_mapping(), migrate_page_copy() and migrate_page_states() are all now unused after converting all the filesystems from aops->migratepage() to aops->migrate_folio(). Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]>
2022-08-02  hugetlb: Convert to migrate_folio  (Matthew Wilcox (Oracle); 1 file, -3/+3)
This involves converting migrate_huge_page_move_mapping(). We also need a folio variant of hugetlb_set_page_subpool(), but that's for a later patch. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Acked-by: Muchun Song <[email protected]> Reviewed-by: Mike Kravetz <[email protected]>
2022-08-02  mm/migrate: Convert migrate_page() to migrate_folio()  (Matthew Wilcox (Oracle); 1 file, -3/+2)
Convert all callers to pass a folio. Most have the folio already available. Switch all users from aops->migratepage to aops->migrate_folio. Also turn the documentation into kerneldoc. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Acked-by: David Sterba <[email protected]>
2022-08-02  mm: Convert all PageMovable users to movable_operations  (Matthew Wilcox (Oracle); 1 file, -5/+51)
These drivers are rather uncomfortably hammered into the address_space_operations hole. They aren't filesystems and don't behave like filesystems. They just need their own movable_operations structure, which we can point to directly from page->mapping. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
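The ops structure introduced by this conversion (per the migrate.h of this era; the registration call is shown as a sketch):

	struct movable_operations {
		bool (*isolate_page)(struct page *, isolate_mode_t);
		int (*migrate_page)(struct page *dst, struct page *src,
				    enum migrate_mode);
		void (*putback_page)(struct page *);
	};

	/* A driver marks its pages movable by attaching its ops: */
	__SetPageMovable(page, &my_movable_ops);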
2022-07-17  mm: add device coherent vma selection for memory migration  (Alex Sierra; 1 file, -0/+1)
This case is used to migrate pages from device memory back to system memory. Device coherent type memory is cache coherent from both the device and CPU points of view. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alex Sierra <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Acked-by: Felix Kuehling <[email protected]> Reviewed-by: Alistair Popple <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-05-13  mm: add folio_test_movable()  (Matthew Wilcox (Oracle); 1 file, -0/+5)
This is the folio equivalent of PageMovable() which is needed to convert mm/migrate.c to folios. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
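Assuming it is a thin wrapper like the other folio test helpers, the addition amounts to:

	static inline bool folio_test_movable(struct folio *folio)
	{
		return PageMovable(&folio->page);
	}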
2022-04-28  mm: untangle config dependencies for demote-on-reclaim  (Oscar Salvador; 1 file, -19/+15)
At the time demote-on-reclaim was introduced, it was tied to CONFIG_HOTPLUG_CPU + CONFIG_MIGRATION, but that is not really accurate. The only two things we need to depend on are CONFIG_NUMA + CONFIG_MIGRATION, so clean this up. Furthermore, we only register the hotplug memory notifier when the system has CONFIG_MEMORY_HOTPLUG. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Oscar Salvador <[email protected]> Suggested-by: "Huang, Ying" <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Abhishek Goel <[email protected]> Cc: Baolin Wang <[email protected]> Signed-off-by: Andrew Morton <[email protected]>
2022-03-22  mm: only re-generate demotion targets when a numa node changes its N_CPU state  (Oscar Salvador; 1 file, -0/+8)
Abhishek reported that after patch [1], hotplug operations are taking roughly double the expected time. [2] The reason is that the CPU callbacks that migrate_on_reclaim_init() sets always call set_migration_target_nodes() whenever a CPU is brought up/down. But we only care about numa nodes going from having cpus to becoming cpuless, and vice versa, as that influences the demotion_target order. We already have two CPU callbacks (vmstat_cpu_online() and vmstat_cpu_dead()) that check exactly that, so get rid of the CPU callbacks in migrate_on_reclaim_init() and only call set_migration_target_nodes() from vmstat_cpu_{dead,online}() whenever a numa node changes its N_CPU state. [1] https://lore.kernel.org/linux-mm/[email protected]/ [2] https://lore.kernel.org/linux-mm/[email protected]/ [[email protected]: add feedback from Huang Ying] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: 884a6e5d1f93b ("mm/migrate: update node demotion order on hotplug events") Signed-off-by: Oscar Salvador <[email protected]> Reviewed-by: Baolin Wang <[email protected]> Tested-by: Baolin Wang <[email protected]> Reported-by: Abhishek Goel <[email protected]> Cc: Dave Hansen <[email protected]> Cc: "Huang, Ying" <[email protected]> Cc: Abhishek Goel <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2022-01-22  mm/migrate.c: rework migration_entry_wait() to not take a pageref  (Alistair Popple; 1 file, -0/+2)
This fixes the FIXME in migrate_vma_check_page(). Before migrating a page, migration code will take a reference and check there are no unexpected page references, failing the migration if there are. When a thread faults on a migration entry it will take a temporary reference to the page to wait for the page to become unlocked, signifying the migration entry has been removed. This reference is dropped just prior to waiting on the page lock, however the extra reference can cause migration failures, so it is desirable to avoid taking it. As migration code already has a reference to the migrating page, an extra reference to wait on PG_locked is unnecessary so long as the reference can't be dropped whilst setting up the wait. When faulting on a migration entry, the ptl is taken to check the migration entry. Removing a migration entry also requires the ptl, and migration code won't drop its page reference until after the migration entry has been removed. Therefore retaining the ptl of a migration entry is sufficient to ensure the page has a reference. Reworking migration_entry_wait() to hold the ptl until the wait setup is complete means the extra page reference is no longer needed. [[email protected]: v5] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alistair Popple <[email protected]> Acked-by: David Hildenbrand <[email protected]> Cc: David Howells <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Matthew Wilcox (Oracle) <[email protected]> Cc: Ralph Campbell <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-11-11  mm/migrate.c: remove MIGRATE_PFN_LOCKED  (Alistair Popple; 1 file, -1/+0)
MIGRATE_PFN_LOCKED is used to indicate to migrate_vma_prepare() that a source page was already locked during migrate_vma_collect(). If it wasn't, a second attempt is made to lock the page. However, if the first attempt failed it's unlikely a second attempt will succeed, and the retry adds complexity. So clean this up by removing the retry and the MIGRATE_PFN_LOCKED flag. Destination pages are also meant to have the MIGRATE_PFN_LOCKED flag set, but nothing actually checks that. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Alistair Popple <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Acked-by: Felix Kuehling <[email protected]> Cc: Alex Deucher <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: John Hubbard <[email protected]> Cc: Zi Yan <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Ben Skeggs <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-11-06  Merge branch 'akpm' (patches from Andrew)  (Linus Torvalds; 1 file, -18/+5)
Merge misc updates from Andrew Morton: "257 patches. Subsystems affected by this patch series: scripts, ocfs2, vfs, and mm (slab-generic, slab, slub, kconfig, dax, kasan, debug, pagecache, gup, swap, memcg, pagemap, mprotect, mremap, iomap, tracing, vmalloc, pagealloc, memory-failure, hugetlb, userfaultfd, vmscan, tools, memblock, oom-kill, hugetlbfs, migration, thp, readahead, nommu, ksm, vmstat, madvise, memory-hotplug, rmap, zsmalloc, highmem, zram, cleanups, kfence, and damon)" * emailed patches from Andrew Morton <[email protected]>: (257 commits) mm/damon: remove return value from before_terminate callback mm/damon: fix a few spelling mistakes in comments and a pr_debug message mm/damon: simplify stop mechanism Docs/admin-guide/mm/pagemap: wordsmith page flags descriptions Docs/admin-guide/mm/damon/start: simplify the content Docs/admin-guide/mm/damon/start: fix a wrong link Docs/admin-guide/mm/damon/start: fix wrong example commands mm/damon/dbgfs: add adaptive_targets list check before enable monitor_on mm/damon: remove unnecessary variable initialization Documentation/admin-guide/mm/damon: add a document for DAMON_RECLAIM mm/damon: introduce DAMON-based Reclamation (DAMON_RECLAIM) selftests/damon: support watermarks mm/damon/dbgfs: support watermarks mm/damon/schemes: activate schemes based on a watermarks mechanism tools/selftests/damon: update for regions prioritization of schemes mm/damon/dbgfs: support prioritization weights mm/damon/vaddr,paddr: support pageout prioritization mm/damon/schemes: prioritize regions within the quotas mm/damon/selftests: support schemes quotas mm/damon/dbgfs: support quotas of schemes ...
2021-11-06  mm: migrate: make demotion knob depend on migration  (Yang Shi; 1 file, -0/+4)
Memory demotion needs to call migrate_pages() to do the job. It is controlled by a knob, but the knob doesn't depend on CONFIG_MIGRATION. The knob could be turned on even though MIGRATION is disabled. This will not cause any crash, since migrate_pages() would just return -ENOSYS, but it is definitely not optimal to go through the demotion path and then retry regular swap every time. And it doesn't make much sense to have the knob visible to users when !MIGRATION. Move the related code from mempolicy.[h|c] to migrate.[h|c]. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Shi <[email protected]> Acked-by: "Huang, Ying" <[email protected]> Cc: Dave Hansen <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
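The resulting guard in migrate.h plausibly looks like this sketch, so that !CONFIG_MIGRATION builds see the knob as permanently off:

	#ifdef CONFIG_MIGRATION
	extern bool numa_demotion_enabled;
	#else
	#define numa_demotion_enabled	false
	#endif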
2021-11-06  mm/migrate: de-duplicate migrate_reason strings  (John Hubbard; 1 file, -18/+1)
In order to remove the need to manually keep three different files in sync, provide a common definition of the mapping between enum migrate_reason and the associated strings for each enum item. 1. Use the tracing system's mapping of enums to strings, by redefining and reusing the MIGRATE_REASON and supporting macros, and using that to populate the string array in mm/debug.c. 2. Move enum migrate_reason to migrate_mode.h. This is not strictly necessary for this patch, but migrate mode and migrate reason go together, so this will slightly clarify things. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: John Hubbard <[email protected]> Reviewed-by: Weizhao Ouyang <[email protected]> Cc: "Huang, Ying" <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
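The single-definition trick is the usual EM()/EMe() X-macro from the trace headers; a condensed sketch (entries abbreviated):

	#define MIGRATE_REASON						\
		EM(MR_COMPACTION,	"compaction")			\
		EM(MR_MEMORY_FAILURE,	"memory_failure")		\
		/* ... one EM() per enum migrate_reason value ... */	\
		EMe(MR_DEMOTION,	"demotion")

	/* mm/debug.c then expands the same list into the string array: */
	#undef EM
	#undef EMe
	#define EM(a, b)	b,
	#define EMe(a, b)	b

	const char *migrate_reason_names[MR_TYPES] = {
		MIGRATE_REASON
	};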
2021-10-18  mm/migrate: Add folio_migrate_copy()  (Matthew Wilcox (Oracle); 1 file, -0/+1)
This is the folio equivalent of migrate_page_copy(), which is retained as a wrapper for filesystems which are not yet converted to folios. Also convert copy_huge_page() to folio_copy(). Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Zi Yan <[email protected]> Acked-by: Vlastimil Babka <[email protected]>
2021-10-18  mm/migrate: Add folio_migrate_flags()  (Matthew Wilcox (Oracle); 1 file, -0/+1)
Turn migrate_page_states() into a wrapper around folio_migrate_flags(). Also convert two functions only called from folio_migrate_flags() to be folio-based. ksm_migrate_page() becomes folio_migrate_ksm() and copy_page_owner() becomes folio_copy_owner(). folio_migrate_flags() alone shrinks by two thirds -- 1967 bytes down to 642 bytes. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Zi Yan <[email protected]> Reviewed-by: David Howells <[email protected]> Acked-by: Vlastimil Babka <[email protected]>
2021-10-18  mm/migrate: Add folio_migrate_mapping()  (Matthew Wilcox (Oracle); 1 file, -0/+2)
Reimplement migrate_page_move_mapping() as a wrapper around folio_migrate_mapping(). Saves 193 bytes of kernel text. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Christoph Hellwig <[email protected]> Reviewed-by: David Howells <[email protected]> Acked-by: Vlastimil Babka <[email protected]>
2021-09-24  mm/debug: sync up latest migrate_reason to migrate_reason_names  (Weizhao Ouyang; 1 file, -1/+5)
Sync up MR_DEMOTION to migrate_reason_names and add a comment prompting future changes to keep the two in sync. Link: https://lkml.kernel.org/r/[email protected] Fixes: 26aa2d199d6f ("mm/migrate: demote pages during reclaim") Signed-off-by: Weizhao Ouyang <[email protected]> Reviewed-by: "Huang, Ying" <[email protected]> Reviewed-by: John Hubbard <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Yang Shi <[email protected]> Cc: Zi Yan <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Minchan Kim <[email protected]> Cc: Mina Almasry <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Wei Xu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-09-03  mm/migrate: demote pages during reclaim  (Dave Hansen; 1 file, -0/+9)
This is mostly derived from a patch from Yang Shi: https://lore.kernel.org/linux-mm/[email protected]/ Add code to the reclaim path (shrink_page_list()) to "demote" data to another NUMA node instead of discarding the data. This always avoids the cost of I/O needed to read the page back in, and sometimes avoids the writeout cost when the page is dirty. A second pass through shrink_page_list() will be made if any demotions fail. This essentially falls back to normal reclaim behavior in the case that demotions fail. Previous versions of this patch may have simply failed to reclaim pages which were eligible for demotion but could not be demoted in practice. In some cases, for example MADV_PAGEOUT, the pages are always discarded instead of demoted, to follow the kernel API definition: MADV_PAGEOUT is defined as freeing the specified pages regardless of which tier they are in. Note: This just adds the start of the infrastructure for migration. It is actually disabled next to the FIXME in migrate_demote_page_ok(). [[email protected]: v11] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: "Huang, Ying" <[email protected]> Reviewed-by: Yang Shi <[email protected]> Reviewed-by: Wei Xu <[email protected]> Reviewed-by: Oscar Salvador <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Michal Hocko <[email protected]> Cc: David Rientjes <[email protected]> Cc: Dan Williams <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Keith Busch <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
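A sketch of the demotion hook added to the reclaim path; the helper names are from the patch, the body is condensed:

	/* In shrink_page_list(): demote instead of discarding when the
	 * page's node has a lower memory tier available to receive it. */
	if (migrate_demote_page_ok(page)) {
		list_add(&page->lru, &demote_pages);
		continue;
	}
	/* demote_page_list() later migrates the collected pages to
	 * next_demotion_node(page_to_nid(page)) with reason MR_DEMOTION. */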
2021-09-03  mm/migrate: enable returning precise migrate_pages() success count  (Yang Shi; 1 file, -2/+3)
Under normal circumstances, migrate_pages() returns the number of pages migrated. In error conditions, it returns an error code. When returning an error code, there is no way to know how many pages were migrated or not migrated. Make migrate_pages() return how many pages are demoted successfully for all cases, including when encountering errors. Page reclaim behavior will depend on this in subsequent patches. Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Shi <[email protected]> Signed-off-by: Dave Hansen <[email protected]> Signed-off-by: "Huang, Ying" <[email protected]> Suggested-by: Oscar Salvador <[email protected]> [optional parameter] Reviewed-by: Yang Shi <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Wei Xu <[email protected]> Cc: Dan Williams <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Rientjes <[email protected]> Cc: Greg Thelen <[email protected]> Cc: Keith Busch <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
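The interface change amounts to an optional output parameter; a sketch of the post-patch prototype and an illustrative caller (the alloc_demote_page/target_nid names are from the demotion series, shown here as an example):

	int migrate_pages(struct list_head *from, new_page_t get_new_page,
			  free_page_t put_new_page, unsigned long private,
			  enum migrate_mode mode, int reason,
			  unsigned int *ret_succeeded);

	unsigned int nr_succeeded = 0;
	/* Callers that don't care pass NULL for ret_succeeded. */
	err = migrate_pages(&demote_pages, alloc_demote_page, NULL,
			    target_nid, MIGRATE_ASYNC, MR_DEMOTION,
			    &nr_succeeded);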
2021-07-12  mm: Make copy_huge_page() always available  (Matthew Wilcox (Oracle); 1 file, -5/+0)
Rewrite copy_huge_page() and move it into mm/util.c so it's always available. Fixes an exposure of uninitialised memory on configurations with HUGETLB and UFFD enabled and MIGRATION disabled. Fixes: 8cc5fcbb5be8 ("mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY") Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Reviewed-by: Mike Kravetz <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-30  mm: thp: refactor NUMA fault handling  (Yang Shi; 1 file, -23/+0)
When THP NUMA fault support was added, THP migration was not supported yet, so ad hoc THP migration was implemented in the NUMA fault handling. THP migration has been supported since v4.14, so it doesn't make much sense to keep a separate THP migration implementation rather than using the generic migration code. This patch reworks the NUMA fault handling to use the generic migration implementation to migrate the misplaced page. There is no functional change. After the refactor, the flow of NUMA fault handling looks just like its PTE counterpart: acquire ptl; prepare for migration (elevate page refcount); release ptl; isolate page from LRU and elevate page refcount; migrate the misplaced THP. If migration fails, just restore the old normal PMD. In the old code the anon_vma lock was needed to serialize THP migration against THP split, but the THP code has been reworked a lot since then and the anon_vma lock no longer seems to be required to avoid the race. The page refcount elevation while holding the ptl should prevent THP split. Use migrate_misplaced_page() for both base page and THP NUMA hinting faults and remove all the dead and duplicate code. [[email protected]: fix a double unlock bug] Link: https://lkml.kernel.org/r/YLX8uYN01JmfLnlK@mwanda Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Shi <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Christian Borntraeger <[email protected]> Cc: Gerald Schaefer <[email protected]> Cc: Heiko Carstens <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Vasily Gorbik <[email protected]> Cc: Zi Yan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-06-30  mm, hugetlb: fix racy resv_huge_pages underflow on UFFDIO_COPY  (Mina Almasry; 1 file, -0/+4)
On UFFDIO_COPY, if we fail to copy the page contents while holding the hugetlb_fault_mutex, we will drop the mutex and return to the caller after allocating a page that consumed a reservation. In this case there may be a fault that double consumes the reservation. To handle this, we free the allocated page, fix the reservations, and allocate a temporary hugetlb page and return that to the caller. When the caller does the copy outside of the lock, we again check the cache, and allocate a page consuming the reservation, and copy over the contents. Test: Hacked the code locally such that resv_huge_pages underflows produce a warning and the copy_huge_page_from_user() always fails, then: ./tools/testing/selftests/vm/userfaultfd hugetlb_shared 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success ./tools/testing/selftests/vm/userfaultfd hugetlb 10 2 /tmp/kokonut_test/huge/userfaultfd_test && echo test success Both tests succeed and produce no warnings. After the test runs number of free/resv hugepages is correct. [[email protected]: remove set but not used variable 'vm_alloc_shared'] Link: https://lkml.kernel.org/r/[email protected] [[email protected]: fix allocation error check and copy func name] Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Mina Almasry <[email protected]> Signed-off-by: YueHaibing <[email protected]> Cc: Axel Rasmussen <[email protected]> Cc: Peter Xu <[email protected]> Cc: Mike Kravetz <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-05  mm/gup: migrate pinned pages out of movable zone  (Pavel Tatashin; 1 file, -0/+1)
We should not pin pages in ZONE_MOVABLE. Currently, movable CMA pages are the only pages we avoid pinning. Generalize the function that migrates CMA pages to migrate all movable pages. Use is_pinnable_page() to check which pages need to be migrated. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Pavel Tatashin <[email protected]> Reviewed-by: John Hubbard <[email protected]> Cc: Dan Williams <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: David Rientjes <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: James Morris <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Oscar Salvador <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Sasha Levin <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Tyler Hicks <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-05  mm/migrate.c: make putback_movable_page() static  (Miaohe Lin; 1 file, -1/+0)
Patch series "Cleanup and fixup for mm/migrate.c", v3. This series contains cleanups to remove unnecessary VM_BUG_ON_PAGE and rc != MIGRATEPAGE_SUCCESS check. Also use helper function to remove some duplicated codes. What's more, this fixes potential deadlock in NUMA balancing shared exec THP case and so on. More details can be found in the respective changelogs. This patch (of 5): The putback_movable_page() is just called by putback_movable_pages() and we know the page is locked and both PageMovable() and PageIsolated() is checked right before calling putback_movable_page(). So we make it static and remove all the 3 VM_BUG_ON_PAGE(). Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Miaohe Lin <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Reviewed-by: Yang Shi <[email protected]> Cc: Jerome Glisse <[email protected]> Cc: Rafael Aquini <[email protected]> Cc: Alistair Popple <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-05  mm: replace migrate_[prep|finish] with lru_cache_[disable|enable]  (Minchan Kim; 1 file, -7/+0)
Currently, migrate_[prep|finish] is merely a wrapper of lru_cache_[disable|enable]. There is not much to gain from the additional abstraction. Use lru_cache_[disable|enable] instead of migrate_[prep|finish], which is more descriptive. Note: migrate_prep_local in compaction.c is changed to lru_add_drain to keep the old behavior while avoiding the scheduling cost of involving many other CPUs. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Minchan Kim <[email protected]> Acked-by: Michal Hocko <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Cc: Chris Goldsworthy <[email protected]> Cc: John Dias <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Oliver Sang <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2021-05-05  mm: disable LRU pagevec during the migration temporarily  (Minchan Kim; 1 file, -0/+2)
LRU pagevec holds refcounts of pages until the pagevec is drained. It can prevent migration since the refcount of the page is greater than the expectation in migration logic. To mitigate the issue, callers of migrate_pages() drain the LRU pagevec via migrate_prep() or lru_add_drain_all() before calling migrate_pages(). However, that's not enough, because pages coming into the pagevec after the draining call can still stay in the pagevec, continuing to prevent page migration. Since some callers of migrate_pages() have retry logic with LRU draining, the page would migrate on the next trial, but this is still fragile: it doesn't close the fundamental race between upcoming LRU pages entering the pagevec and migration, so migration failure can ultimately cause contiguous memory allocation failure. To close the race, this patch disables LRU caches (i.e., the pagevec) while migration is ongoing, until migration is done. Since it's really hard to reproduce, I measured how many times migrate_pages() retried with force mode (i.e., falling back to sync migration) with the debug code below. int migrate_pages(struct list_head *from, new_page_t get_new_page, .. .. if (rc && reason == MR_CONTIG_RANGE && pass > 2) { printk(KERN_ERR "pfn 0x%lx reason %d", page_to_pfn(page), rc); dump_page(page, "fail to migrate"); } The test repeatedly launched Android apps with CMA allocation in the background every five seconds. The total CMA allocation count was about 500 during the testing. With this patch, the dump_page count was reduced from 400 to 30. The new interface is also useful for memory hotplug, which currently drains LRU pcp caches after each migration failure. This is rather suboptimal as it has to disrupt others running during the operation. With the new interface the operation happens only once. This is also in line with the pcp allocator caches, which are disabled for offlining as well. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Minchan Kim <[email protected]> Reviewed-by: Chris Goldsworthy <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: John Dias <[email protected]> Cc: Suren Baghdasaryan <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Oliver Sang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
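Usage sketch of the new interface (the migration arguments are illustrative for this era of migrate_pages(); mtc stands for a struct migration_target_control set up elsewhere):

	lru_cache_disable();	/* drain pagevecs and keep them disabled */
	ret = migrate_pages(&pagelist, alloc_migration_target, NULL,
			    (unsigned long)&mtc, MIGRATE_SYNC,
			    MR_CONTIG_RANGE);
	lru_cache_enable();	/* re-enable per-CPU LRU caches */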
2021-02-24  mm/migrate: remove unneeded semicolons  (Chengyang Fan; 1 file, -1/+1)
Remove superfluous semicolons after function definitions. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Chengyang Fan <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-12-15  mm: migrate: clean up migrate_prep{_local}  (Yang Shi; 1 file, -2/+2)
migrate_prep{_local}() never fails, so it is pointless to have a return value and to check it. Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Yang Shi <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Jan Kara <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Song Liu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12  mm/migrate: introduce a standard migration target allocation function  (Joonsoo Kim; 1 file, -4/+5)
There are some similar functions for migration target allocation. Since there is no fundamental difference, it's better to keep just one rather than keeping all variants. This patch implements the base migration target allocation function. In the following patches, the variants will be converted to use this function. Changes should be mechanical, but, unfortunately, there are some differences. First, some callers' nodemask is assigned to NULL, since a NULL nodemask will be considered as all available nodes, that is, &node_states[N_MEMORY]. Second, for hugetlb page allocation, gfp_mask is redefined as the regular hugetlb allocation gfp_mask plus __GFP_THISNODE if the user-provided gfp_mask has it. This is because a future caller of this function requires this node constraint to be set. Lastly, if the provided nodeid is NUMA_NO_NODE, nodeid is set to the node where the migration source lives. It helps to remove simple wrappers for setting up the nodeid. Note that the PageHighmem() call in the previous function is changed to open-coded "is_highmem_idx()" since it provides more readability. [[email protected]: tweak patch title, per Vlastimil] [[email protected]: fix typo in comment] Signed-off-by: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Roman Gushchin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-08-12  mm/migrate: move migration helper from .h to .c  (Joonsoo Kim; 1 file, -28/+5)
It's not a performance-sensitive function, so move it to .c. This is a preparation step for a future change. Signed-off-by: Joonsoo Kim <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Vlastimil Babka <[email protected]> Acked-by: Mike Kravetz <[email protected]> Acked-by: Michal Hocko <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Roman Gushchin <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-07-28  mm/notifier: add migration invalidation type  (Ralph Campbell; 1 file, -0/+3)
Currently migrate_vma_setup() calls mmu_notifier_invalidate_range_start() which flushes all device private page mappings whether or not a page is being migrated to/from device private memory. In order to not disrupt device mappings that are not being migrated, shift the responsibility for clearing device private mappings to the device driver and leave CPU page table unmapping handled by migrate_vma_setup(). To support this, the caller of migrate_vma_setup() should always set struct migrate_vma::pgmap_owner to a non NULL value that matches the device private page->pgmap->owner. This value is then passed to the struct mmu_notifier_range with a new event type which the driver's invalidation function can use to avoid device MMU invalidations. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2020-07-28  mm/migrate: add a flags parameter to migrate_vma  (Ralph Campbell; 1 file, -4/+9)
The src_owner field in struct migrate_vma is being used for two purposes, it acts as a selection filter for which types of pages are to be migrated and it identifies device private pages owned by the caller. Split this into separate parameters so the src_owner field can be used just to identify device private pages owned by the caller of migrate_vma_setup(). Rename the src_owner field to pgmap_owner to reflect it is now used only to identify which device private pages to migrate. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Ralph Campbell <[email protected]> Reviewed-by: Bharata B Rao <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
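After the split, a driver selecting only its own device-private pages fills the struct roughly like this (values illustrative; my_driver is a hypothetical owner token):

	struct migrate_vma args = {
		.vma		= vma,
		.start		= start,
		.end		= end,
		.src		= src_pfns,
		.dst		= dst_pfns,
		.pgmap_owner	= my_driver,	/* matches page->pgmap->owner */
		.flags		= MIGRATE_VMA_SELECT_DEVICE_PRIVATE,
	};

	ret = migrate_vma_setup(&args);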
2020-03-26  mm: handle multiple owners of device private pages in migrate_vma  (Christoph Hellwig; 1 file, -0/+8)
Add a new src_owner field to struct migrate_vma. If the field is set, only device private pages with page->pgmap->owner equal to that field are migrated. If the field is not set only "normal" pages are migrated. Fixes: df6ad69838fc ("mm/device-public-memory: device memory cache coherent with CPU") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Tested-by: Bharata B Rao <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-20  mm: remove the unused MIGRATE_PFN_DEVICE flag  (Christoph Hellwig; 1 file, -1/+0)
No one ever checks this flag, and we could easily get that information from the page if needed. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Reviewed-by: Jason Gunthorpe <[email protected]> Tested-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-20  mm: remove the unused MIGRATE_PFN_ERROR flag  (Christoph Hellwig; 1 file, -1/+0)
Now that we can rely on errors in the normal control flow, there is no need for this flag; remove it. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Tested-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
2019-08-20  mm: turn migrate_vma upside down  (Christoph Hellwig; 1 file, -99/+19)
There isn't any good reason to pass callbacks to migrate_vma. Instead we can just export the three steps done by this function to drivers and let them sequence the operation without callbacks. This removes a lot of boilerplate code as-is, and will allow the drivers to drastically improve code flow and error handling further on. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Ralph Campbell <[email protected]> Tested-by: Ralph Campbell <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]>
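From a driver's point of view the exported three-step sequence looks like this (simplified sketch):

	struct migrate_vma migrate = {
		.vma	= vma,
		.start	= start,
		.end	= end,
		.src	= src_pfns,
		.dst	= dst_pfns,
	};

	if (migrate_vma_setup(&migrate))	/* 1: collect and unmap sources */
		return -EINVAL;

	/* 2: the driver allocates destination pages and copies data for
	 * each src entry that has MIGRATE_PFN_MIGRATE set */

	migrate_vma_pages(&migrate);		/* 3a: remap to the new pages */
	migrate_vma_finalize(&migrate);		/* 3b: unlock and put pages */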
2019-07-18  mm: migrate: remove unused mode argument  (Keith Busch; 1 file, -2/+1)
migrate_page_move_mapping() doesn't use the mode argument. Remove it and update callers accordingly. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Keith Busch <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Mel Gorman <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-12-28  mm: migrate: drop unused argument of migrate_page_move_mapping()  (Jan Kara; 1 file, -2/+1)
All callers of migrate_page_move_mapping() now pass NULL for 'head' argument. Drop it. Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Jan Kara <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Michal Hocko <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-12-28  mm/debug.c: make "migrate_reason_names[]" const char *  (Alexey Dobriyan; 1 file, -1/+1)
Those strings are immutable as well. Link: http://lkml.kernel.org/r/20181124090508.GB10877@avx2 Signed-off-by: Alexey Dobriyan <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Reviewed-by: David Hildenbrand <[email protected]> Acked-by: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-04-11  mm: unclutter THP migration  (Michal Hocko; 1 file, -2/+2)
THP migration is hacked into the generic migration with rather surprising semantics. The migration allocation callback is supposed to check whether the THP can be migrated at once and, if that is not the case, it allocates a simple page to migrate. unmap_and_move then fixes that up by splitting the THP into small pages while moving the head page to the newly allocated order-0 page. Remaining pages are moved to the LRU list by split_huge_page. The same happens if the THP allocation fails. This is really ugly and error prone [1]. I also believe that split_huge_page to the LRU lists is inherently wrong because all tail pages are not migrated. Some callers will just work around that by retrying (e.g. memory hotplug). There are other pfn walkers which are simply broken, though; e.g. madvise_inject_error will migrate the head and then advance the next pfn by the huge page size. do_move_page_to_node_array and queue_pages_range (migrate_pages, mbind) will simply split the THP before migration if THP migration is not supported, then fall back to single page migration; but if the THP migration path is not able to allocate a fresh THP, tail pages are not handled, so we end up with ENOMEM and fail the whole migration, which is questionable behavior. Page compaction doesn't try to migrate large pages, so it should be immune. This patch tries to unclutter the situation by moving the special THP handling up to the migrate_pages layer, where it actually belongs. We simply split the THP into the existing list if unmap_and_move fails with ENOMEM and retry. So we will _always_ migrate all THP subpages and specific migrate_pages users do not have to deal with this case in a special way. [1] http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Andrea Reale <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>
2018-04-11  mm, migrate: remove reason argument from new_page_t  (Michal Hocko; 1 file, -2/+1)
No allocation callback is using this argument anymore. new_page_node() used to use this parameter to convey the node_id (or the migration error) up to the move_pages code (do_move_page_to_node_array). The error status never made it into the final status field, and we now have a better way to communicate the node id to the status field. All other allocation callbacks simply ignored the argument, so we can finally drop it. [[email protected]: fix migration callback] Link: http://lkml.kernel.org/r/[email protected] [[email protected]: fix alloc_misplaced_dst_page()] [[email protected]: fix build] Link: http://lkml.kernel.org/r/[email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Michal Hocko <[email protected]> Reviewed-by: Zi Yan <[email protected]> Cc: Andrea Reale <[email protected]> Cc: Anshuman Khandual <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Naoya Horiguchi <[email protected]> Cc: Vlastimil Babka <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]>