aboutsummaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2020-10-13swap: rename SWP_FS to SWAP_FS_OPS to avoid ambiguityGao Xiang4-6/+6
SWP_FS is used to make swap_{read,write}page() go through the filesystem, and it's only used for swap files over NFS for now. Otherwise it will directly submit IO to blockdev according to swapfile extents reported by filesystems in advance. As Matthew pointed out [1], SWP_FS naming is somewhat confusing, so let's rename to SWP_FS_OPS. [1] https://lore.kernel.org/r/[email protected] Suggested-by: Matthew Wilcox <[email protected]> Signed-off-by: Gao Xiang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/gup: protect unpin_user_pages() against npages==-ERRNOJohn Hubbard1-0/+7
As suggested by Dan Carpenter, fortify unpin_user_pages() just a bit, against a typical caller mistake: check if the npages arg is really a -ERRNO value, which would blow up the unpinning loop: WARN and return. If this new WARN_ON() fires, then the system *might* be leaking pages (by leaving them pinned), but probably not. More likely, gup/pup returned a hard -ERRNO error to the caller, who erroneously passed it here. Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Souptick Joarder <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/gup: don't permit users to call get_user_pages with FOLL_LONGTERMBarry Song1-15/+22
gup prohibits users from calling get_user_pages() with FOLL_PIN. But it allows users to call get_user_pages() with FOLL_LONGTERM only. It seems insensible. Since FOLL_LONGTERM is a stricter case of FOLL_PIN, we should prohibit users from calling get_user_pages() with FOLL_LONGTERM while not with FOLL_PIN. mm/gup_benchmark.c used to be the only user who did this improperly. But it has been fixed by moving to use pin_user_pages(). [[email protected]: fix CONFIG_MMU=n build] Link: https://lkml.kernel.org/r/CA+G9fYuNS3k0DVT62twfV746pfNhCSrk5sVMcOcQ1PGGnEseyw@mail.gmail.com Signed-off-by: Barry Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Ira Weiny <[email protected]> Cc: John Hubbard <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Naresh Kamboju <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/gup_benchmark: use pin_user_pages for FOLL_LONGTERM flagBarry Song2-18/+19
According to Documentation/core-api/pin_user_pages.rst, FOLL_PIN is a prerequisite to FOLL_LONGTERM. Another way of saying that is, FOLL_LONGTERM is a specific case, more restrictive case of FOLL_PIN. Almost all kernel modules are using pin_user_pages() with FOLL_LONGTERM, mm/gup_benchmark.c seems to the only exception in which FOLL_PIN is not a prerequisite to FOLL_LONGTERM. Signed-off-by: Barry Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: John Hubbard <[email protected]> Cc: Jan Kara <[email protected]> Cc: Jérôme Glisse <[email protected]> Cc: "Matthew Wilcox (Oracle)" <[email protected]> Cc: Al Viro <[email protected]> Cc: Christoph Hellwig <[email protected]> Cc: Dan Williams <[email protected]> Cc: Dave Chinner <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Michal Hocko <[email protected]> Cc: Mike Kravetz <[email protected]> Cc: Shuah Khan <[email protected]> Cc: Vlastimil Babka <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/gup_benchmark: update the documentation in KconfigBarry Song1-2/+2
In the beginning, mm/gup_benchmark.c supported get_user_pages_fast() only, but right now, it supports the benchmarking of a couple of get_user_pages() related calls like: * get_user_pages_fast() * get_user_pages() * pin_user_pages_fast() * pin_user_pages() The documentation is confusing and needs update. Signed-off-by: Barry Song <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: John Hubbard <[email protected]> Cc: Keith Busch <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm, fadvise: improve the expensive remote LRU cache draining after FADV_DONTNEEDYafang Shao3-22/+49
Our users reported that there're some random latency spikes when their RT process is running. Finally we found that latency spike is caused by FADV_DONTNEED. Which may call lru_add_drain_all() to drain LRU cache on remote CPUs, and then waits the per-cpu work to complete. The wait time is uncertain, which may be tens millisecond. That behavior is unreasonable, because this process is bound to a specific CPU and the file is only accessed by itself, IOW, there should be no pagecache pages on a per-cpu pagevec of a remote CPU. That unreasonable behavior is partially caused by the wrong comparation of the number of invalidated pages and the number of the target. For example, if (count < (end_index - start_index + 1)) The count above is how many pages were invalidated in the local CPU, and (end_index - start_index + 1) is how many pages should be invalidated. The usage of (end_index - start_index + 1) is incorrect, because they are virtual addresses, which may not mapped to pages. Besides that, there may be holes between start and end. So we'd better check whether there are still pages on per-cpu pagevec after drain the local cpu, and then decide whether or not to call lru_add_drain_all(). After I applied it with a hotfix to our production environment, most of the lru_add_drain_all() can be avoided. Suggested-by: Mel Gorman <[email protected]> Signed-off-by: Yafang Shao <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Mel Gorman <[email protected]> Cc: Johannes Weiner <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/filemap: fix filemap_map_pages for THPMatthew Wilcox (Oracle)1-15/+15
We dereference page->mapping and page->index directly after calling find_subpage() and these fields are not valid for tail pages. While commit 4101196b19d7 ("mm: page cache: store only head pages in i_pages") introduced the call to find_subpage(), the problem existed prior to this; I'm going to suggest all the way back to when THPs first existed. The user-visible effects of this are almost negligible. To hit it, you have to mmap a tmpfs file at an unaligned address and then it's only a disabled optimisation causing page faults to happen more frequently than they otherwise would. Fix this by keeping both head and page pointers and checking the appropriate one. We could use page_mapping() and page_to_index(), but that's higher overhead. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Kirill A. Shutemov <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm: add find_lock_headMatthew Wilcox (Oracle)2-9/+32
Add a new FGP_HEAD flag which avoids calling find_subpage() and add a convenience wrapper for it. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/shmem: return head page from find_lock_entryMatthew Wilcox (Oracle)3-23/+30
Convert shmem_getpage_gfp() (the only remaining caller of find_lock_entry()) to cope with a head page being returned instead of the subpage for the index. [[email protected]: fix BUG()s] Link https://lore.kernel.org/linux-mm/[email protected]/ Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm: convert find_get_entry to return the head pageMatthew Wilcox (Oracle)2-7/+10
There are only four callers remaining of find_get_entry(). get_shadow_from_swap_cache() only wants to see shadow entries and doesn't care about which page is returned. Push the find_subpage() call into find_lock_entry(), find_get_incore_page() and pagecache_get_page(). [[email protected]: fix oops] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13i915: use find_lock_page instead of find_lock_entryMatthew Wilcox (Oracle)4-5/+5
i915 does not want to see value entries. Switch it to use find_lock_page() instead, and remove the export of find_lock_entry(). Move find_lock_entry() and find_get_entry() to mm/internal.h to discourage any future use. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13proc: optimise smaps for shmem entriesMatthew Wilcox (Oracle)1-7/+1
Avoid bumping the refcount on pages when we're only interested in the swap entries. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm: optimise madvise WILLNEEDMatthew Wilcox (Oracle)1-9/+12
Instead of calling find_get_entry() for every page index, use an XArray iterator to skip over NULL entries, and avoid calling get_page(), because we only want the swap entries. [[email protected]: fix LTP soft lockups] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Johannes Weiner <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Cc: Qian Cai <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm: use find_get_incore_page in memcontrolMatthew Wilcox (Oracle)1-22/+2
The current code does not protect against swapoff of the underlying swap device, so this is a bug fix as well as a worthwhile reduction in code complexity. Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Huang Ying <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Matthew Auld <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm: factor find_get_incore_page out of mincore_pageMatthew Wilcox (Oracle)3-26/+41
Patch series "Return head pages from find_*_entry", v2. This patch series started out as part of the THP patch set, but it has some nice effects along the way and it seems worth splitting it out and submitting separately. Currently find_get_entry() and find_lock_entry() return the page corresponding to the requested index, but the first thing most callers do is find the head page, which we just threw away. As part of auditing all the callers, I found some misuses of the APIs and some plain inefficiencies that I've fixed. The diffstat is unflattering, but I added more kernel-doc and a new wrapper. This patch (of 8); Provide this functionality from the swap cache. It's useful for more than just mincore(). Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Hugh Dickins <[email protected]> Cc: William Kucharski <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Alexey Dobriyan <[email protected]> Cc: Johannes Weiner <[email protected]> Cc: Chris Wilson <[email protected]> Cc: Matthew Auld <[email protected]> Cc: Huang Ying <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm, dump_page: rename head_mapcount() --> head_compound_mapcount()John Hubbard2-7/+7
Rename head_pincount() --> head_compound_pincount(). These names are more accurate (or less misleading) than the original ones. Signed-off-by: John Hubbard <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Qian Cai <[email protected]> Cc: Matthew Wilcox <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Kirill A. Shutemov <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: William Kucharski <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/debug.c: do not dereference i_ino blindlyMatthew Wilcox (Oracle)1-5/+7
__dump_page() checks i_dentry is fetchable and i_ino is earlier in the struct than i_ino, so it ought to work fine, but it's possible that struct randomisation has reordered i_ino after i_dentry and the pointer is just wild enough that i_dentry is fetchable and i_ino isn't. Also print the inode number if the dentry is invalid. Reported-by: Vlastimil Babka <[email protected]> Signed-off-by: Matthew Wilcox (Oracle) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: John Hubbard <[email protected]> Reviewed-by: Mike Rapoport <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: add a range mapping allocation attributeJoao Martins1-0/+64
Add a sysfs attribute which denotes a range from the dax region to be allocated. It's an write only @mapping sysfs attribute in the format of '<start>-<end>' to allocate a range. @start and @end use hexadecimal values and the @pgoff is implicitly ordered wrt to previous writes to @mapping sysfs e.g. a write of a range of length 1G the pgoff is 0..1G(-4K), a second write will use @pgoff for 1G+4K..<size>. This range mapping interface is useful for: 1) Application which want to implement its own allocation logic, and thus pick the desired ranges from dax_region. 2) For use cases like VMM fast restart[0] where after kexec we want to the same gpa<->phys mappings (as originally created before kexec). [0] https://static.sched.com/hosted_files/kvmforum2019/66/VMM-fast-restart_kvmforum2019.pdf Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643106970.4062302.10402616567780784722.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/160106119570.30709.4548889722645210610.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13dax/hmem: introduce dax_hmem.region_idle parameterJoao Martins1-1/+4
Introduce a new module parameter for dax_hmem which initializes all region devices as free, rather than allocating a pagemap for the region by default. All hmem devices created with dax_hmem.region_idle=1 will have full available size for creating dynamic dax devices. Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643106460.4062302.5868522341307530091.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/160106119033.30709.11249962152222193448.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: add an 'align' attributeDan Williams2-10/+101
Introduce a device align attribute. While doing so, rename the region align attribute to be more explicitly named as so, but keep it named as @align to retain the API for tools like daxctl. Changes on align may not always be valid, when say certain mappings were created with 2M and then we switch to 1G. So, we validate all ranges against the new value being attempted, post resizing. Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643105944.4062302.3131761052969132784.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/160106118486.30709.13012322227204800596.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: make align a per-device propertyJoao Martins3-26/+19
Introduce @align to struct dev_dax. When creating a new device, we still initialize to the default dax_region @align. Child devices belonging to a region may wish to keep a different alignment property instead of a global region-defined one. Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643105377.4062302.4159447829955683131.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/160106117957.30709.1142303024324655705.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: introduce 'mapping' devicesDan Williams2-2/+203
In support of interrogating the physical address layout of a device with dis-contiguous ranges, introduce a sysfs directory with 'start', 'end', and 'page_offset' attributes. The alternative is trying to parse /proc/iomem, and that file will not reflect the extent layout until the device is enabled. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Joao Martins <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643104819.4062302.13691281391423291589.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106117446.30709.2751020815463722537.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: add dis-contiguous resource supportDan Williams5-124/+323
Break the requirement that device-dax instances are physically contiguous. With this constraint removed it allows fragmented available capacity to be fully allocated. This capability is useful to mitigate the "noisy neighbor" problem with memory-side-cache management for virtual machines, or any other scenario where a platform address boundary also designates a performance boundary. For example a direct mapped memory side cache might rotate cache colors at 1GB boundaries. With dis-contiguous allocations a device-dax instance could be configured to contain only 1 cache color. It also satisfies Joao's use case (see link) for partitioning memory for exclusive guest access. It allows for a future potential mode where the host kernel need not allocate 'struct page' capacity up-front. Reported-by: Joao Martins <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lkml.kernel.org/r/159643104304.4062302.16561669534797528660.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106116875.30709.11456649969327399771.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/memremap_pages: support multiple ranges per invocationDan Williams10-110/+166
In support of device-dax growing the ability to front physically dis-contiguous ranges of memory, update devm_memremap_pages() to track multiple ranges with a single reference counter and devm instance. Convert all [devm_]memremap_pages() users to specify the number of ranges they are mapping in their 'struct dev_pagemap' instance. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: "Jérôme Glisse" <[email protected] Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: kernel test robot <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643103789.4062302.18426128170217903785.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106116293.30709.13350662794915396198.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/memremap_pages: convert to 'struct range'Dan Williams21-165/+195
The 'struct resource' in 'struct dev_pagemap' is only used for holding resource span information. The other fields, 'name', 'flags', 'desc', 'parent', 'sibling', and 'child' are all unused wasted space. This is in preparation for introducing a multi-range extension of devm_memremap_pages(). The bulk of this change is unwinding all the places internal to libnvdimm that used 'struct resource' unnecessarily, and replacing instances of 'struct dev_pagemap'.res with 'struct dev_pagemap'.range. P2PDMA had a minor usage of the resource flags field, but only to report failures with "%pR". That is replaced with an open coded print of the range. [[email protected]: mm/hmm/test: use after free in dmirror_allocate_chunk()] Link: https://lkml.kernel.org/r/20200926121402.GA7467@kadam Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Boris Ostrovsky <[email protected]> [xen] Cc: Paul Mackerras <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: David Airlie <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Juergen Gross <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Dave Hansen <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: kernel test robot <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643103173.4062302.768998885691711532.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106115761.30709.13539840236873663620.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: add resize supportDan Williams1-9/+152
Make the device-dax 'size' attribute writable to allow capacity to be split between multiple instances in a region. The intended consumers of this capability are users that want to split a scarce memory resource between device-dax and System-RAM access, or users that want to have multiple security domains for a large region. By default the hmem instance provider allocates an entire region to the first instance. The process of creating a new instance (assuming a region-id of 0) is find the region and trigger the 'create' attribute which yields an empty instance to configure. For example: cd /sys/bus/dax/devices echo dax0.0 > dax0.0/driver/unbind echo $new_size > dax0.0/size echo 1 > $(readlink -f dax0.0)../dax_region/create seed=$(cat $(readlink -f dax0.0)../dax_region/seed) echo $new_size > $seed/size echo dax0.0 > ../drivers/{device_dax,kmem}/bind echo dax0.1 > ../drivers/{device_dax,kmem}/bind Instances can be destroyed by: echo $device > $(readlink -f $device)../dax_region/delete Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643102625.4062302.7431838945566033852.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106115239.30709.9850106928133493138.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13drivers/base: make device_find_child_by_name() compatible with sysfs inputsDan Williams1-1/+1
Use sysfs_streq() in device_find_child_by_name() to allow it to use a sysfs input string that might contain a trailing newline. The other "device by name" interfaces, {bus,driver,class}_find_device_by_name(), already account for sysfs strings. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Greg Kroah-Hartman <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643102106.4062302.12229802117645312104.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106114576.30709.2960091665444712180.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: introduce 'seed' devicesDan Williams3-40/+272
Add a seed device concept for dynamic dax regions to be able to split the region amongst multiple sub-instances. The seed device, similar to libnvdimm seed devices, is a device that starts with zero capacity allocated and unbound to a driver. In contrast to libnvdimm seed devices explicit 'create' and 'delete' interfaces are added to the region to trigger seeds to be created and unused devices to be reclaimed. The explicit create and delete replaces implicit create as a side effect of probe and implicit delete when writing 0 to the size that libnvdimm implements. Delete can be performed on any 0-sized and idle device. This avoids the gymnastics of needing to move device_unregister() to its own async context. Specifically, it avoids the deadlock of deleting a device via one of its own attributes. It is also less surprising to userspace which never sees an extra device it did not request. For now just add the device creation, teardown, and ->probe() prevention. A later patch will arrange for the 'dax/size' attribute to be writable to allocate capacity from the region. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643101583.4062302.12255093902950754962.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106113873.30709.15168756050631539431.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: introduce 'struct dev_dax' typed-driver operationsDan Williams5-19/+35
In preparation for introducing seed devices the dax-bus core needs to be able to intercept ->probe() and ->remove() operations. Towards that end arrange for the bus and drivers to switch from raw 'struct device' driver operations to 'struct dev_dax' typed operations. Reported-by: Hulk Robot <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jason Yan <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/160106113357.30709.4541750544799737855.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: add an allocation interface for device-dax instancesDan Williams5-23/+121
In preparation for a facility that enables dax regions to be sub-divided, introduce infrastructure to track and allocate region capacity. The new dax_region/available_size attribute is only enabled for volatile hmem devices, not pmem devices that are defined by nvdimm namespace boundaries. This is per Jeff's feedback the last time dynamic device-dax capacity allocation support was discussed. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lore.kernel.org/linux-nvdimm/[email protected] Link: https://lkml.kernel.org/r/159643101035.4062302.6785857915652647857.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106112801.30709.14601438735305335071.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax/kmem: replace release_resource() with release_mem_region()Dan Williams2-16/+7
Towards removing the mode specific @dax_kmem_res attribute from the generic 'struct dev_dax', and preparing for multi-range support, change the kmem driver to use the idiomatic release_mem_region() to pair with the initial request_mem_region(). This also eliminates the need to open code the release of the resource allocated by request_mem_region(). As there are no more dax_kmem_res users, delete this struct member. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/160106112239.30709.15909567572288425294.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax/kmem: move resource name tracking to drvdataDan Williams1-7/+9
Towards removing the mode specific @dax_kmem_res attribute from the generic 'struct dev_dax', and preparing for multi-range support, move resource name tracking to driver data. The memory for the resource name needs to have its own lifetime separate from the device bind lifetime for cases where the driver is unbound, but the kmem range could not be unplugged from the page allocator. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/160106111639.30709.17624822766862009183.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax/kmem: introduce dax_kmem_range()Dan Williams1-23/+17
Towards removing the mode specific @dax_kmem_res attribute from the generic 'struct dev_dax', and preparing for multi-range support, teach the driver to calculate the hotplug range from the device range. The hotplug range is the trivially calculated memory-block-size aligned version of the device range. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/160106111109.30709.3173462396758431559.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: make pgmap optional for instance creationDan Williams8-38/+62
The passed in dev_pagemap is only required in the pmem case as the libnvdimm core may have reserved a vmem_altmap for dev_memremap_pages() to place the memmap in pmem directly. In the hmem case there is no agent reserving an altmap so it can all be handled by a core internal default. Pass the resource range via a new @range property of 'struct dev_dax_data'. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Dave Jiang <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jason Yan <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vivek Goyal <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Link: https://lkml.kernel.org/r/159643099958.4062302.10379230791041872886.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/160106110513.30709.4303239334850606031.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: move instance creation parameters to 'struct dev_dax_data'Dan Williams4-17/+30
In preparation for adding more parameters to instance creation, move existing parameters to a new struct. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643099411.4062302.1337305960720423895.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13device-dax: drop the dax_region.pfn_flags attributeDan Williams6-33/+7
All callers specify the same flags to alloc_dax_region(), so there is no need to allow for anything other than PFN_DEV|PFN_MAP, or carry a ->pfn_flags around on the region. Device-dax instances are always page backed. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643098829.4062302.13611520567669439046.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13ACPI: HMAT: attach a device for each soft-reserved rangeDan Williams3-1/+39
The hmem enabling in commit cf8741ac57ed ("ACPI: NUMA: HMAT: Register "soft reserved" memory as an "hmem" device") only registered ranges to the hmem driver for each soft-reservation that also appeared in the HMAT. While this is meant to encourage platform firmware to "do the right thing" and publish an HMAT, the corollary is that platforms that fail to publish an accurate HMAT will strand memory from Linux usage. Additionally, the "efi_fake_mem" kernel command line option enabling will strand memory by default without an HMAT. Arrange for "soft reserved" memory that goes unclaimed by HMAT entries to be published as raw resource ranges for the hmem driver to consume. Include a module parameter to disable either this fallback behavior, or the hmat enabling from creating hmem devices. The module parameter requires the hmem device enabling to have unique name in the module namespace: "device_hmem". The driver depends on the architecture providing phys_to_target_node() which is only x86 via numa_meminfo() and arm64 via a generic memblock implementation. [[email protected]: require NUMA_KEEP_MEMINFO for phys_to_target_node()] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Will Deacon <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jia He <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643098298.4062302.17587338161136144730.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/memory_hotplug: introduce default phys_to_target_node() implementationDan Williams4-22/+23
In preparation to set a fallback value for dev_dax->target_node, introduce generic fallback helpers for phys_to_target_node() A generic implementation based on node-data or memblock was proposed, but as noted by Mike: "Here again, I would prefer to add a weak default for phys_to_target_node() because the "generic" implementation is not really generic. The fallback to reserved ranges is x86 specfic because on x86 most of the reserved areas is not in memblock.memory. AFAIK, no other architecture does this." The info message in the generic memory_add_physaddr_to_nid() implementation is fixed up to properly reflect that memory_add_physaddr_to_nid() communicates "online" node info and phys_to_target_node() indicates "target / to-be-onlined" node info. [[email protected]: fix CONFIG_MEMORY_HOTPLUG=n build] Link: https://lkml.kernel.org/r/202008252130.7YrHIyMI%[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Jia He <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643097768.4062302.3135192588966888630.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13resource: report parent to walk_iomem_res_desc() callbackDan Williams1-4/+7
In support of detecting whether a resource might have been been claimed, report the parent to the walk_iomem_res_desc() callback. For example, the ACPI HMAT parser publishes "hmem" platform devices per target range. However, if the HMAT is disabled / missing a fallback driver can attach devices to the raw memory ranges as a fallback if it sees unclaimed / orphan "Soft Reserved" resources in the resource tree. Otherwise, find_next_iomem_res() returns a resource with garbage data from the stack allocation in __walk_iomem_res_desc() for the res->parent field. There are currently no users that expect ->child and ->sibling to be valid, and the resource_lock would be needed to traverse them. Use a compound literal to implicitly zero initialize the fields that are not being returned in addition to setting ->parent. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Wei Yang <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643097166.4062302.11875688887228572793.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13ACPI: HMAT: refactor hmat_register_target_device to hmem_register_deviceDan Williams7-65/+90
In preparation for exposing "Soft Reserved" memory ranges without an HMAT, move the hmem device registration to its own compilation unit and make the implementation generic. The generic implementation drops usage acpi_map_pxm_to_online_node() that was translating ACPI proximity domain values and instead relies on numa_map_to_online_node() to determine the numa node for the device. [[email protected]: CONFIG_DEV_DAX_HMEM_DEVICES should depend on CONFIG_DAX=y] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Joao Martins <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643096584.4062302.5035370788475153738.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lore.kernel.org/r/158318761484.2216124.2049322072599482736.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13efi/fake_mem: arrange for a resource entry per efi_fake_mem instanceDan Williams2-4/+24
In preparation for attaching a platform device per iomem resource teach the efi_fake_mem code to create an e820 entry per instance. Similar to E820_TYPE_PRAM, bypass merging resource when the e820 map is sanitized. Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Ard Biesheuvel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643096068.4062302.11590041070221681669.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13x86/numa: add 'nohmat' optionDan Williams5-1/+23
Disable parsing of the HMAT for debug, to workaround broken platform instances, or cases where it is otherwise not wanted. [[email protected]: fix build when CONFIG_ACPI is not set] Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Randy Dunlap <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: "Rafael J. Wysocki" <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/159643095540.4062302.732962081968036212.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13x86/numa: cleanup configuration dependent command-line optionsDan Williams6-12/+24
Patch series "device-dax: Support sub-dividing soft-reserved ranges", v5. The device-dax facility allows an address range to be directly mapped through a chardev, or optionally hotplugged to the core kernel page allocator as System-RAM. It is the mechanism for converting persistent memory (pmem) to be used as another volatile memory pool i.e. the current Memory Tiering hot topic on linux-mm. In the case of pmem the nvdimm-namespace-label mechanism can sub-divide it, but that labeling mechanism is not available / applicable to soft-reserved ("EFI specific purpose") memory [3]. This series provides a sysfs-mechanism for the daxctl utility to enable provisioning of volatile-soft-reserved memory ranges. The motivations for this facility are: 1/ Allow performance differentiated memory ranges to be split between kernel-managed and directly-accessed use cases. 2/ Allow physical memory to be provisioned along performance relevant address boundaries. For example, divide a memory-side cache [4] along cache-color boundaries. 3/ Parcel out soft-reserved memory to VMs using device-dax as a security / permissions boundary [5]. Specifically I have seen people (ab)using memmap=nn!ss (mark System-RAM as Persistent Memory) just to get the device-dax interface on custom address ranges. A follow-on for the VM use case is to teach device-dax to dynamically allocate 'struct page' at runtime to reduce the duplication of 'struct page' space in both the guest and the host kernel for the same physical pages. [2]: http://lore.kernel.org/r/[email protected] [3]: http://lore.kernel.org/r/157309097008.1579826.12818463304589384434.stgit@dwillia2-desk3.amr.corp.intel.com [4]: http://lore.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com [5]: http://lore.kernel.org/r/[email protected] This patch (of 23): In preparation for adding a new numa= option clean up the existing ones to avoid ifdefs in numa_setup(), and provide feedback when the option is numa=fake= option is invalid due to kernel config. The same does not need to be done for numa=noacpi, since the capability is already hard disabled at compile-time. Suggested-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Dan Williams <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Benjamin Herrenschmidt <[email protected]> Cc: Ben Skeggs <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brice Goglin <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Daniel Vetter <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Dave Jiang <[email protected]> Cc: David Airlie <[email protected]> Cc: David Hildenbrand <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Cc: "H. Peter Anvin" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Ira Weiny <[email protected]> Cc: Jason Gunthorpe <[email protected]> Cc: Jeff Moyer <[email protected]> Cc: Jia He <[email protected]> Cc: Joao Martins <[email protected]> Cc: Jonathan Cameron <[email protected]> Cc: Michael Ellerman <[email protected]> Cc: Mike Rapoport <[email protected]> Cc: Paul Mackerras <[email protected]> Cc: Pavel Tatashin <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Tom Lendacky <[email protected]> Cc: Vishal Verma <[email protected]> Cc: Wei Yang <[email protected]> Cc: Will Deacon <[email protected]> Cc: Ard Biesheuvel <[email protected]> Cc: Bjorn Helgaas <[email protected]> Cc: Boris Ostrovsky <[email protected]> Cc: Hulk Robot <[email protected]> Cc: Jason Yan <[email protected]> Cc: "Jérôme Glisse" <[email protected]> Cc: Juergen Gross <[email protected]> Cc: kernel test robot <[email protected]> Cc: Randy Dunlap <[email protected]> Cc: Stefano Stabellini <[email protected]> Cc: Vivek Goyal <[email protected]> Link: https://lkml.kernel.org/r/160106109960.30709.7379926726669669398.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/159643094279.4062302.17779410714418721328.stgit@dwillia2-desk3.amr.corp.intel.com Link: https://lkml.kernel.org/r/159643094925.4062302.14979872973043772305.stgit@dwillia2-desk3.amr.corp.intel.com Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm,kmemleak-test.c: move kmemleak-test.c to samples dirHui Su6-4/+7
kmemleak-test.c is just a kmemleak test module, which also can not be used as a built-in kernel module. Thus, i think it may should not be in mm dir, and move the kmemleak-test.c to samples/kmemleak/kmemleak-test.c. Fix the spelling of built-in by the way. Signed-off-by: Hui Su <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Catalin Marinas <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Mauro Carvalho Chehab <[email protected]> Cc: David S. Miller <[email protected]> Cc: Rob Herring <[email protected]> Cc: Masahiro Yamada <[email protected]> Cc: Sam Ravnborg <[email protected]> Cc: Josh Poimboeuf <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Cc: Miguel Ojeda <[email protected]> Cc: Divya Indi <[email protected]> Cc: Tomas Winkler <[email protected]> Cc: David Howells <[email protected]> Link: https://lkml.kernel.org/r/20200925183729.GA172837@rlk Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/kmemleak: rely on rcu for task stack scanningDavidlohr Bueso1-4/+4
kmemleak_scan() currently relies on the big tasklist_lock hammer to stabilize iterating through the tasklist. Instead, this patch proposes simply using rcu along with the rcu-safe for_each_process_thread flavor (without changing scan semantics), which doesn't make use of next_thread/p->thread_group and thus cannot race with exit. Furthermore, any races with fork() and not seeing the new child should be benign as it's not running yet and can also be detected by the next scan. Avoiding the tasklist_lock could prove beneficial for performance considering the scan operation is done periodically. I have seen improvements of 30%-ish when doing similar replacements on very pathological microbenchmarks (ie stressing get/setpriority(2)). However my main motivation is that it's one less user of the global lock, something that Linus has long time wanted to see gone eventually (if ever) even if the traditional fairness issues has been dealt with now with qrwlocks. Of course this is a very long ways ahead. This patch also kills another user of the deprecated tsk->thread_group. Signed-off-by: Davidlohr Bueso <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Qian Cai <[email protected]> Acked-by: Catalin Marinas <[email protected]> Acked-by: Oleg Nesterov <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/slub: make add_full() condition more explicitAbel Wu1-1/+3
The commit below is incomplete, as it didn't handle the add_full() part. commit a4d3f8916c65 ("slub: remove useless kmem_cache_debug() before remove_full()") This patch checks for SLAB_STORE_USER instead of kmem_cache_debug(), since that should be the only context in which we need the list_lock for add_full(). Signed-off-by: Abel Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Liu Xiang <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/slub: fix missing ALLOC_SLOWPATH stat when bulk allocAbel Wu1-1/+2
The ALLOC_SLOWPATH statistics is missing in bulk allocation now. Fix it by doing statistics in alloc slow path. Signed-off-by: Abel Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Pekka Enberg <[email protected]> Acked-by: David Rientjes <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Hewenliang <[email protected]> Cc: Hu Shiyuan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/slub.c: branch optimization in free slowpathAbel Wu1-11/+12
The two conditions are mutually exclusive and gcc compiler will optimise this into if-else-like pattern. Given that the majority of free_slowpath is free_frozen, let's provide some hint to the compilers. Tests (perf bench sched messaging -g 20 -l 400000, executed 10x after reboot) are done and the summarized result: un-patched patched max. 192.316 189.851 min. 187.267 186.252 avg. 189.154 188.086 stdev. 1.37 0.99 Signed-off-by: Abel Wu <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Cc: Hewenliang <[email protected]> Cc: Hu Shiyuan <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13include/linux/slab.h: fix a typo error in commenttangjianqiang1-1/+1
fix a typo error in slab.h "allocagtor" -> "allocator" Signed-off-by: tangjianqiang <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Acked-by: Souptick Joarder <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>
2020-10-13mm/slab.c: clean code by removing redundant if conditionMateusz Nosek1-2/+0
The removed code was unnecessary and changed nothing in the flow, since in case of returning NULL by 'kmem_cache_alloc_node' returning 'freelist' from the function in question is the same as returning NULL. Signed-off-by: Mateusz Nosek <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Reviewed-by: Andrew Morton <[email protected]> Cc: Christoph Lameter <[email protected]> Cc: Pekka Enberg <[email protected]> Cc: David Rientjes <[email protected]> Cc: Joonsoo Kim <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Linus Torvalds <[email protected]>