Age | Commit message | Author | Files | Lines |
|
hpage is never used after try_to_split_thp_page() in memory_failure(), so
we don't have to update hpage; let's not recalculate/use it.
Suggested-by: "Aneesh Kumar K.V" <[email protected]>
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Oscar Salvador <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Aristeu Rozanski <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dmitry Yakunin <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Tony Luck <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "HWPOISON: soft offline rework", v7.
This patchset fixes a couple of issues that the patchset Naoya sent [1]
contained due to rebasing problems and a misunderstanding.
The main focus of this series is to stabilize soft offline. Historically
soft offlined pages have suffered from racy conditions because PageHWPoison
is used a little too aggressively, which (directly or indirectly) invades
other mm code that cares little about hwpoison. This results in
unexpected behavior or kernel panics, which is very far from soft offline's
"do not disturb userspace or other kernel components" policy. An example
of this can be found here [2].
Along with several cleanups, this series refactors and changes the way soft
offline works. The main point of this change set is to contain the target
page either via the buddy allocator or in the migration path. For the
former we first free the target page as we do for normal pages, and once it
has reached the buddy allocator and been taken off the freelists, we flag
it as HWpoison. For the latter we never get to release the page in
unmap_and_move(), so the page is under our control and we can handle it in
hwpoison code.
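For the buddy-allocator side, the containment helper ends up shaped roughly
like the sketch below (simplified; names follow the series'
page_handle_poison()/take_page_off_buddy() approach):
    static bool page_handle_poison(struct page *page)
    {
            /*
             * The page has just been freed and sits on the buddy
             * freelists; pull it off so it can never be allocated
             * again, then mark it as poisoned.
             */
            if (!take_page_off_buddy(page))
                    return false;
            SetPageHWPoison(page);
            page_ref_inc(page);
            num_poisoned_pages_inc();
            return true;
    }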
[1] https://patchwork.kernel.org/cover/11704083/
[2] https://lore.kernel.org/linux-mm/20190826104144.GA7849@linux/T/#u
This patch (of 14):
Drop the PageHuge check, which is dead code since memory_failure() forks
into memory_failure_hugetlb() for hugetlb pages.
memory_failure() and memory_failure_hugetlb() share some functions like
hwpoison_user_mappings() and identify_page_state(), so those should
properly handle 4kB pages, THPs, and hugetlb pages.
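The fork in question sits right at the top of memory_failure(), roughly
(sketch):
    int memory_failure(unsigned long pfn, int flags)
    {
            struct page *p = pfn_to_page(pfn);

            if (PageHuge(p))
                    return memory_failure_hugetlb(pfn, flags);
            /* only 4kB pages and THPs from this point on */
            ...
    }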
Signed-off-by: Naoya Horiguchi <[email protected]>
Signed-off-by: Oscar Salvador <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Mike Kravetz <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Tony Luck <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Dmitry Yakunin <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: "Aneesh Kumar K.V" <[email protected]>
Cc: Aristeu Rozanski <[email protected]>
Cc: Oscar Salvador <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The file_ra_state being passed into page_cache_sync_readahead() was being
ignored in favour of using the one embedded in the struct file. The only
caller for which this makes a difference is the fsverity code if the file
has been marked as POSIX_FADV_RANDOM, but it's confusing and worth fixing.
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Fold ra_submit() into its last remaining user and pass the
readahead_control struct to both do_page_cache_ra() and
page_cache_sync_ra().
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Reimplement page_cache_sync_readahead() and page_cache_async_readahead()
as wrappers around versions of the function which take a readahead_control
in preparation for making do_sync_mmap_readahead() pass down an RAC
struct.
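The wrappers end up shaped roughly like this (sketch; exact parameters may
differ):
    void page_cache_sync_readahead(struct address_space *mapping,
                    struct file_ra_state *ra, struct file *file,
                    pgoff_t index, unsigned long req_count)
    {
            DEFINE_READAHEAD(ractl, file, mapping, index);
            page_cache_sync_ra(&ractl, ra, req_count);
    }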
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Howells <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Reimplement force_page_cache_readahead() as a wrapper around
force_page_cache_ra(). Pass the existing readahead_control from
page_cache_sync_readahead().
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make ondemand_readahead() take a readahead_control struct in preparation
for making do_sync_mmap_readahead() pass down an RAC struct.
Signed-off-by: David Howells <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Rename __do_page_cache_readahead() to do_page_cache_ra() and call it
directly from ondemand_readahead() instead of indirecting via ra_submit().
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Howells <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Define it in the callers instead of in page_cache_ra_unbounded().
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: David Howells <[email protected]>
Cc: Eric Biggers <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "Readahead patches for 5.9/5.10".
These are infrastructure for both the THP patchset and for the fscache
rewrite.
For both pieces of infrastructure being built on top of this patchset, we
want the ractl to be available higher in the call-stack.
For David's work, he wants to add the 'critical page' to the ractl so that
he knows which page NEEDS to be brought in from storage, and which ones
are nice-to-have. We might want something similar in block storage too.
It used to be simple -- the first page was the critical one, but then mmap
added fault-around and so for that usecase, the middle page is the
critical one. Anyway, I don't have any code to show that yet, we just
know that the lowest point in the callchain where we have that information
is do_sync_mmap_readahead() and so the ractl needs to start its life
there.
For THP, we have the code that needs it. It's actually the apex patch to
the series; the one which finally starts to allocate THPs and present them
to consenting filesystems:
http://git.infradead.org/users/willy/pagecache.git/commitdiff/798bcf30ab2eff278caad03a9edca74d2f8ae760
This patch (of 8):
Allow for a more concise definition of a struct readahead_control.
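The helper is just a macro wrapping the struct initializer, roughly:
    #define DEFINE_READAHEAD(rac, f, m, i)                          \
            struct readahead_control rac = {                        \
                    .file = f,                                      \
                    .mapping = m,                                   \
                    ._index = i,                                    \
            }
so callers can write DEFINE_READAHEAD(ractl, file, mapping, index);
instead of spelling out the initializer.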
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Eric Biggers <[email protected]>
Cc: David Howells <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
It is reported that the following bug is triggered if an HDD is used as
the swap device:
[ 5758.157556] BUG: kernel NULL pointer dereference, address: 0000000000000007
[ 5758.165331] #PF: supervisor write access in kernel mode
[ 5758.171161] #PF: error_code(0x0002) - not-present page
[ 5758.176894] PGD 0 P4D 0
[ 5758.179721] Oops: 0002 [#1] SMP PTI
[ 5758.183614] CPU: 10 PID: 316 Comm: kswapd1 Kdump: loaded Tainted: G S --------- --- 5.9.0-0.rc3.1.tst.el8.x86_64 #1
[ 5758.196717] Hardware name: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.02.01.0002.082220131453 08/22/2013
[ 5758.208176] RIP: 0010:split_swap_cluster+0x47/0x60
[ 5758.213522] Code: c1 e3 06 48 c1 eb 0f 48 8d 1c d8 48 89 df e8 d0 20 6a 00 80 63 07 fb 48 85 db 74 16 48 89 df c6 07 00 66 66 66 90 31 c0 5b c3 <80> 24 25 07 00 00 00 fb 31 c0 5b c3 b8 f0 ff ff ff 5b c3 66 0f 1f
[ 5758.234478] RSP: 0018:ffffb147442d7af0 EFLAGS: 00010246
[ 5758.240309] RAX: 0000000000000000 RBX: 000000000014b217 RCX: ffffb14779fd9000
[ 5758.248281] RDX: 000000000014b217 RSI: ffff9c52f2ab1400 RDI: 000000000014b217
[ 5758.256246] RBP: ffffe00c51168080 R08: ffffe00c5116fe08 R09: ffff9c52fffd3000
[ 5758.264208] R10: ffffe00c511537c8 R11: ffff9c52fffd3c90 R12: 0000000000000000
[ 5758.272172] R13: ffffe00c51170000 R14: ffffe00c51170000 R15: ffffe00c51168040
[ 5758.280134] FS: 0000000000000000(0000) GS:ffff9c52f2a80000(0000) knlGS:0000000000000000
[ 5758.289163] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5758.295575] CR2: 0000000000000007 CR3: 0000000022a0e003 CR4: 00000000000606e0
[ 5758.303538] Call Trace:
[ 5758.306273] split_huge_page_to_list+0x88b/0x950
[ 5758.311433] deferred_split_scan+0x1ca/0x310
[ 5758.316202] do_shrink_slab+0x12c/0x2a0
[ 5758.320491] shrink_slab+0x20f/0x2c0
[ 5758.324482] shrink_node+0x240/0x6c0
[ 5758.328469] balance_pgdat+0x2d1/0x550
[ 5758.332652] kswapd+0x201/0x3c0
[ 5758.336157] ? finish_wait+0x80/0x80
[ 5758.340147] ? balance_pgdat+0x550/0x550
[ 5758.344525] kthread+0x114/0x130
[ 5758.348126] ? kthread_park+0x80/0x80
[ 5758.352214] ret_from_fork+0x22/0x30
[ 5758.356203] Modules linked in: fuse zram rfkill sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mgag200 iTCO_wdt crct10dif_pclmul iTCO_vendor_support drm_kms_helper crc32_pclmul ghash_clmulni_intel syscopyarea sysfillrect sysimgblt fb_sys_fops cec rapl joydev intel_cstate ipmi_si ipmi_devintf drm intel_uncore i2c_i801 ipmi_msghandler pcspkr lpc_ich mei_me i2c_smbus mei ioatdma ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg igb ahci libahci i2c_algo_bit crc32c_intel libata dca wmi dm_mirror dm_region_hash dm_log dm_mod
[ 5758.412673] CR2: 0000000000000007
[ 0.000000] Linux version 5.9.0-0.rc3.1.tst.el8.x86_64 ([email protected]) (gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), GNU ld version 2.30-79.el8) #1 SMP Wed Sep 9 16:03:34 EDT 2020
After further digging it was found that the following race condition
exists in the original implementation:
CPU1                                    CPU2
----                                    ----
deferred_split_scan()
  split_huge_page(page) /* page isn't compound head */
    split_huge_page_to_list(page, NULL)
      __split_huge_page(page, )
        ClearPageCompound(head)
        /* unlock all subpages except page (not head) */
                                        add_to_swap(head) /* not THP */
                                          get_swap_page(head)
                                          add_to_swap_cache(head, )
                                            SetPageSwapCache(head)
      if PageSwapCache(head)
        split_swap_cluster(/* swap entry of head */)
        /* Deref sis->cluster_info: NULL accessing! */
So, in split_huge_page_to_list(), PageSwapCache() is checked for the
already split and unlocked "head", which may be added to the swap cache on
another CPU, and split_swap_cluster() may then be called wrongly.
To fix the race, move the call to split_swap_cluster() into
__split_huge_page(), before the subpages are unlocked, so that
PageSwapCache() is stable.
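That is, the call moves to roughly here (sketch):
    static void __split_huge_page(struct page *page, ...)
    {
            struct page *head = compound_head(page);
            ...
            ClearPageCompound(head);
            /* subpages are still locked, so PageSwapCache(head) is stable */
            if (PageSwapCache(head)) {
                    swp_entry_t entry = { .val = page_private(head) };
                    split_swap_cluster(entry);
            }
            /* only now unlock the subpages */
            ...
    }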
Fixes: 59807685a7e77 ("mm, THP, swap: support splitting THP for THP swap out")
Reported-by: Rafael Aquini <[email protected]>
Signed-off-by: "Huang, Ying" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Tested-by: Rafael Aquini <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The nr_thps counter is to support THPs in the page cache when the
filesystem doesn't understand THPs. Eventually it will be removed, but we
should still support filesystems which do not understand THPs yet. Move
the nr_thp manipulation functions to filemap.h since they're page-cache
specific.
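The moved helpers are small wrappers along these lines (sketch):
    static inline void filemap_nr_thps_inc(struct address_space *mapping)
    {
    #ifdef CONFIG_READ_ONLY_THP_FOR_FS
            atomic_inc(&mapping->nr_thps);
    #endif
    }

    static inline void filemap_nr_thps_dec(struct address_space *mapping)
    {
    #ifdef CONFIG_READ_ONLY_THP_FOR_FS
            atomic_dec(&mapping->nr_thps);
    #endif
    }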
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The page cache needs to know whether the filesystem supports THPs so that
it doesn't send THPs to filesystems which can't handle them. Dave Chinner
points out that getting from the page mapping to the filesystem type is
too many steps (mapping->host->i_sb->s_type->fs_flags) so cache that
information in the address space flags.
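Concretely, that amounts to one more mapping flag plus a test helper,
roughly (sketch):
    enum mapping_flags {
            /* ... existing AS_* flags ... */
            AS_THP_SUPPORT,         /* THPs supported by this mapping */
    };

    static inline bool mapping_thp_support(struct address_space *mapping)
    {
            return test_bit(AS_THP_SUPPORT, &mapping->flags);
    }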
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Alexander Viro <[email protected]>
Cc: "Matthew Wilcox (Oracle)" <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Song Liu <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Dave Chinner <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove the assumption that a compound page has HPAGE_PMD_NR pins from the
page cache.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
page->mapping is undefined for tail pages, so operate exclusively on the
head page.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Remove the assumption that a compound page is HPAGE_PMD_SIZE, and the
assumption that any page is PAGE_SIZE.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Ask the page what size it is instead of assuming it's PMD size. Do this
for anon pages as well as file pages for when someone decides to support
that. Leave the assumption alone for pages which are PMD mapped; we don't
currently grow THPs beyond PMD size, so we don't need to change this code
yet.
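The pattern of the change is simply (illustrative diff, not the literal
hunk):
    -	nr_pages = HPAGE_PMD_NR;
    -	size = HPAGE_PMD_SIZE;
    +	nr_pages = thp_nr_pages(page);
    +	size = thp_size(page);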
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Ask the page how many subpages it has instead of assuming it's PMD size.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Acked-by: "Huang, Ying" <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Ask the page what size it is instead of assuming it's PMD size.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
File THPs may now be of arbitrary size, and we can't rely on that size
after doing the split, so remember the number of pages before we start the
split.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
File THPs may now be of arbitrary order.
Signed-off-by: Kirill A. Shutemov <[email protected]>
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The implementation of split_page_owner() prefers a count rather than the
old order of the page. When we support a variable size THP, we won't
have the order at this point, but we will have the number of pages.
So change the interface to what the caller and callee would prefer.
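I.e. (illustrative diff):
    -void split_page_owner(struct page *page, unsigned int order);
    +void split_page_owner(struct page *page, unsigned int nr);
with callers passing the page count they already have, e.g.
split_page_owner(head, nr) instead of split_page_owner(head, order).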
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: SeongJae Park <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
A compound page in the page cache will not necessarily be of PMD size,
so check explicitly.
[[email protected]: fix remove page fault assumption of compound page size]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Kirill A. Shutemov <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "Remove assumptions of THP size".
There are a number of places in the VM which assume that a THP is a PMD in
size. That's true today, and remains true after this patch series, but
this is a prerequisite for switching to arbitrary-sized THPs.
thp_nr_pages() still returns either HPAGE_PMD_NR or 1, but will be changed
later.
This patch (of 11):
page_cache_free_page() assumes THPs are PMD_SIZE; fix that assumption.
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Acked-by: Kirill A. Shutemov <[email protected]>
Cc: Huang Ying <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
When a THP is removed from the page cache by reclaim, we replace it with a
shadow entry that occupies all slots of the XArray previously occupied by
the THP. If the user then accesses that page again, we only allocate a
single page, but storing it into the shadow entry replaces all entries
with that one page. That leads to bugs like
page dumped because: VM_BUG_ON_PAGE(page_to_pgoff(page) != offset)
------------[ cut here ]------------
kernel BUG at mm/filemap.c:2529!
https://bugzilla.kernel.org/show_bug.cgi?id=206569
This is hard to reproduce with mainline, but happens regularly with the
THP patchset (as so many more THPs are created). This solution is taken
from the THP patchset. It splits the shadow entry into order-0 pieces at
the time that we bring a new page into the cache.
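In the page-cache add path the fix is shaped roughly like this (simplified
sketch):
    /* before storing the new order-0 page into the tree */
    old = xa_load(&mapping->i_pages, index);
    order = xa_get_order(&mapping->i_pages, index);
    if (xa_is_value(old) && order > thp_order(page)) {
            xas_split_alloc(&xas, old, order, gfp);  /* may sleep */
            xas_lock_irq(&xas);
            xas_split(&xas, old, order);  /* shadow -> order-0 pieces */
            xas_unlock_irq(&xas);
    }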
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Song Liu <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Qian Cai <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
In order to use multi-index entries for huge pages in the page cache, we
need to be able to split a multi-index entry (eg if a file is truncated in
the middle of a huge page entry). This version does not support splitting
more than one level of the tree at a time. This is an acceptable
limitation for the page cache as we do not expect to support order-12
pages in the near future.
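Usage is two-phase, since node allocation may sleep but the split itself
happens under the xa_lock (sketch):
    /* preallocate the nodes needed to split an order-9 entry */
    xas_split_alloc(&xas, entry, 9, GFP_KERNEL);

    xas_lock_irq(&xas);
    xas_split(&xas, entry, 9);  /* entry now stored in 512 order-0 slots */
    xas_unlock_irq(&xas);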
[[email protected]: export xas_split_alloc() to modules]
[[email protected]: fix xarray split]
Link: https://lkml.kernel.org/r/[email protected]
[[email protected]: fix xarray]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "Fix read-only THP for non-tmpfs filesystems".
As described more verbosely in the [3/3] changelog, we can inadvertently
put an order-0 page in the page cache which occupies 512 consecutive
entries. Users are running into this if they enable the
READ_ONLY_THP_FOR_FS config option; see
https://bugzilla.kernel.org/show_bug.cgi?id=206569 and Qian Cai has also
reported it here:
https://lore.kernel.org/lkml/[email protected]/
This is a rather intrusive way of fixing the problem, but has the
advantage that I've actually been testing it with the THP patches, which
means that it sees far more use than it does upstream -- indeed, Song has
been entirely unable to reproduce it. It also has the advantage that it
removes a few patches from my gargantuan backlog of THP patches.
This patch (of 3):
This function returns the order of the entry at the index. We need this
because there isn't space in the shadow entry to encode its order.
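For example (sketch):
    /* what order is the entry covering this index? */
    unsigned int order = xa_get_order(&mapping->i_pages, index);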
[[email protected]: export xa_get_order to modules]
Signed-off-by: Matthew Wilcox (Oracle) <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: "Kirill A . Shutemov" <[email protected]>
Cc: Qian Cai <[email protected]>
Cc: Song Liu <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
With highmem, pte_alloc_map() keeps the level-4 page table mapped using
kmap_atomic(). Avoid doing new memory allocations while the page table is
mapped like that.
[ 9.409233] BUG: sleeping function called from invalid context at mm/page_alloc.c:4822
[ 9.410557] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper
[ 9.411932] no locks held by swapper/1.
[ 9.412595] CPU: 0 PID: 1 Comm: swapper Not tainted 5.9.0-rc3-00323-gc50eb1ed654b5 #2
[ 9.413824] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 9.415207] Call Trace:
[ 9.415651] ? ___might_sleep.cold+0xa7/0xcc
[ 9.416367] ? __alloc_pages_nodemask+0x14c/0x5b0
[ 9.417055] ? swap_migration_tests+0x50/0x293
[ 9.417704] ? debug_vm_pgtable+0x4bc/0x708
[ 9.418287] ? swap_migration_tests+0x293/0x293
[ 9.418911] ? do_one_initcall+0x82/0x3cb
[ 9.419465] ? parse_args+0x1bd/0x280
[ 9.419983] ? rcu_read_lock_sched_held+0x36/0x60
[ 9.420673] ? trace_initcall_level+0x1f/0xf3
[ 9.421279] ? trace_initcall_level+0xbd/0xf3
[ 9.421881] ? do_basic_setup+0x9d/0xdd
[ 9.422410] ? do_basic_setup+0xc3/0xdd
[ 9.422938] ? kernel_init_freeable+0x72/0xa3
[ 9.423539] ? rest_init+0x134/0x134
[ 9.424055] ? kernel_init+0x5/0x12c
[ 9.424574] ? ret_from_fork+0x19/0x30
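The rule being followed, in sketch form:
    /* wrong: pte_alloc_map() may hold a kmap_atomic() mapping */
    ptep = pte_alloc_map(mm, pmdp, vaddr);
    page = alloc_page(GFP_KERNEL);  /* may sleep -> BUG while atomic */

    /* right: do sleeping allocations first, then map the page table */
    page = alloc_page(GFP_KERNEL);
    ptep = pte_alloc_map(mm, pmdp, vaddr);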
Reported-by: kernel test robot <[email protected]>
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
pte_clear_tests operates on an existing pte entry. Make sure that it is
not a none pte entry.
[[email protected]: avoid kernel crash with riscv]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Nathan Chancellor <[email protected]>
Cc: Guenter Roeck <[email protected]>
Cc: Paul Walmsley <[email protected]>
Cc: Albert Ou <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The test seems to be missing quite a lot of details w.r.t. allocating the
correct pgtable_t page (huge_pte_alloc()), holding the right lock
(huge_pte_lock()), etc. The vma used is also not a hugetlb VMA.
ppc64 does have runtime checks within CONFIG_DEBUG_VM for most of these.
Hence disable the test on ppc64.
[[email protected]: drop hugetlb_advanced_tests()]
Link: https://lore.kernel.org/lkml/[email protected]/#t
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Anshuman Khandual <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
pmd_clear() should not be used to clear pmd level pte entries.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Architectures like ppc64 use a deposited page table while updating huge
pte entries.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Make sure we call the pte accessors with the correct lock held.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
This will help in adding proper locks in a later patch.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
set_pte_at() should not be used to set a pte entry at locations that
already hold a valid pte entry. Architectures like ppc64 don't do TLB
invalidation in set_pte_at() and hence expect it to be used only on
locations that do not hold a valid PTE.
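In sketch form, the rule the tests now follow:
    /* assume *ptep already holds a valid entry */

    /* wrong: overwrites it; ppc64 does no TLB invalidate here */
    set_pte_at(mm, vaddr, ptep, new_pte);

    /* ok: tear the old entry down first, then install the new one */
    ptep_get_and_clear(mm, vaddr, ptep);
    set_pte_at(mm, vaddr, ptep, new_pte);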
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Saved write support was added to track the write bit of a pte after
marking the pte protnone. This was done so that AUTONUMA can convert a
write pte to protnone and still track the old write bit. When converting
it back we set the pte write bit correctly thereby avoiding a write fault
again. Hence enable the test only when CONFIG_NUMA_BALANCING is enabled
and use protnone protflags.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
ppc64 supports huge vmap only with radix translation. Hence use an arch
helper to determine huge vmap support.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
ppc64 uses bit 62 to indicate a pte entry (_PAGE_PTE). Avoid setting that
bit in the random value.
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Anshuman Khandual <[email protected]>
Cc: Christophe Leroy <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
powerpc used to set the pte-specific flags in set_pte_at(). This is
different from other architectures. To be consistent with the other
architectures, update pfn_pte() to set _PAGE_PTE on ppc64. Also, drop the
now-unused pte_mkpte().
We add a VM_WARN_ON() to catch callers of set_pte_at() that don't set the
_PAGE_PTE bit. We will remove that after a few releases.
With respect to huge pmd entries, pmd_mkhuge() takes care of adding the
_PAGE_PTE bit.
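Roughly, on the ppc64 side (sketch):
    static inline pte_t pfn_pte(unsigned long pfn, pgprot_t pgprot)
    {
            return __pte(((pte_basic_t)pfn << PAGE_SHIFT) |
                         pgprot_val(pgprot) | _PAGE_PTE);
    }

    /* and in set_pte_at(), for a few releases: */
    VM_WARN_ON(!(pte_raw(pte) & cpu_to_be64(_PAGE_PTE)));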
[[email protected]: whitespace fix, per Christophe]
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: Christophe Leroy <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Michael Ellerman <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Patch series "mm/debug_vm_pgtable fixes", v4.
This patch series includes fixes for the debug_vm_pgtable test code so
that it follows the page table update rules correctly. The first two
patches introduce changes w.r.t. ppc64.
The hugetlb test is disabled on ppc64 because it needs a larger change to
satisfy the page table update rules.
These tests are broken w.r.t. the page table update rules and result in
kernel crashes as below.
[ 21.083519] kernel BUG at arch/powerpc/mm/pgtable.c:304!
cpu 0x0: Vector: 700 (Program Check) at [c000000c6d1e76c0]
pc: c00000000009a5ec: assert_pte_locked+0x14c/0x380
lr: c0000000005eeeec: pte_update+0x11c/0x190
sp: c000000c6d1e7950
msr: 8000000002029033
current = 0xc000000c6d172c80
paca = 0xc000000003ba0000 irqmask: 0x03 irq_happened: 0x01
pid = 1, comm = swapper/0
kernel BUG at arch/powerpc/mm/pgtable.c:304!
[link register ] c0000000005eeeec pte_update+0x11c/0x190
[c000000c6d1e7950] 0000000000000001 (unreliable)
[c000000c6d1e79b0] c0000000005eee14 pte_update+0x44/0x190
[c000000c6d1e7a10] c000000001a2ca9c pte_advanced_tests+0x160/0x3d8
[c000000c6d1e7ab0] c000000001a2d4fc debug_vm_pgtable+0x7e8/0x1338
[c000000c6d1e7ba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d1e7c80] c0000000019e4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d1e7db0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d1e7e20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
With DEBUG_VM disabled
[ 20.530152] BUG: Kernel NULL pointer dereference on read at 0x00000000
[ 20.530183] Faulting instruction address: 0xc0000000000df330
cpu 0x33: Vector: 380 (Data SLB Access) at [c000000c6d19f700]
pc: c0000000000df330: memset+0x68/0x104
lr: c00000000009f6d8: hash__pmdp_huge_get_and_clear+0xe8/0x1b0
sp: c000000c6d19f990
msr: 8000000002009033
dar: 0
current = 0xc000000c6d177480
paca = 0xc00000001ec4f400 irqmask: 0x03 irq_happened: 0x01
pid = 1, comm = swapper/0
[link register ] c00000000009f6d8 hash__pmdp_huge_get_and_clear+0xe8/0x1b0
[c000000c6d19f990] c00000000009f748 hash__pmdp_huge_get_and_clear+0x158/0x1b0 (unreliable)
[c000000c6d19fa10] c0000000019ebf30 pmd_advanced_tests+0x1f0/0x378
[c000000c6d19fab0] c0000000019ed088 debug_vm_pgtable+0x79c/0x1244
[c000000c6d19fba0] c0000000000116ec do_one_initcall+0xac/0x5f0
[c000000c6d19fc80] c0000000019a4fac kernel_init_freeable+0x4dc/0x5a4
[c000000c6d19fdb0] c000000000012474 kernel_init+0x24/0x160
[c000000c6d19fe20] c00000000000cbd0 ret_from_kernel_thread+0x5c/0x6c
This patch (of 13):
With the hash page table, the kernel should not use pmd_clear() for
clearing huge pte entries. Add a DEBUG_VM WARN to catch the wrong usage.
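A simplified sketch of the check:
    static inline void pmd_clear(pmd_t *pmdp)
    {
            /*
             * With the hash MMU, huge (leaf) pmd entries must not be
             * torn down with pmd_clear(); catch such callers.
             */
            if (IS_ENABLED(CONFIG_DEBUG_VM) && !radix_enabled())
                    WARN_ON(pmd_val(*pmdp) & _PAGE_PTE);
            *pmdp = __pmd(0);
    }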
Signed-off-by: Aneesh Kumar K.V <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Cc: Michael Ellerman <[email protected]>
Cc: Anshuman Khandual <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Christophe Leroy <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Linus Torvalds <[email protected]>
|
|
The conversion to request_mem_region() is broken because it assumes that
the range is marked busy prior to release. However, due to the way that
the kmem driver manipulates the IORESOURCE_BUSY flag (clears it to let
{add,remove}_memory() handle busy) it requires a manual release_resource()
to perform cleanup.
Given that the actual 'struct resource *' needs to be recalled, not just
the range, add that tracking to the kmem driver-data.
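Roughly, the driver-data grows a slot for the resource pointers (sketch;
field names illustrative):
    struct dax_kmem_data {
            const char *res_name;
            struct resource *res[];  /* recalled at remove time */
    };

    /* the removal path can then do the manual cleanup: */
    release_resource(data->res[i]);
    kfree(data->res[i]);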
Fixes: 0513bd5bb114 ("device-dax/kmem: replace release_resource() with release_mem_region()")
Reported-by: David Hildenbrand <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Pavel Tatashin <[email protected]>
Cc: Brice Goglin <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: Jia He <[email protected]>
Cc: Joao Martins <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Link: https://lkml.kernel.org/r/160272252925.3136502.17220638073995895400.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Linus Torvalds <[email protected]>
|
|
Commit 720dee53ad8d ("tracing/boot: Initialize per-instance event list in
early boot") removed __init from __trace_early_add_events(), but
__trace_early_add_new_event() still has __init and will cause a section
mismatch.
Remove __init from __trace_early_add_new_event(), the same as was done for
__trace_early_add_events().
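I.e. (illustrative diff):
    -static __init int
    +static int
     __trace_early_add_new_event(struct trace_event_call *call,
                                 struct trace_array *tr)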
Link: https://lore.kernel.org/lkml/CAHk-=wjU86UhovK4XuwvCqTOfc+nvtpAuaN2PJBz15z=w=u0Xg@mail.gmail.com/
Reported-by: Linus Torvalds <[email protected]>
Signed-off-by: Masami Hiramatsu <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
|
|
Don't give an assertion failure on unpurgeable afs_server records - which
kills the thread - but rather emit a trace line when we are purging a
record (which only happens during network namespace removal or rmmod) and
print a notice of the problem.
Signed-off-by: David Howells <[email protected]>
|
|
Add a tracepoint to log the cell refcount and active user count and pass in
a reason code through various functions that manipulate these counters.
Additionally, a helper function, afs_see_cell(), is provided to log
interesting places that deal with a cell without actually doing any
accounting directly.
Signed-off-by: David Howells <[email protected]>
|
|
Fix cell removal by inserting a more final state than AFS_CELL_FAILED that
indicates that the cell has been unpublished in case the manager is already
requeued and will go through again. The new AFS_CELL_REMOVED state will
just immediately leave the manager function.
Going through a second time in the AFS_CELL_FAILED state will cause it to
try to remove the cell again, potentially leading to the proc list being
removed.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Reported-by: [email protected]
Suggested-by: Hillf Danton <[email protected]>
Signed-off-by: David Howells <[email protected]>
cc: Hillf Danton <[email protected]>
|
|
When the afs module is removed, one of the things that has to be done is to
purge the cell database. afs_cell_purge() cancels the management timer and
then starts the cell manager work item to do the purging. This does a
single run through and then assumes that all cells are now purged - but
this is no longer the case.
With the introduction of alias detection, a later cell in the database can
now be holding an active count on an earlier cell (cell->alias_of). The
purge scan passes by the earlier cell first, but the earlier cell can't be
got rid of until the later cell has discarded the alias. Ordinarily,
afs_unuse_cell() would
handle this by setting the management timer to trigger another pass - but
afs_set_cell_timer() doesn't do anything if the namespace is being removed
(net->live == false). rmmod then hangs in the wait on cells_outstanding in
afs_cell_purge().
Fix this by making afs_set_cell_timer() directly queue the cell manager if
net->live is false. This causes additional management passes.
Queueing the cell manager increments cells_outstanding to make sure the
wait won't complete until all cells are destroyed.
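The fix is essentially (sketch):
    static void afs_set_cell_timer(struct afs_net *net, time64_t delay)
    {
            if (net->live) {
                    atomic_inc(&net->cells_outstanding);
                    if (timer_reduce(&net->cells_timer,
                                     jiffies + delay * HZ))
                            afs_dec_cells_outstanding(net);
            } else {
                    /* ensure another management pass during teardown */
                    afs_queue_cell_manager(net);
            }
    }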
Fixes: 8a070a964877 ("afs: Detect cell aliases 1 - Cells with root volumes")
Signed-off-by: David Howells <[email protected]>
|
|
Management of the lifetime of the afs_cell struct has some problems due to the
usage counter being used to determine whether objects of that type are in
use in addition to whether anyone might be interested in the structure.
This is made trickier by cell objects being cached for a period of time in
case they're quickly reused as they hold the result of a setup process that
may be slow (DNS lookups, AFS RPC ops).
Problems include the cached root volume from alias resolution pinning its
parent cell record, rmmod occasionally hanging and occasionally producing
assertion failures.
Fix this by splitting the count of active users from the struct reference
count. Things then work as follows:
(1) The cell cache keeps +1 on the cell's activity count and this has to
be dropped before the cell can be removed. afs_manage_cell() tries to
exchange the 1 to a 0 with the cells_lock write-locked, and if
successful, the record is removed from the net->cells.
(2) One struct ref is 'owned' by the activity count. That is put when the
active count is reduced to 0 (final_destruction label).
(3) A ref can be held on a cell whilst it is queued for management on a
work queue without confusing the active count. afs_queue_cell() is
added to wrap this.
(4) The queue's ref is dropped at the end of the management. This is
split out into a separate function, afs_manage_cell_work().
(5) The root volume record is put after a cell is removed (at the
final_destruction label) rather than in the RCU destruction routine.
(6) Volumes hold struct refs, but aren't active users.
(7) Both counts are displayed in /proc/net/afs/cells.
There are some management function changes:
(*) afs_put_cell() now just decrements the refcount and triggers the RCU
destruction if it becomes 0. It no longer sets a timer to have the
manager do this.
(*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease
the active count. afs_unuse_cell() sets the management timer.
(*) afs_queue_cell() is added to queue a cell with appropriate refs.
There are also some other fixes:
(*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.
(*) Make sure that candidate cells in lookups are properly destroyed
rather than being simply kfree'd. This ensures the bits it points to
are destroyed also.
(*) afs_dec_cells_outstanding() is now called in cell destruction rather
than at "final_destruction". This ensures that cell->net is still
valid to the end of the destructor.
(*) As a consequence of the previous two changes, move the increment of
net->cells_outstanding that was at the point of insertion into the
tree to the allocation routine to correctly balance things.
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Signed-off-by: David Howells <[email protected]>
|
|
There are a number of problems being seen when rapidly mounting and
unmounting an afs dynamic root with an explicit cell and volume specified
(which should probably be rejected, but that's a separate issue):
What the tests are doing is to look up/create a cell record for the name
given and then tear it down again without actually using it to try to talk
to a server. This is repeated endlessly, very fast, and the new cell
collides with the old one if it's not quick enough to reuse it.
It appears (as suggested by Hillf Danton) that the search through the RB
tree under a read_seqbegin_or_lock() under RCU conditions isn't safe and
that it's not blocking the write_seqlock(), despite taking two passes at
it. He suggested that the code should take a ref on the cell it's
attempting to look at - but this shouldn't be necessary until we've
compared the cell names. It's possible that I'm missing a barrier
somewhere.
However, using an RCU search for this is overkill, really - we only need to
access the cell name in a few places, and they're places where we may end
up sleeping anyway.
Fix this by switching to an R/W semaphore instead.
Additionally, move the down_read() call inside the function (renamed to
afs_find_cell()) since all the callers were taking the RCU read lock (or
should've been[*]).
[*] afs_probe_cell_name() should have been, but that doesn't appear to be
involved in the bug reports.
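The lookup helper then becomes straightforward (sketch; the _locked helper
name is illustrative):
    struct afs_cell *afs_find_cell(struct afs_net *net,
                                   const char *name, unsigned namesz)
    {
            struct afs_cell *cell;

            down_read(&net->cells_lock);
            cell = afs_find_cell_locked(net, name, namesz);
            up_read(&net->cells_lock);
            return cell;
    }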
The symptoms of this look like:
general protection fault, probably for non-canonical address 0xf27d208691691fdb: 0000 [#1] PREEMPT SMP KASAN
KASAN: maybe wild-memory-access in range [0x93e924348b48fed8-0x93e924348b48fedf]
...
RIP: 0010:strncasecmp lib/string.c:52 [inline]
RIP: 0010:strncasecmp+0x5f/0x240 lib/string.c:43
afs_lookup_cell_rcu+0x313/0x720 fs/afs/cell.c:88
afs_lookup_cell+0x2ee/0x1440 fs/afs/cell.c:249
afs_parse_source fs/afs/super.c:290 [inline]
...
Fixes: 989782dcdc91 ("afs: Overhaul cell database management")
Reported-by: [email protected]
Signed-off-by: David Howells <[email protected]>
cc: Hillf Danton <[email protected]>
cc: [email protected]
|
|
update_devfreq() is also documented in devfreq.c, which has a more
complete note.
So, drop the duplicated markup, in order to avoid this warning:
.../Documentation/driver-api/device_link.rst: WARNING: Duplicate C declaration, also defined in 'driver-api/infrastructure'.
Declaration is 'device_link_state'.
(and to avoid causing a problem with cross-references to it)
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|
|
Literal blocks with :: markup should be indented, as otherwise
Sphinx complains:
Documentation/vm/hmm.rst:363: WARNING: Literal block expected; none found.
Fixes: f7ebd9ed7767 ("mm/doc: add usage description for migrate_vma_*()")
Signed-off-by: Mauro Carvalho Chehab <[email protected]>
|